CSC/ECE 517 Spring 2015/ch1a 14 RI: Difference between revisions

<font size="5"><b>Thread Safety</b></font>


[https://docs.google.com/document/d/1TgBtp7flIPKJwkkShgtcIkt--mtHuwVHsQX6Tpzj1rc/edit Writeup Page]

A [http://en.wikipedia.org/wiki/Thread_(computing)#Multithreading multi-threaded] program presents the risk of [http://en.wikipedia.org/wiki/Race_condition race conditions], which arise when multiple [http://en.wikipedia.org/wiki/Thread_(computing) threads] read and modify the same shared state and the result depends on the order in which they run. If several threads access that state simultaneously without coordination, the state can become corrupted. Thread safety avoids race conditions by guaranteeing that shared state is handled correctly even when multiple threads run concurrently.


__TOC__


== Background ==
[[File:Multithread.png|right]]


A thread is a sequence of instructions that the scheduler can manage independently. Multithreading allows multiple threads to exist within a single process, where they share the resources of that process. It is a useful and powerful construct in web-server environments such as Rails.<ref name="Thread (Computing)">http://en.wikipedia.org/wiki/Thread_(computing)</ref> For example, separate threads can handle requests from multiple users simultaneously while long-running tasks proceed in the background. Users get a better experience because a task that would block a single-threaded application can be given its own thread, so other work is not held up. Multithreading is as dangerous as it is powerful, however, because of race conditions. A race condition occurs when multiple threads access the same state, most often the same data, and the result depends on the order in which their operations execute; if the threads interleave in an unexpected order, the data can be corrupted. Thread safety ensures that when multiple threads alter shared components they do so in a guaranteed safe manner, so a thread-safe application is free of race conditions.<ref name="Race Condition">http://en.wikipedia.org/wiki/Race_condition</ref><ref name="Thread Safety">http://en.wikipedia.org/wiki/Thread_safety</ref>
 
== Race Condition Example ==
 
The following is an example of code that, if used in a context that is not thread-safe, could lead to race conditions:
 
<pre>
1: def raceCondition (param, check)
2:  if param.eql?(check)
3:      param += 2
4:  end
5: end
</pre>
 
Line 1 defines the method, which takes two parameters: param and check. Line 2 checks whether the value of param is equal to the value of check. If they are equal, Line 3 adds 2 to the value of param. Readers who find this construct unfamiliar can refer to the [https://www.ruby-lang.org/en/documentation/quickstart/ Ruby Introduction Guide] for further familiarization with the Ruby programming language.
 
In a multithreaded setting, this method introduces a possible race condition because it reads a value on Line 2 and updates it on Line 3 as two separate steps. Suppose that param refers to state shared between threads and that several threads run the method simultaneously. If Thread B changes the value of param while Thread A is between Line 2 and Line 3, then param will not hold the anticipated value (check + 2) after Thread A executes Line 3; it will hold a corrupted value instead. In a critical environment, a race condition like this could have serious negative effects on a production database, which is why thread-safe applications are of utmost importance in a multithreaded environment.
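One way to make the check and the update safe is to perform both while holding a lock. The following is a minimal sketch rather than part of the original example: the SharedValue class and its add_two_if_equal method are hypothetical names introduced here, and the sketch assumes that the value being checked lives in an object shared between threads.

```ruby
# Hypothetical wrapper around a value shared between threads.
class SharedValue
  attr_reader :value

  def initialize(value)
    @value = value
    @lock = Mutex.new
  end

  # Performs the check (Line 2) and the update (Line 3) as one atomic step,
  # so no other thread can change @value in between.
  def add_two_if_equal(check)
    @lock.synchronize do
      @value += 2 if @value.eql?(check)
    end
  end
end

shared = SharedValue.new(0)

# Ten threads all try to add 2 while the value still equals 0.
threads = 10.times.map { Thread.new { shared.add_two_if_equal(0) } }
threads.each(&:join)

puts shared.value # prints 2: only the first thread to take the lock passes the check
```

Because the comparison and the addition happen inside the same synchronize block, the result is always check + 2, no matter how the threads are scheduled.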
 
== History ==
 
Multithreading is not a new concept in Rails. Versions of Rails before 4.0 already supported multithreaded applications, but the support had to be switched on. Once an application template was created, a programmer could find a call to a method named "threadsafe!" in the configuration file config/environments/production.rb. This call was commented out by default, so multithreading was disabled and programmers always had to enable it manually for a multithreaded application.<ref name="What is config.threadsafe!">http://www.sitepoint.com/config-threadsafe/</ref>
 
== Manually Enabling Multithreading ==
 
To enable multithreading manually in versions of Rails prior to 4.0, the call to the threadsafe! method simply needed to be un-commented. The source code of the threadsafe! method is as follows:
 
[[File:threadsafe.png]]
 
We can see that the threadsafe! method sets only four options.<ref name="Removing config.threadsafe!">http://tenderlovemaking.com/2012/06/18/removing-config-threadsafe.html</ref> The first option, @preload_frameworks, ensures that the entire framework is loaded up front instead of relying on auto-loading behavior, which is not thread-safe. The second option, @cache_classes, tells Rails not to reload classes automatically when they are modified, which avoids wasting resources. The third option, @dependency_loading, prevents Rails from loading code when missing constants are discovered, which is also not thread-safe. The last option is set to true and tells Rails not to use the Rack::Lock middleware, which ensures that only one thread executes at a time (serialization). The Rack::Lock middleware is how previous versions of Rails guaranteed thread safety by default.<ref name="Rails On Rack">http://guides.rubyonrails.org/rails_on_rack.html</ref>
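For readers who cannot view the image, the four options described above can be sketched in plain Ruby. This is an illustration, not the Rails source file: SketchConfiguration is a hypothetical stand-in for the Rails configuration object, the initial values are chosen only for the demonstration, and the name of the fourth option (allow_concurrency) is an assumption based on the Rails 3.2 source as described in the cited article.

```ruby
# Hypothetical stand-in for the Rails application configuration object.
class SketchConfiguration
  attr_reader :preload_frameworks, :cache_classes,
              :dependency_loading, :allow_concurrency

  def initialize
    # Development-style starting values, chosen here only for illustration.
    @preload_frameworks = false
    @cache_classes      = false
    @dependency_loading = true
    @allow_concurrency  = false
  end

  # Mirrors the four settings described in the text.
  def threadsafe!
    @preload_frameworks = true   # load the whole framework up front
    @cache_classes      = true   # do not reload classes between requests
    @dependency_loading = false  # do not load code on missing constants
    @allow_concurrency  = true   # do not insert Rack::Lock (assumed name)
    self
  end
end

config = SketchConfiguration.new.threadsafe!
puts config.allow_concurrency # prints true
```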
 
== Rack::Lock ==
 
Rails is built on [http://en.wikipedia.org/wiki/Rack_(web_server_interface) Rack], which essentially provides a uniform way for Ruby frameworks such as Rails to work with web servers such as Apache, Thin, etc. Within Rack there is an environment property named rack.multithread. The Rack::Lock middleware sets the rack.multithread flag to false and wraps each request in a lock, which means that the application is not allowed to serve multiple threads at once. When Rack::Lock is enabled within Rails (that is, when threadsafe! is disabled), the application runs only one thread at a time.<ref name= "Configuring Rails Applications">http://guides.rubyonrails.org/configuring.html</ref> This serialization does ensure thread safety, but at the expense of run time: if only one thread can execute at a time, each thread must wait for every thread ahead of it to finish. This defeats the purpose of a threaded web server, which is a necessity on today's Web.
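The serializing effect of such a lock can be sketched without Rack itself. The code below is a simplified, hypothetical imitation of the idea behind Rack::Lock, not its actual source: SketchLock and ToyApp are names introduced here, and the toy application only records how many requests are inside it at the same moment.

```ruby
# Simplified imitation of a locking middleware: one request at a time.
class SketchLock
  def initialize(app)
    @app = app
    @mutex = Mutex.new
  end

  def call(env)
    @mutex.synchronize do
      # Tell the application it is not running multithreaded.
      @app.call(env.merge("rack.multithread" => false))
    end
  end
end

# Toy Rack-style application that tracks concurrent requests.
class ToyApp
  attr_reader :max_in_flight

  def initialize
    @in_flight = 0
    @max_in_flight = 0
    @guard = Mutex.new
  end

  def call(env)
    @guard.synchronize do
      @in_flight += 1
      @max_in_flight = [@max_in_flight, @in_flight].max
    end
    sleep 0.01 # simulate some work
    @guard.synchronize { @in_flight -= 1 }
    [200, { "Content-Type" => "text/plain" }, ["foobar"]]
  end
end

toy_app = ToyApp.new
locked_app = SketchLock.new(toy_app)

# Five simultaneous "requests" against the locked application.
responses = 5.times.map { Thread.new { locked_app.call({}) } }.map(&:value)

puts toy_app.max_in_flight # prints 1: the lock admits one request at a time
```

All five requests are still answered, but never more than one is inside the application at once, which is exactly the waiting cost described above.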
 
== Examples of threadsafe! use ==
 
Practical examples are often the best way to learn a new concept, so the next two subsections walk through two examples of threadsafe! use for students who are new to multithreading.
 
=== 1. Speed of execution ===
 
Whether the threadsafe! configuration option is enabled can be established by observing the execution flow of a program. A noticeable lag between instructions that should execute simultaneously is a clue that multithreading has not been achieved in the program. The following practical example illustrates this approach.
 
We can enable the threadsafe! configuration option in production mode and then create a controller named FooController:
 
[[File:FooController_example1.png]]
 
 
We can now start the Rails server in development mode (note that the threadsafe! configuration option has been enabled only in production mode). Then, in a separate terminal tab, we can issue a curl request to the application's foo/bar action, repeat it 5 times, and use an ampersand (&) to run each request in the background so that they are all triggered simultaneously:
 
<pre>
repeat 5 (curl http://localhost:3000/foo/bar &)
</pre>
 
The result: 5 "foobar" strings are printed to the screen, but one at a time, with a 1-second delay between them (a result of the "sleep 1" instruction in the bar method definition). The 5 requests are therefore processed one after another even though they were all made at the same time. This tells us that the requests are not executed simultaneously and that multithreading is not in effect.<ref name= "Thread-Safety">http://railscasts.com/episodes/365-thread-safety</ref>
 
Now we can stop the Rails server and run it again, this time in the production environment (where we enabled the threadsafe! configuration option):
<pre>rails s -e production</pre>
 
Once we run the same 5 curl requests as in the previous scenario:
<pre>repeat 5 (curl http://localhost:3000/foo/bar &)</pre>
we will see the "foobar" strings being output nearly simultaneously.<ref name= "Thread-Safety">http://railscasts.com/episodes/365-thread-safety</ref> Since the 1-second delay from the bar method no longer accumulates between the "foobar" outputs, it becomes clear that all 5 requests are being processed concurrently, which demonstrates that multithreading is working.
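The same timing difference can be reproduced outside Rails with plain Ruby threads. The following is a rough simulation under stated assumptions, not the controller from the example: handle_request and run_requests are hypothetical helpers, a shared Mutex plays the role of Rack::Lock, and the delay is shortened to 0.1 seconds so that the script finishes quickly.

```ruby
require "benchmark"

# Stand-in for the bar action: wait, then answer "foobar".
def handle_request(delay)
  sleep delay
  "foobar"
end

# Fires `count` simultaneous requests and returns the elapsed time in seconds.
# When a lock is given, every request must hold it while being handled.
def run_requests(count, delay, lock: nil)
  Benchmark.realtime do
    threads = count.times.map do
      Thread.new do
        if lock
          lock.synchronize { handle_request(delay) }
        else
          handle_request(delay)
        end
      end
    end
    threads.each(&:join)
  end
end

serialized = run_requests(5, 0.1, lock: Mutex.new) # like development mode
concurrent = run_requests(5, 0.1)                  # like threadsafe! enabled

puts format("serialized: %.2fs, concurrent: %.2fs", serialized, concurrent)
```

With the lock in place the five delays add up, while without it they overlap, which matches the behavior observed with curl.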
 
=== 2. Memory access ===
 
Still not convinced that enabling the threadsafe! configuration option effectively introduces multithreading into a Rails application? Let's prove this point again, but with a slightly different approach. In this example, we show the presence of multithreading by observing the state of a memory location and the value it holds during program execution. Let's edit our FooController from the previous example:
 
[[File:FooController_ex2.png]]
 
In this modified FooController, a class variable counter is initialized to 0. Each execution of the bar method reads this variable into a local counter variable, increments the local value, and stores the new value back in the class variable counter. Finally, at the end of the bar method, the value of the class variable counter is displayed.
 
Again, we first start the server in development mode and observe the behavior of this controller in a single-threaded environment. Once we run the 5 curl requests as in the previous example:
<pre>
repeat 5 (curl http://localhost:3000/foo/bar &)
</pre>
 
the result will be
 
<pre>
1
2
3
4
5
</pre>
 
printed with a 1-second delay in between.<ref name= "Thread-Safety">http://railscasts.com/episodes/365-thread-safety</ref> Now we stop the Rails server and run it again, this time in the production environment, where we enabled the threadsafe! configuration option. This time, when we run the 5 curl requests, the result will be:
 
<pre>
1
1
1
1
1
</pre>
 
and all the lines will be printed nearly simultaneously.<ref name= "Thread-Safety">http://railscasts.com/episodes/365-thread-safety</ref> In this scenario, all the requests are processed concurrently: each one reads the counter before any other request has stored its incremented value, so every request sees 0 and writes back 1. This behavior illustrates the point made at the beginning of this article: developers have to pay close attention to the flow of a multithreaded system in order to avoid corrupting data. When numerous threads access and modify the same variable concurrently, problems arise if one thread reads the variable while another thread is in the middle of changing it, because the reading thread then works with a stale value and its update overwrites the other one. To solve this problem, a mutex can be used to lock the variable so that only a single thread holds the key to it at a time.
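The lost updates and the mutex-based fix can both be reproduced in plain Ruby. This is a minimal sketch, not the FooController from the example: the Counter class, its method names, and the run helper are hypothetical, an instance variable stands in for the class variable, and a short sleep between the read and the write widens the window in which threads can interleave.

```ruby
# Hypothetical counter shared by several threads.
class Counter
  attr_reader :count

  def initialize
    @count = 0
    @lock = Mutex.new
  end

  # Read, pause, write back: other threads can read the old value meanwhile.
  def unsafe_increment
    local = @count
    sleep 0.05
    @count = local + 1
  end

  # Same steps, but only one thread at a time may perform them.
  def safe_increment
    @lock.synchronize { unsafe_increment }
  end
end

# Runs `requests` simultaneous increments and returns the final count.
def run(counter, method_name, requests = 5)
  threads = requests.times.map { Thread.new { counter.public_send(method_name) } }
  threads.each(&:join)
  counter.count
end

unsafe_total = run(Counter.new, :unsafe_increment)
safe_total   = run(Counter.new, :safe_increment)

# The unsafe total is typically 1, because every thread read 0 before any wrote.
puts "without mutex: #{unsafe_total}, with mutex: #{safe_total}"
```

The mutex restores the correct total of 5, at the cost of making the increments wait for one another, so the lock should cover only the code that touches the shared variable.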
 
== Rails 4.0 ==
 
Beginning with Rails version 4.0, multithreading has been automatically enabled and supported by the initial configuration.<ref name="Rails 4.0: Final version released!">http://weblog.rubyonrails.org/2013/6/25/Rails-4-0-final/</ref> The threadsafe! method has been removed from the configuration file and can no longer be found in the production.rb file. The settings that threadsafe! used to apply are now in effect without any action by the programmer, so applications no longer need to opt in to multithreading; code that shares mutable state between requests, such as the counter in the example above, still has to be protected by the developer. The authors of this wiki page believe that the main reason for this approach has been to make learning Rails more beginner-friendly and to reduce corruption of Rails applications written by inexperienced programmers.
 
== References ==
<references/>

Latest revision as of 02:59, 9 February 2015
