CSC/ECE 517 Summer 2008/wiki1 4 wm

From Expertiza_Wiki
Jump to navigation Jump to search

Introduction

A thread is a basic unit of CPU utilization. A traditional process has a single thread of control. Many modern operating systems provide features enabling a process to contain multiple threads of control. If the process has multiple threads of control, it can do more than one task at a time, such as, displaying graphics and reading keystrokes. Threads are used mainly to run asynchronous tasks and pass the application’s tasks to an executor. They are useful because they reduce the overall time of execution of programs. All threads belonging to the same process share its code section, data section and other operating system resources, such as open files and signals. Thus multithreading is more efficient than having parallel processes running for the same program, which require a huge overhead.

Multi-threaded Programming

Multithreading support for single processors works by giving each of the threads a “time slice” of the CPU, similar to the parallel execution of processes. On multi-core machines, different threads can run on the different processors and the threads truly run simultaneously. Support for threads may be provided at either the user level, for user or green threads, or by the kernel, for kernel or native threads. User threads are supported above the kernel and are managed without kernel support, whereas kernel threads are supported and managed directly by the operating system.

Multi-threading Models

There are three common multithreading models

One-to-One

This model maps each user-thread to a kernel thread. It provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call. It also allows multiple threads to run in parallel on multiprocessors. The main disadvantage of this model is the overhead of creating each kernel thread for each user thread. This burdens the performance of an application and limits the number of threads supported by the system.

Many-to-One

This model maps many user-level threads to one kernel thread. It is the model used by green threads. Green threads are scheduled by a Virtual Machine instead of natively by the OS. They emulate multithreaded environments without relying on any native OS capabilities. These user-level threads are lightweight and very efficient. They are an easy way to achieve parallelism in a program. An application can have as many user-level threads as it needs, but true concurrency is not achieved because the kernel can schedule only one thread at a time. The developers can create as many user threads as necessary. However, there are two main disadvantages:

  1. If a thread makes a blocking system call, the entire process will block.
  2. If you are running on a multi-core machine, the multiple threads are unable to run in parallel.

Many-to-Many

This model multiplexes many user-level threads to a smaller or equal number of kernel threads. This model has the best of both worlds. Developers can create as many user threads as they need. When a user thread performs a blocking system call, the kernel can schedule another thread for execution and the kernel threads can run in parallel on multi-core systems.

Benefits of Multi-threading

There are many benefits of multithreaded programming, including

Resource sharing

Threads share memory, code and process resources. Allocating memory and resources for process creation is costly and time consuming. It is more economical to create threads than new processes. If there are several different threads of activity within the same address space, applications can benefit.

Responsiveness

If an interactive application is multithreaded, this may allow it to continue running, even if part of it is blocked doing IO or performing a lengthy operation. This will increase responsiveness to the user. One thread of a web browser can display images or text, while another thread retrieves data from the network. Many clients can concurrently access a busy web server. If the web server ran as a traditional single-threaded process, it could service only one client at a time.

Potential Problems of Multi-threading

Many threaded programs have shared resources. More than one thread may need to access these resources. These programs must implement a form of mutual exclusion to ensure that only one of the threads gets access to the resource at a time. Access to these shared resources must execute in code called critical sections. Use hardware or software implementations to protect these critical sections.

Java Threads

The Java programming language has a thread library for creating and managing threads at the user-level. The Java virtual machine (JVM) manages the mapping of the user threads to the operating system kernel threads. The particular mapping model in use depends on the operating system on which the Java program runs. Java’s thread scheduler monitors all the threads running in Java programs. It decides which threads to run at any given time, and when to switch between threads based on the thread’s priority. The Java scheduler uses either preemptive or non-preemptive scheduling based on the operating system. With preemptive scheduling, the scheduler gives each thread a constant period of time to run, after which time the thread is suspended() to allow the next thread to resume() running. With non-preemptive scheduling, the running thread runs until the thread completes or until it issues a yield() to allow other threads to run. To create a Java thread, either have your Java class extend the Java Thread class, or have your Java class implement the Runnable interface and code a run() method. The two options are shown in Figures 1 and 2. The preferred method for creating a thread is to implement the Runnable interface. It is more flexible and useful in complex applications. Java does not support multiple inheritance, but a class may implement multiple interfaces. By using the interface method, the Client class is open to inherit another class, if the need arose in the future.

Figure 1: Client Extends Thread Class

   public class Client extends Thread {
       public void run() {
           System.out.println(“Client thread”);
       }
   }
   public class Server {
       public static void main(String args[]){
           Thread runner = new Client();
           runner.start();
           System.out.println(“Server thread”);
       }
   }

When the Java code executes the Server class main method in Figure 1, it instantiates a new Client, which is a worker thread. The Client code inside the run() method executes when the start() message is issued. It is not clear whether the "Client thread" message will be printed before the "Server thread" message. It is a timing issue.

Figure 2: Client Implements Runnable Interface

   public class Client implements Runnable {
       public void run() {
           System.out.println(“Client thread”);
       }
   }
  
   public class Server {
   
       public static void main(String args[]) {
           Thread runner = new Thread(new Client());
           runner.start();
           runner.join();
           System.out.println(“Server thread”);
       }
   }

There are situations when the main thread can continue only after a worker thread has completed its work. To do this the main thread executes the join() method for that thread. When the Java code executes the Server class main method in Figure 2, it instantiates a new Client that has implemented the Runnable interface. The Client code inside the run() method executes when the start() message is issued. However, in this case, the "Client thread" message is printed before the "Server thread" message, because of the addition of the join() message. Java threads can be asynchronously terminated using the stop() method. However, Java has deprecated this method and discourages its use. Instead, the target thread should periodically check whether it should terminate by using the interrupt() method.

States of Java Threads

New

A thread is in this state when the thread object is first created with the new() method.

Runnable

Calling the start() method allocates memory for the new thread in the JVM and calls the run() method for the thread object.

Blocked

A thread becomes blocked if it performs a blocking statement, such as doing I/O or if it invokes a sleep() method.

Dead

A thread moves to the dead state when its run() method terminates.

Mutual Exclusion

The Java language implements mutual exclusion in a couple of ways. There exists a MutualExclusion interface. The Client class can implement this interface and code the methods, enteringCriticalSection() and leavingCriticalSection(), as shown in Figure 3. The yield() method tells the thread scheduler to allow another thread to run.

Figure 3: Client Implements MutualExclusion Interface

   public class Client implements MutualExclusion{
   
       public Client(){
           flag0 = false; flag1 = false; turn = TURN_0;
       }
       
       public void enteringCriticalSection(int t) {
           int other = 1 - t;
           
           if (t == 0) {
               flag0 = true; turn = other;
               while ((flag1 == true) && (turn == other))
                   Thread.yield();
            } else {
               flag1 = true; turn = other;
    	        while ((flag0 == true) && (turn == other))
                   Thread.yield();
            }
       }
       
       public void leavingCriticalSection(int t) {
           if(t == 0) flag0 = false;
           else   flag1 = false;
       }
       private volatile int turn, boolean flag0, flag1;
   }

A simpler method for implementing mutual exclusion is through the use of a semaphore. A semaphore is a variable that is accessed only through two standard operations: acquire() and release(). The thread acquires the semaphore immediately before entering the critical section of an object, and it releases the semaphore after the critical code completes. Figure 4 shows an example of this code. As napping is a critical operation for all programmers (after those all nighters), this Client thread places his 'nap' between the semaphore acquire() and semaphore release() messages, in order to not be disturbed.

Figure 4: Client Has Semaphore

   public class Client implements Runnable {
   
       public Client (Semaphore sem) {
           this.sem = sem;
       }
      
       public void run() {
           sem.acquire();
           SleepUtilities.nap(3000);
           sem.release();
       }
       private Semaphore sem;
   }

Ruby Threads

At the present time the Ruby programming language just supports green threads. The Ruby interpreter totally manages these threads. The threads, as well as the thread scheduler run on the same single operating system thread. Ruby can support thread methods, such as stop() and kill(), that are not advised for kernel threads. However, Ruby threads suffer from the drawbacks of non-native threads. They can only run on one processor in a multiprocessor environment. It is possible for a single thread to cause the whole process to deadlock, if poorly designed. The worse problem with Ruby threads is with I/O. Network access can block a process for a long time, for instance. Using non-blocking I/O, one can solve this problem. The DNS lookup system call has a threading issue, not solvable using non-blocking I/O. Ruby solves this problem with their resolv library. See Figure 5 for an example of the code that handles this issue. Despite some of the green thread concerns, for most situations, the benefits of efficiency can far outweigh the disadvantages.

Figure 5: Ruby DNS Lookup Implementation

   require ‘socket’
   require ‘resolv-replace’
   count = 0
   Thread.critical = true
   thread = Thread.new { Thread.pass; loop { count += 1; } } 
   IPSocket.getaddress(www.ruby-lang.org) 
   count


You can create a Ruby thread using Thread.new() {block}. Ruby passes the arguments given in the new() method to the block code that the thread executes. See Figure 6 for an example of creating a Ruby thread. The Ruby pass() method is analogous to the Java yield() method. It tells the thread scheduler to pass execution to another thread. Ruby has a join() method that performs the same functionality as the Java join(). There are run() and wakeup() methods that wake up a sleeping thread, either giving the thread control or indicating that it is ready to be scheduled.

Figure 6: Creating Ruby Threads

   threads = []
   4.times do |number|
       threads << Thread.new(number) do |i|
           print “#{i}\n”    
       end 
   end
   threads.each { |t| t.join }  

The output from this code is: 1

                             2
                             3
                             4 

States of Ruby Threads

Run

The thread is executing.

Sleep

The thread is sleeping or waiting on I/O.

Aborting

The thread is aborting (has been killed).

False

The thread has terminated normally.

Nil

The thread has terminated with an exception.

Mutual Exclusion

Ruby handles mutual exclusion with the Monitor class. With a monitor you place critical section code in a synchronize block that prevent access to the resource by another thread until the critical section completes. See Figure 7 for an example of how to implement a monitor. Both the 't1' and 't2' threads issue the 'tick' message to update the @count instance variable. Only one of the threads can be in the synchronize block at a time. If you code the tick() method without this block, the tick counts would be incorrect. Updating the count variable takes more than one hardware instruction. With both threads doing the same set of instructions simultaneously, the groups of hardware instructions incrementing count are interleaved causing the counts to be off. Synchronization prevents this from happening. This implementation of mutual exclusion in Ruby is almost identical to the use of semaphores in Java.

Figure 7: Ruby Monitor

   require 'monitor'
       class Counter < Monitor
           attr_reader :count
       
           def initialize
               @count = 0
               super
           end

           def tick
               synchronize do
               @count += 1
           end
       end
   end
   c = Counter.new
   t1 = Thread.new { 1000.times {  c.tick } } 
   t2 = Thread.new { 1000.times {  c.tick } }
   t1.join; t2.join
   c.count

The output of this code is: 2000

Java vs Ruby Threads

Java thread initialization is more involved than Ruby thread initialization. Ruby threads also have the advantage of a “quick and dirty” thread creation, since it only takes couple of lines of code. A Ruby thread shares all global, instance and local variables that are in existence at the time the thread starts. A Java thread can share these variables, as well, depending on how the thread was created. If the client thread class uses inheritance, any variables needed are passed along when creating the thread. If the thread is created through the interface, the variables can be shared as long as the class has been initialized and the variables in use.

Ruby’s green threads are completely portable, as they don’t rely on the OS. But, on a multi-core processor, native thread implementations can assign work to multiple processors while green threads cannot. In this environment native threads have a huge advantage as more work is done by the native threads.

Java and Ruby Methods Comparison

Table 1 lists thread actions and Java and Ruby’s equivalent methods of achieving them.

Table 2 lists thread actions available in Java, but Ruby doesn’t have a method for the action.


Table 3 lists thread actions available in Ruby, but Java doesn’t have a method for the action.

There are many similarities between the Java and Ruby Thread classes, as shown with the equivalent methods in Table 1. But, as you can see, there are also many methods in Java that Ruby doesn't support and vice versa. Some methods like activeCount(), where it returns the number of active threads, are trivial, but there is a group of methods existing in both Java and Ruby that the other doesn't support. For Java, it is exception handling with getting/setting exceptions, while in Ruby, it is more thread control. Ruby has exception methods, but they are geared more toward events prior to the exception, like making sure all threads abort when exception starts and firing an exception to destroy threads. Java is more about setting the exceptions and getting them to change accordingly. So the Java Thread class has a little bit more exception handling than Ruby.

Ruby has more thread control for the programmer such as pausing, terminating, aborting and resuming threads. Java did have these functions, but they all have been deprecated because, although they give the user more control, they also can cause deadlock problems. So it is a tradeoff between more thread control and less chance for deadlocks to happen. Java does use interrupt to "pause" the thread as opposed to actually pausing the thread as in Ruby. Also, Java threads "sleep" differently than Ruby threads. Java sleeps are timed, while Ruby's sleep just stops the thread indefinitely, until it gets woken up by "thr.wakeup". As you can see Ruby's thread methods are more flexible, but potentially more dangerous than Java’s.

Basically, Java and Ruby both have similar thread functionalities. In some respects, Java is simpler by reducing user error, while Ruby gives the programmer more power and perhaps more headaches. If a user wants more control, Ruby threading is great, but for simplicity Java is better. The main distinction between Ruby threads and Java threads is that currently Ruby just supports green threads (as did Java 1.1) and Java has support for kernel threads. Currently, there is no clear “winner” between native and green threads in a uni-processor system, but Java threads definitely have the edge over Ruby in a multi-processor environment.

Ruby is moving from green threads to kernel threads in Ruby 1.9 or 2.0, which are still development releases. YARV has been integrated as the new Ruby VM. YARV will give Ruby kernel thread support.

Java vs. Ruby Thread Efficiency

In order to test the efficiency of Java and Ruby threads two programs were written in each language. The first one reads in the contents of five text files, one for each of five threads, searches for a given word in each and counts each occurrence of the word. This count is added to a total count for all five files. Each file is passed to its own thread to run. The main thread has the total count, which is updated using mutual exclusion techniques in each language. The second program was modified from the first. Instead of searching for words in files, a string was passed to each thread. The times it took each to run each program depended on the size of the files, to a small extent and on the programming language to a larger extent. The results are recorded in Table 4. The file versions of these programs are given in Figures 8 and 9. With the small file, the Ruby code ran about as fast as the Java code. With the large file, the Java code ran five times faster than Ruby. This is evidence that when doing file I/O, Java's native threads are much more efficient. The string versions of these programs are not included. Both the Java code and the Ruby code processed the strings in approximately the same about of time. It was expected that Ruby code would outperform Java with this version.

Table 4: WordCounts Running Times
File Sizes Ruby Times Java Times
10KB 16 millisecs 15 millisecs
80KB 119 millisecs 20 millisecs
String Sizes Ruby Times Java Times
10KB 15 millisecs 16 millisecs
60KB 20 millisecs 22 millisecs

Figure 8: Java Word Counts Code

   public class WordCounts {
       protected static int wordCount;
       
       private class Semaphore {
           private int mutex;  
     
           public Semaphore () {
               mutex = 1;
           }
      
           public synchronized void acquire() throws InterruptedException {
               while (mutex == 0) {
                  wait();
               }
               mutex--;
           }
           public synchronized void release() {
               mutex++;    
               notifyAll();
           }
       }
   
       private class WordCountsThread implements Runnable {
           private String word;
           private String file;        
           private int count;
           private Semaphore total; 
                  
           public WordCountsThread (String word, String file, Semaphore total) {
               this.word  = word; 
               this.file  = file;
               this.total = total;
               count      = 0;
           }    
    
           public void run(){
               readWordFile();
       
               try {
                   total.acquire();
                   wordCount = wordCount + count;
                   total.release();
               } catch (InterruptedException e) {
                   System.out.println (e);
               }
           }
    
           private void readWordFile (){
           }   // end method
       } //end inner class: WordCountThread
    
   
       public WordCounts (String searchWord, String [] files) {
           int numFiles = files.length;
           Semaphore total = new Semaphore();
           Thread [] threads = new Thread [numFiles];
         
           for (int i = 0; i < numFiles; i++ ) {
               threads[i] = new Thread (new WordCountsThread (searchWord,files[i], total));
           }
       
           for (int j = 0; j < numFiles; j++ ) {
               threads[j].start();
           }
       
           try {
               for (int i = 0; i < numFiles; i++ ) {
                   threads[i].join();
               } 
           } catch (InterruptedException e) {
               System.out.println (e);
           }
       }
       
       public void displayTotal (String searchWord) {
           System.out.println ("There are " + wordCount + 
                           " total occurrences of " + searchWord);
       }
   
       public static void main (String [] args) {
           String [] files;              
           files = new String [5];   
           files [0] = "Test1.txt";
           files [1] = "Test2.txt";
           files [2] = "Test3.txt";
           files [3] = "Test4.txt";
           files [4] = "Test5.txt";
           String searchWord = "the";
           long time = System.currentTimeMillis (), time_prev = time;
           WordCounts wc = new WordCounts (searchWord, files);
           wc.displayTotal (searchWord);
           time = System.currentTimeMillis ();
           System.out.println ("Diff " + (time - time_prev) + " msecs");
           System.exit(0);
       }
   }

Sample output from a run of this code is:

There are 35 total occurrences of the Diff 15 msecs


Figure 9: Ruby Word Counts Code

   class WordCounts < Monitor
       attr_reader :wordCount, :numFiles, :searchWord
     
       def initialize
           super
       end # initialize def
       
       def initVars(files, searchWord)
           @wordCount  = 0
           @files      = files
           @searchWord = searchWord
       end # initVars        
      
       def countWordsInFiles
            threads = []
  
           for fileToFetch in @files
               threads << Thread.new(fileToFetch) do |file|
               count = readWordFile(file)
               updateTotal(count)
           end
       end
       threads.each {|t| t.join}
   end # countWordsInFiles 
     
   def readWordFile(file)
       ...
       return count
   end # readWordFile def  
   
   def displayTotal
       puts ("\nThere are " + @wordCount.to_s + " total occurrences of '" + @searchWord[0] + "'")
   end # displayTotal def 
 
   def updateTotal (count)
       synchronize do
           @wordCount += count
       end
   end # updateTotal def
   end # WordCounts class def
   
   files = %w( Test1.txt Test2.txt Test3.txt Test4.txt Test5.txt)
   searchWord = %w( public )
   wc = WordCounts.new
   wc.initVars(files, searchWord)
   ustart = Time.now.usec
   wc.countWordsInFiles
   wc.displayTotal
   t = Time.now
   uend = Time.now.usec
   puts ("Start: " + ustart.to_s + " usecs")
   puts ("End: " + uend.to_s + " usecs")
   puts ("Diff: " + ((uend - ustart)/1000.0).to_s + " msecs")

Sample output from a run of this code is:

There are 35 total occurrences of 'the' Diff 16 msecs

References

The following are links to more detailed information about threads, processes, and mutual exclusion:

http://en.wikipedia.org/wiki/Thread_(computer_science)

http://en.wikipedia.org/wiki/Green_threads

http://en.wikipedia.org/wiki/Mutual_exclusion

http://www.bitwiese.de/2007/09/on-processes-and-threads.html

http://www.cs.auckland.ac.nz/compsci340s2c/lectures/lecture05.pdf

http://www.ruby-doc.org/core/classes/Thread.html


The following are links to more detailed information about Ruby threads:

http://rubylearning.com/satishtalim/ruby_threads.html

http://snippets.amanzi.org/2007/07/3-ruby-threads.html

http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_threads.html

http://phrogz.net/ProgrammingRuby/ref_c_thread.html

http://www.infoq.com/news/2007/05/ruby-threading-futures

http://expressica.com/2008/04/26/new-in-ruby-19-threads/


The following are links to more detailed information about Java threads:

http://www.javaworld.com/javaworld/jw-04-1996/jw-04-threads.html

http://java.sun.com/javase/6/docs/api/java/lang/Thread.html

http://corelib.rubyonrails.com/classes/Thread.html