CSC/ECE 517 Summer 2008/wiki1 4 wm
Introduction
A thread is a basic unit of CPU utilization. A traditional process has a single thread of control. Many modern operating systems provide features enabling a process to contain multiple threads of control. If the process has multiple threads of control, it can do more than one task at a time, such as, displaying graphics and reading keystrokes. Threads are used mainly to run asynchronous tasks and pass the application’s tasks to an executor. They are useful because they reduce the overall time of execution of programs. All threads belonging to the same process share its code section, data section and other operating system resources, such as open files and signals. Thus multithreading is more efficient than having parallel processes running for the same program, which require a huge overhead.
Multi-threaded Programming
Multithreading support for single processors works by giving each of the threads a “time slice” of the CPU, similar to the parallel execution of processes. On multi-core machines, different threads can run on the different processors and the threads truly run simultaneously. Support for threads may be provided at either the user level, for user or green threads, or by the kernel, for kernel or native threads. User threads are supported above the kernel and are managed without kernel support, whereas kernel threads are supported and managed directly by the operating system.
Multi-threading Models
There are three common multithreading models
One-to-One
This model maps each user-thread to a kernel thread. It provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call. It also allows multiple threads to run in parallel on multiprocessors. The main disadvantage of this model is the overhead of creating each kernel thread for each user thread. This burdens the performance of an application and limits the number of threads supported by the system.
Many-to-One
This model maps many user-level threads to one kernel thread. It is the model used by green threads. Green threads are scheduled by a Virtual Machine instead of natively by the OS. They emulate multithreaded environments without relying on any native OS capabilities. These user-level threads are lightweight and very efficient. They are an easy way to achieve parallelism in a program. An application can have as many user-level threads as it needs, but true concurrency is not achieved because the kernel can schedule only one thread at a time. The developers can create as many user threads as necessary. However, there are two main disadvantages:
- If a thread makes a blocking system call, the entire process will block.
- If you are running on a multi-core machine, the multiple threads are unable to run in parallel.
Many-to-Many
This model multiplexes many user-level threads to a smaller or equal number of kernel threads. This model has the best of both worlds. Developers can create as many user threads as they need. When a user thread performs a blocking system call, the kernel can schedule another thread for execution and the kernel threads can run in parallel on multi-core systems.
Benefits of Multi-threading
There are many benefits of multithreaded programming, including
Resource sharing
Threads share memory, code and process resources. Allocating memory and resources for process creation is costly and time consuming. It is more economical to create threads than new processes. If there are several different threads of activity within the same address space, applications can benefit.
Responsiveness
If an interactive application is multithreaded, this may allow it to continue running, even if part of it is blocked doing IO or performing a lengthy operation. This will increase responsiveness to the user. One thread of a web browser can display images or text, while another thread retrieves data from the network. Many clients can concurrently access a busy web server. If the web server ran as a traditional single-threaded process, it could service only one client at a time.
Potential Problems of Multi-threading
Many threaded programs have shared resources. More than one thread may need to access these resources. These programs must implement a form of mutual exclusion to ensure that only one of the threads gets access to the resource at a time. Access to these shared resources must execute in code called critical sections. Use hardware or software implementations to protect these critical sections.
Java Threads
The Java programming language has a thread library for creating and managing threads at the user-level. The Java virtual machine (JVM) manages the mapping of the user threads to the operating system kernel threads. The particular mapping model in use depends on the operating system on which the Java program runs. Java’s thread scheduler monitors all the threads running in Java programs. It decides which threads to run at any given time, and when to switch between threads based on the thread’s priority. The Java scheduler uses either preemptive or non-preemptive scheduling based on the operating system. With preemptive scheduling, the scheduler gives each thread a constant period of time to run, after which time the thread is suspended() to allow the next thread to resume() running. With non-preemptive scheduling, the running thread runs until the thread completes or until it issues a yield() to allow other threads to run. To create a Java thread, either have your Java class extend the Java Thread class, or have your Java class implement the Runnable interface and code a run() method. The two options are shown in Figures 1 and 2. The preferred method for creating a thread is to implement the Runnable interface. It is more flexible and useful in complex applications. Java does not support multiple inheritance, but a class may implement multiple interfaces. By using the interface method, the Client class is open to inherit another class, if the need arose in the future.
Figure 1: Client Extends Thread Class
public class Client extends Thread { public void run() { System.out.println(“Client thread”); } }
public class Server { public static void main(String args[]){ Thread runner = new Client(); runner.start(); System.out.println(“Server thread”); } }
When the Java code executes the Server class main method in Figure 1, it instantiates a new Client, which is a worker thread. The Client code inside the run() method executes when the start() message is issued. It is not clear whether the "Client thread" message will be printed before the "Server thread" message. It is a timing issue.
Figure 2: Client Implements Runnable Interface
public class Client implements Runnable { public void run() { System.out.println(“Client thread”); } } public class Server { public static void main(String args[]) { Thread runner = new Thread(new Client()); runner.start(); runner.join(); System.out.println(“Server thread”); } }
There are situations when the main thread can continue only after a worker thread has completed its work. To do this the main thread executes the join() method for that thread. When the Java code executes the Server class main method in Figure 2, it instantiates a new Client that has implemented the Runnable interface. The Client code inside the run() method executes when the start() message is issued. However, in this case, the "Client thread" message is printed before the "Server thread" message, because of the addition of the join() message. Java threads can be asynchronously terminated using the stop() method. However, Java has deprecated this method and discourages its use. Instead, the target thread should periodically check whether it should terminate by using the interrupt() method.
States of Java Threads
New
A thread is in this state when the thread object is first created with the new() method.
Runnable
Calling the start() method allocates memory for the new thread in the JVM and calls the run() method for the thread object.
Blocked
A thread becomes blocked if it performs a blocking statement, such as doing I/O or if it invokes a sleep() method.
Dead
A thread moves to the dead state when its run() method terminates.
Mutual Exclusion
The Java language implements mutual exclusion in a couple of ways. There exists a MutualExclusion interface. The Client class can implement this interface and code the methods, enteringCriticalSection() and leavingCriticalSection(), as shown in Figure 3. The yield() method tells the thread scheduler to allow another thread to run.
Figure 3: Client Implements MutualExclusion Interface
public class Client implements MutualExclusion{ public Client(){ flag0 = false; flag1 = false; turn = TURN_0; } public void enteringCriticalSection(int t) { int other = 1 - t; if (t == 0) { flag0 = true; turn = other; while ((flag1 == true) && (turn == other)) Thread.yield(); } else { flag1 = true; turn = other; while ((flag0 == true) && (turn == other)) Thread.yield(); } } public void leavingCriticalSection(int t) { if(t == 0) flag0 = false; else flag1 = false; } private volatile int turn, boolean flag0, flag1; }
A simpler method for implementing mutual exclusion is through the use of a semaphore. A semaphore is a variable that is accessed only through two standard operations: acquire() and release(). The thread acquires the semaphore immediately before entering the critical section of an object, and it releases the semaphore after the critical code completes. Figure 4 shows an example of this code. As napping is a critical operation for all programmers (after those all nighters), this Client thread places his 'nap' between the semaphore acquire() and semaphore release() messages, in order to not be disturbed.
Figure 4: Client Has Semaphore
public class Client implements Runnable { public Client (Semaphore sem) { this.sem = sem; } public void run() { sem.acquire(); SleepUtilities.nap(3000); sem.release(); } private Semaphore sem; }
Ruby Threads
At the present time the Ruby programming language just supports green threads. The Ruby interpreter totally manages these threads. The threads, as well as the thread scheduler run on the same single operating system thread. Ruby can support thread methods, such as stop() and kill(), that are not advised for kernel threads. However, Ruby threads suffer from the drawbacks of non-native threads. They can only run on one processor in a multiprocessor environment. It is possible for a single thread to cause the whole process to deadlock, if poorly designed. The worse problem with Ruby threads is with I/O. Network access can block a process for a long time, for instance. Using non-blocking I/O, one can solve this problem. The DNS lookup system call has a threading issue, not solvable using non-blocking I/O. Ruby solves this problem with their resolv library. See Figure 5 for an example of the code that handles this issue. Despite some of the green thread concerns, for most situations, the benefits of efficiency can far outweigh the disadvantages.
Figure 5: Ruby DNS Lookup Implementation
require ‘socket’ require ‘resolv-replace’ count = 0 Thread.critical = true thread = Thread.new { Thread.pass; loop { count += 1; } } IPSocket.getaddress(www.ruby-lang.org) count
You can create a Ruby thread using Thread.new() {block}. Ruby passes the arguments given in the new() method to the block code that the thread executes. See Figure 6 for an example of creating a Ruby thread. The Ruby pass() method is analogous to the Java yield() method. It tells the thread scheduler to pass execution to another thread. Ruby has a join() method that performs the same functionality as the Java join(). There are run() and wakeup() methods that wake up a sleeping thread, either giving the thread control or indicating that it is ready to be scheduled.
Figure 6: Creating Ruby Threads
threads = [] 4.times do |number| threads << Thread.new(number) do |i| print “#{i}\n” end end threads.each { |t| t.join }
The output from this code is: 1
2 3 4
States of Ruby Threads
Run
The thread is executing.
Sleep
The thread is sleeping or waiting on I/O.
Aborting
The thread is aborting (has been killed).
False
The thread has terminated normally.
Nil
The thread has terminated with an exception.
Mutual Exclusion
Ruby handles mutual exclusion with the Monitor class. With a monitor you place critical section code in a synchronize block that prevent access to the resource by another thread until the critical section completes. See Figure 7 for an example of how to implement a monitor. Both the 't1' and 't2' threads issue the 'tick' message to update the @count instance variable. Only one of the threads can be in the synchronize block at a time. If you code the tick() method without this block, the tick counts would be incorrect. Updating the count variable takes more than one hardware instruction. With both threads doing the same set of instructions simultaneously, the groups of hardware instructions incrementing count are interleaved causing the counts to be off. Synchronization prevents this from happening. This implementation of mutual exclusion in Ruby is almost identical to the use of semaphores in Java.
Figure 7: Ruby Monitor
require 'monitor' class Counter < Monitor attr_reader :count def initialize @count = 0 super end def tick synchronize do @count += 1 end end end
c = Counter.new t1 = Thread.new { 1000.times { c.tick } } t2 = Thread.new { 1000.times { c.tick } } t1.join; t2.join c.count
The output of this code is: 2000
Java vs Ruby Threads
Java thread initialization is more involved than Ruby thread initialization. Ruby threads also have the advantage of a “quick and dirty” thread creation, since it only takes couple of lines of code. A Ruby thread shares all global, instance and local variables that are in existence at the time the thread starts. A Java thread can share these variables, as well, depending on how the thread was created. If the client thread class uses inheritance, any variables needed are passed along when creating the thread. If the thread is created through the interface, the variables can be shared as long as the class has been initialized and the variables in use.
Ruby’s green threads are completely portable, as they don’t rely on the OS. But, on a multi-core processor, native thread implementations can assign work to multiple processors while green threads cannot. In this environment native threads have a huge advantage as more work is done by the native threads.
Java and Ruby Methods Comparison
Table 1 lists thread actions and Java and Ruby’s equivalent methods of achieving them.
Table 2 lists thread actions available in Java, but Ruby doesn’t have a method for the action.
Table 3 lists thread actions available in Ruby, but Java doesn’t have a method for the action.
There are many similarities between the Java and Ruby Thread classes, as shown with the equivalent methods in Table 1. But, as you can see, there are also many methods in Java that Ruby doesn't support and vice versa. Some methods like activeCount(), where it returns the number of active threads, are trivial, but there is a group of methods existing in both Java and Ruby that the other doesn't support. For Java, it is exception handling with getting/setting exceptions, while in Ruby, it is more thread control. Ruby has exception methods, but they are geared more toward events prior to the exception, like making sure all threads abort when exception starts and firing an exception to destroy threads. Java is more about setting the exceptions and getting them to change accordingly. So the Java Thread class has a little bit more exception handling than Ruby.
Ruby has more thread control for the programmer such as pausing, terminating, aborting and resuming threads. Java did have these functions, but they all have been deprecated because, although they give the user more control, they also can cause deadlock problems. So it is a tradeoff between more thread control and less chance for deadlocks to happen. Java does use interrupt to "pause" the thread as opposed to actually pausing the thread as in Ruby. Also, Java threads "sleep" differently than Ruby threads. Java sleeps are timed, while Ruby's sleep just stops the thread indefinitely, until it gets woken up by "thr.wakeup". As you can see Ruby's thread methods are more flexible, but potentially more dangerous than Java’s.
Basically, Java and Ruby both have similar thread functionalities. In some respects, Java is simpler by reducing user error, while Ruby gives the programmer more power and perhaps more headaches. If a user wants more control, Ruby threading is great, but for simplicity Java is better. The main distinction between Ruby threads and Java threads is that currently Ruby just supports green threads (as did Java 1.1) and Java has support for kernel threads. Currently, there is no clear “winner” between native and green threads in a uni-processor system, but Java threads definitely have the edge over Ruby in a multi-processor environment.
Ruby is moving from green threads to kernel threads in Ruby 1.9 or 2.0, which are still development releases. YARV has been integrated as the new Ruby VM. YARV will give Ruby kernel thread support.
Java vs. Ruby Thread Efficiency
In order to test the efficiency of Java and Ruby threads two programs were written in each language. The first one reads in the contents of five text files, one for each of five threads, searches for a given word in each and counts each occurrence of the word. This count is added to a total count for all five files. Each file is passed to its own thread to run. The main thread has the total count, which is updated using mutual exclusion techniques in each language. The second program was modified from the first. Instead of searching for words in files, a string was passed to each thread. The times it took each to run each program depended on the size of the files, to a small extent and on the programming language to a larger extent. The results are recorded in Table 4. The file versions of these programs are given in Figures 8 and 9. With the small file, the Ruby code ran about as fast as the Java code. With the large file, the Java code ran five times faster than Ruby. This is evidence that when doing file I/O, Java's native threads are much more efficient. The string versions of these programs are not included. Both the Java code and the Ruby code processed the strings in approximately the same about of time. It was expected that Ruby code would outperform Java with this version.
File Sizes | Ruby Times | Java Times |
---|---|---|
10KB | 16 millisecs | 15 millisecs |
80KB | 119 millisecs | 20 millisecs |
String Sizes | Ruby Times | Java Times |
10KB | 15 millisecs | 16 millisecs |
60KB | 20 millisecs | 22 millisecs |
Figure 8: Java Word Counts Code
public class WordCounts { protected static int wordCount; private class Semaphore { private int mutex; public Semaphore () { mutex = 1; } public synchronized void acquire() throws InterruptedException { while (mutex == 0) { wait(); } mutex--; } public synchronized void release() { mutex++; notifyAll(); } } private class WordCountsThread implements Runnable { private String word; private String file; private int count; private Semaphore total; public WordCountsThread (String word, String file, Semaphore total) { this.word = word; this.file = file; this.total = total; count = 0; } public void run(){ readWordFile(); try { total.acquire(); wordCount = wordCount + count; total.release(); } catch (InterruptedException e) { System.out.println (e); } } private void readWordFile (){ } // end method } //end inner class: WordCountThread public WordCounts (String searchWord, String [] files) { int numFiles = files.length; Semaphore total = new Semaphore(); Thread [] threads = new Thread [numFiles]; for (int i = 0; i < numFiles; i++ ) { threads[i] = new Thread (new WordCountsThread (searchWord,files[i], total)); } for (int j = 0; j < numFiles; j++ ) { threads[j].start(); } try { for (int i = 0; i < numFiles; i++ ) { threads[i].join(); } } catch (InterruptedException e) { System.out.println (e); } } public void displayTotal (String searchWord) { System.out.println ("There are " + wordCount + " total occurrences of " + searchWord); } public static void main (String [] args) { String [] files; files = new String [5]; files [0] = "Test1.txt"; files [1] = "Test2.txt"; files [2] = "Test3.txt"; files [3] = "Test4.txt"; files [4] = "Test5.txt"; String searchWord = "the"; long time = System.currentTimeMillis (), time_prev = time; WordCounts wc = new WordCounts (searchWord, files); wc.displayTotal (searchWord); time = System.currentTimeMillis (); System.out.println ("Diff " + (time - time_prev) + " msecs"); System.exit(0); } }
Sample output from a run of this code is:
There are 35 total occurrences of the Diff 15 msecs
Figure 9: Ruby Word Counts Code
class WordCounts < Monitor attr_reader :wordCount, :numFiles, :searchWord def initialize super end # initialize def def initVars(files, searchWord) @wordCount = 0 @files = files @searchWord = searchWord end # initVars def countWordsInFiles threads = [] for fileToFetch in @files threads << Thread.new(fileToFetch) do |file| count = readWordFile(file) updateTotal(count) end end threads.each {|t| t.join} end # countWordsInFiles def readWordFile(file) ... return count end # readWordFile def def displayTotal puts ("\nThere are " + @wordCount.to_s + " total occurrences of '" + @searchWord[0] + "'") end # displayTotal def def updateTotal (count) synchronize do @wordCount += count end end # updateTotal def end # WordCounts class def files = %w( Test1.txt Test2.txt Test3.txt Test4.txt Test5.txt) searchWord = %w( public ) wc = WordCounts.new wc.initVars(files, searchWord) ustart = Time.now.usec wc.countWordsInFiles wc.displayTotal t = Time.now uend = Time.now.usec puts ("Start: " + ustart.to_s + " usecs") puts ("End: " + uend.to_s + " usecs") puts ("Diff: " + ((uend - ustart)/1000.0).to_s + " msecs")
Sample output from a run of this code is:
There are 35 total occurrences of 'the' Diff 16 msecs
References
The following are links to more detailed information about threads, processes, and mutual exclusion:
http://en.wikipedia.org/wiki/Thread_(computer_science)
http://en.wikipedia.org/wiki/Green_threads
http://en.wikipedia.org/wiki/Mutual_exclusion
http://www.bitwiese.de/2007/09/on-processes-and-threads.html
http://www.cs.auckland.ac.nz/compsci340s2c/lectures/lecture05.pdf
http://www.ruby-doc.org/core/classes/Thread.html
The following are links to more detailed information about Ruby threads:
http://rubylearning.com/satishtalim/ruby_threads.html
http://snippets.amanzi.org/2007/07/3-ruby-threads.html
http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_threads.html
http://phrogz.net/ProgrammingRuby/ref_c_thread.html
http://www.infoq.com/news/2007/05/ruby-threading-futures
http://expressica.com/2008/04/26/new-in-ruby-19-threads/
The following are links to more detailed information about Java threads:
http://www.javaworld.com/javaworld/jw-04-1996/jw-04-threads.html