CSC/ECE 517 Summer 2008/wiki1 1 rp
Regular Expressions in Ruby vs. Java
Ruby and Java both support regular expressions, but Ruby's dynamic typing and built-in regular expression syntax generally allow equivalent or similar tasks to be performed more simply and with less code.
General Differences
Ruby can perform most regular expression tasks using a combination of the String and Regexp classes. The String class has several methods that take a Regexp as a parameter, and the Regexp class likewise has methods that take a String as a parameter. Ruby also provides a literal shorthand for defining regular expressions, a pattern enclosed in forward slashes: Regexp.new('test') and /test/ are equivalent.
Java, by contrast, has no regular expression literal or operator syntax built into the language. The String class has a few methods that accept regular expressions, such as matches(), but they do not always behave the way someone used to other regular expression tools would expect. Full regular expression support is available in Java through several packages, most notably java.util.regex, Sun's standard package, available since Java 1.4. This package provides two classes, Pattern and Matcher, which are used respectively to define a regular expression and to apply it to input. These classes work in conjunction with the String class to perform regular expression operations.
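As a minimal sketch of how Pattern and Matcher fit together, a regular expression is compiled into a Pattern, which then produces a Matcher bound to a particular input string. The class name RegexBasics and the helper containsMatch are illustrative names chosen for this example, not part of java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexBasics {
    // Hypothetical helper: returns true if the regex matches anywhere in the input.
    static boolean containsMatch(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);  // define the regular expression
        Matcher matcher = pattern.matcher(input);  // bind it to an input string
        return matcher.find();                     // search for a match anywhere in the input
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("test", "a test string")); // true
        System.out.println(containsMatch("test", "no match here")); // false
    }
}
```

Compare this with Ruby, where the same check needs no explicit compile step or intermediate objects.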
Code Example: find a regular expression match within a String
In Ruby, there are two simple ways to do this; the main difference between them is that one is a String method and the other is a Regexp method. First is the String method, the =~ operator, which returns the index in the string at which the pattern first matches, or nil if there is no match.
"This is a string" =~ /is/
>> 2
"This is a string" =~ /hello/
>> nil
An equivalent way to do this is with the Regexp method match(). Note that match() returns a MatchData object if successful.
/is/.match("This is a string")
>> #<MatchData:0x5e1715c>
/hello/.match("This is a string")
>> nil
In Java, it is possible to do a simple version of this using only the String.matches() method. However, matches() returns a boolean, so we lose the index information that Ruby's =~ provides.
String str = "This is a string";
str.matches("is"); // does not do what you expect. This will return false.
str.matches(".*is.*"); // will return true
In the above code, it is important to realize that matches() returns true only if the entire string, from beginning to end, matches the pattern. In other words, it behaves as if the input regular expression "regexp" were actually "^regexp$". To match a substring rather than the entire string, a workaround such as the third line above must be used.
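To recover the index that Ruby's =~ returns, one option is to use Pattern and Matcher directly: Matcher.find() searches for a match anywhere in the input, and Matcher.start() reports where that match begins. The following is a sketch only. The class name FirstMatchIndex and the helper indexOfMatch are hypothetical, and returning -1 when nothing matches is a convention assumed here, because a Java int cannot be nil.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstMatchIndex {
    // Hypothetical helper: index of the first match, or -1 if the pattern is not found.
    static int indexOfMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // find() looks for the pattern anywhere in the input, unlike String.matches()
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        String str = "This is a string";
        System.out.println(indexOfMatch("is", str));    // 2
        System.out.println(indexOfMatch("hello", str)); // -1
    }
}
```

Unlike the ".*is.*" workaround, this approach leaves the pattern unchanged and keeps the position of the match available.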