CSC/ECE 517 Summer 2008/wiki1 1 rp

Regular Expressions in Ruby vs. Java

Ruby and Java both support regular expressions, but Ruby's dynamic typing and built-in regular expression syntax generally let it perform equivalent or similar tasks more simply and with less code.

General Differences

Ruby can perform most regular-expression-related functions using a combination of the String and Regexp classes. The String class has several methods that take a Regexp as a parameter, and similarly the Regexp class has methods that take a String as a parameter. Ruby also provides a literal shorthand for defining regular expressions, a pattern surrounded by forward slashes: Regexp.new('test') and /test/ are equivalent.

Java has no regular expression literal; patterns are written as ordinary strings. The String class has a few convenience methods that accept a regular expression, such as matches(), replaceAll() and split(), but they do not always behave the way users of other regular expression tools expect (matches(), for example, succeeds only when the entire string matches). Full regular expression support is available in Java through several packages, most notably java.util.regex, which is Sun's standard package, available in Java 1.4+. This package provides two main classes: Pattern, which defines a compiled regular expression, and Matcher, which applies it to an input. These classes work in conjunction with the String class to perform regular expression functions.
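As a rough illustration of how String, Pattern and Matcher work together, the sketch below performs the same replacement twice, once through a String convenience method and once through Pattern and Matcher directly, and also splits a string on a pattern. The class name StringRegexShortcuts and its method names are made up for this example; only replaceAll(), split(), Pattern.compile() and Matcher.replaceAll() come from the standard library.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class StringRegexShortcuts {
    // Replace every run of digits with '#', using the String convenience method.
    static String maskDigits(String input) {
        return input.replaceAll("\\d+", "#");
    }

    // The same operation written against Pattern and Matcher directly.
    static String maskDigitsWithPattern(String input) {
        Pattern digits = Pattern.compile("\\d+");
        return digits.matcher(input).replaceAll("#");
    }

    // Split on commas, ignoring any whitespace around them.
    static String[] splitOnCommas(String input) {
        return input.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("a1b22c"));                     // a#b#c
        System.out.println(maskDigitsWithPattern("a1b22c"));          // a#b#c
        System.out.println(Arrays.toString(splitOnCommas("a, b,c"))); // [a, b, c]
    }
}
```

Note that backslashes are doubled because the pattern is an ordinary Java string literal. Compiling a Pattern once and reusing it is usually preferable when the same expression is applied to many inputs.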

Code Example: find a regular expression match within a String

In Ruby, there are two simple ways to do this; the main difference between them is that one is a String method and the other is a Regexp method. First, the String method: the =~ operator returns the index in the string at which the pattern first matches, or nil if there is no match.

 "This is a string" =~ /is/
 >> 2
 "This is a string" =~ /hello/
 >> nil

A similar way to do this is with the Regexp method match(). Note that match() returns a MatchData object rather than an index if successful, and nil otherwise.

 /is/.match("This is a string")
 >> #<MatchData:0x5e1715c>
 /hello/.match("This is a string")
 >> nil

In Java, a simple version of this is possible using only the String.matches() method. However, matches() returns a boolean, so we lose the index information that Ruby's =~ provides.

 String str = "This is a string";
 str.matches("is"); // does not do what you expect. This will return false.
 str.matches(".*is.*"); // will return true

In the above code, it is important to realize that matches() returns true only if the entire string, from beginning to end, is a match. In other words, it behaves roughly as if the input regular expression "regexp" were actually "^regexp$". To match a substring as well as the entire string, a workaround such as the pattern on the last line above must be used.
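To recover the index that Ruby's =~ returns, one option is to use Matcher.find(), which searches for the pattern anywhere in the input, together with Matcher.start(). The following is a minimal sketch; the class and method names are hypothetical, and returning -1 for "no match" (where Ruby returns nil) is an assumption borrowed from the convention of String.indexOf().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the standard library.
class FirstMatchIndex {
    // Returns the index of the first match of regex in input, or -1 if there is none.
    static int firstMatchIndex(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        // find() looks for a match anywhere in the input, unlike matches().
        return matcher.find() ? matcher.start() : -1;
    }

    public static void main(String[] args) {
        String str = "This is a string";
        System.out.println(firstMatchIndex(str, "is"));    // 2
        System.out.println(firstMatchIndex(str, "hello")); // -1
        System.out.println(str.matches("is"));             // false, whole string must match
    }
}
```

Because find() does not require the whole string to match, no ".*" padding is needed around the pattern.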

Code Example: collecting regular expression matches within a String in an array of Strings

In Ruby, a simple way to collect regular expression matches from a String is to use the String.scan() method, which takes a regular expression as a parameter and returns an array of the matched strings.

 matches = "This is a string".scan(/is/) 
 >> ["is", "is"]

A slightly more complex regular expression:

 matches = "This is a string".scan(/\w*i\w*/) # match any word with an i in it
 >> ["This", "is", "string"]

In Java, it is a bit more complicated. There are no shortcut methods as part of the String class to help us do what we want, and we need to make use of the Pattern and Matcher classes.

 import java.util.ArrayList;
 import java.util.List;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 String myString = "This is a string";
 Pattern myPattern = Pattern.compile("is");
 Matcher myMatcher = myPattern.matcher(myString);
 List<String> results = new ArrayList<String>(); // to store the resulting matches
 while (myMatcher.find()) { // find() finds the next match, evaluates to false if not found
     results.add(myMatcher.group()); // group() returns the matched string
 }
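For comparison with Ruby's scan(), the find()/group() loop can be wrapped in a small reusable method. This is only a sketch: the ScanDemo class and its scan() method are hypothetical names chosen for this page, not part of java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class that mimics Ruby's String scan for patterns without groups.
class ScanDemo {
    // Collects every non-overlapping match of regex in input, in order.
    static List<String> scan(String input, String regex) {
        List<String> results = new ArrayList<String>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {          // advance to the next match, if any
            results.add(matcher.group()); // the text of the current match
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(scan("This is a string", "is"));        // [is, is]
        System.out.println(scan("This is a string", "\\w*i\\w*")); // [This, is, string]
    }
}
```

With the helper in place, each call is about as short as the Ruby version, although the helper itself still has to be written and maintained.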

Links

http://www.regular-expressions.info/ruby.html
http://www.regular-expressions.info/java.html
http://www.javaworld.com/javaworld/jw-07-2001/jw-0713-regex.html?page=3
http://regex.info/java.html
http://www.blueskyonmars.com/2003/10/31/14-stringmatches-is-dumb/
http://www.ruby-doc.org/core/classes/MatchData.html
