CSC/ECE 517 Summer 2008/wiki1 1 mf
Regular expressions in Ruby versus Java
Ruby supports regular expressions as a language feature, without the inclusion of any special classes or modules. Java, on the other hand, does not have native regular expression support; it requires special regular expression packages to use them.
Ruby regexp support in more depth
Since Ruby borrows its regular expression syntax from Perl, a regular expression is nothing more than the simple syntax of:
/pattern/modifiers
This literal alone is all that is needed to create an instance of Ruby's regular expression class, Regexp.
Java regexp support in more depth
Support for regexp
Java has been around for a while but has never had native regexp support. Because of this, regular expression packages had to be created. There was no comprehensive regexp support from Java's main contributor, Sun, until Java 1.4 (J2SE 1.4). As a result, there are multiple regexp packages for Java floating around, most of them from third parties:
- java.util.regex: Now the most widely used package for regular expressions, due to its inclusion in the JDK since Java 1.4. From this point forward, this document assumes that this package is used for regular expressions in Java.
- Jakarta: Around since 1996, Jakarta was donated to the Apache Software Foundation and is under an open-source, BSD-style license.
- dk.brics.automaton: Automaton has a reputation for being the fastest of the Java regexp implementations.
- And the list goes on...
Classes with regular expression abilities
String class
The String class provides simple regular expression support through methods such as matches, replaceAll, replaceFirst, and split. It is the quickest way to write code to do matching, replacement, or splitting on a string. However, these methods generally compile the regular expression again on every call, so they should not be used if performance is a factor. The String class's regular expression matching also has a severe limitation: any regular expression passed to matches is interpreted as if it has to span the whole string, i.e. as if ^ were prepended to the front and $ appended to the tail of your expression.
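The whole-string behaviour of matches is easy to trip over, so a small Java sketch may help. The class and method names here (StringRegexDemo, wholeMatch, containsMatch) are made up for illustration; only the String methods themselves come from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative, not part of any library.
final class StringRegexDemo {
    // String.matches succeeds only if the pattern covers the entire string.
    static boolean wholeMatch(String input, String regex) {
        return input.matches(regex);
    }

    // One way to get "contains" behaviour from matches: pad the pattern with .*
    // ((?s) lets . also match line terminators).
    static boolean containsMatch(String input, String regex) {
        return input.matches("(?s).*(?:" + regex + ").*");
    }

    public static void main(String[] args) {
        String line = "order 66 shipped";
        System.out.println(wholeMatch(line, "\\d+"));            // false
        System.out.println(containsMatch(line, "\\d+"));         // true
        System.out.println(line.replaceAll("\\d+", "#"));        // order # shipped
        System.out.println(Arrays.toString(line.split("\\s+"))); // [order, 66, shipped]
    }
}
```

Note that replaceAll and split search inside the string; only matches is implicitly anchored at both ends.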
Pattern class
The Pattern class is a compiled representation of a regular expression. A regular expression is input as a string and then compiled so that it can be used repeatedly by the Matcher class, or a single time to provide a single match. Compiling a pattern for just one match is a fairly inefficient use of the class.
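As a rough illustration of compiling once and reusing the result, the sketch below keeps a Pattern in a constant and applies it to several inputs. The ZIP pattern, the class name, and the sample inputs are assumptions chosen for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; ZIP and countValid are illustrative names.
final class PatternReuseDemo {
    // Compiled once; Pattern instances are immutable and safe to share.
    private static final Pattern ZIP = Pattern.compile("\\d{5}(-\\d{4})?");

    // Counts the candidates that are, in their entirety, a 5-digit or 5+4 code.
    static int countValid(String[] candidates) {
        int valid = 0;
        for (String candidate : candidates) {
            if (ZIP.matcher(candidate).matches()) {
                valid++;
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        String[] inputs = {"27695", "27695-8206", "2769", "abcde"};
        System.out.println(countValid(inputs)); // 2

        // One-off convenience method; it compiles the pattern on every call.
        System.out.println(Pattern.matches("\\d{5}", "27695")); // true
    }
}
```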
Matcher class
The Matcher class is an engine that performs match operations on a string by interpreting a Pattern. A Matcher is able to do matching and replacement. It can also return all matches in a string, but only one match at a time: each call to find advances to the next match.
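A Matcher offers three ways to test a string, which differ in how much of the input the pattern must cover. The following sketch contrasts them; the class and method names are invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting Matcher.matches, lookingAt and find.
final class MatcherModesDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // True only if the entire input is digits.
    static boolean whole(String s) {
        return DIGITS.matcher(s).matches();
    }

    // True if the input starts with digits.
    static boolean prefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // True if digits occur anywhere in the input.
    static boolean anywhere(String s) {
        return DIGITS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(whole("42abc"));    // false
        System.out.println(prefix("42abc"));   // true
        System.out.println(anywhere("abc42")); // true
    }
}
```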
Examples
Match a pattern
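A minimal Java sketch of matching a pattern with java.util.regex and reading a captured group. The key=value pattern, the class name, and the sample text are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; PAIR and firstValue are illustrative names.
final class MatchDemo {
    // Group 1 captures the key, group 2 the numeric value.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\d+)");

    // Returns the value of the first key=value pair in the text, or null if none.
    static String firstValue(String text) {
        Matcher m = PAIR.matcher(text);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstValue("width=80, height=24")); // 80
        System.out.println(firstValue("no pairs here"));       // null
    }
}
```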
Search for text and replace
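Search and replace in Java can be sketched with Matcher.replaceAll, where $1, $2, and so on in the replacement string refer to captured groups. The date pattern and the names below are assumptions for illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; US_DATE and toIso are illustrative names.
final class ReplaceDemo {
    // Captures month, day and year of a MM/DD/YYYY date.
    private static final Pattern US_DATE = Pattern.compile("(\\d{2})/(\\d{2})/(\\d{4})");

    // Rewrites every MM/DD/YYYY date in the text as YYYY-MM-DD.
    static String toIso(String text) {
        return US_DATE.matcher(text).replaceAll("$3-$1-$2");
    }

    public static void main(String[] args) {
        System.out.println(toIso("Due 06/03/2008 and 06/10/2008"));
        // Due 2008-06-03 and 2008-06-10
    }
}
```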
Collect matches
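Because a Matcher hands back one match at a time, collecting every match in Java usually means looping over find. A minimal sketch, with invented names and sample text:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; WORD and allMatches are illustrative names.
final class CollectDemo {
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    // Gathers every run of letters in the text, in order of appearance.
    static List<String> allMatches(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {          // each call advances to the next match
            found.add(m.group());   // group() is the text of the current match
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(allMatches("Ruby 1.8 vs Java 6")); // [Ruby, vs, Java]
    }
}
```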
References
- Ruby Regexp Class - Regular Expressions in Ruby
- Using Regular Expressions in Java
- java.util.regex
- Jakarta
- dk.brics.automaton
-- Michael Frisch (Tuesday, June 3, 2008)