CSC/ECE 517 Fall 2009/wiki1b 5 j8: Difference between revisions

From Expertiza_Wiki
Revision as of 01:30, 21 September 2009

Regular Expressions

Compare the support for regular expressions in Ruby, Python, PHP, Perl 5, and any other appropriate scripting language with each other. Also compare the syntactic features in these languages with Java's package-based support. What features and syntax do the languages have in common? Are some features supported by some languages and not by others? How robust and easy-to-use are regular expressions in all these languages?

Regular expressions are a critical part of most modern programming languages, especially those that treat string processing as a core part of their functionality. Although the way regular expressions are used changes from language to language, the general principle is the same, and similar syntax can generally be used across the board.


Usage

Perl

Perl has regular expressions built into the language itself via the '=~' operator. A simple match could be done like this:

$string = "lee";
if ($string =~ m/l.*/) {
   print "Matches";
}

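This would print "Matches" since 'lee' contains an 'l' followed by zero or more characters. Note that the pattern is not anchored, so it would match an 'l' anywhere in the string; use '^l.*' to require the 'l' at the start. Replacements can be done simply by using 's' to indicate substitutions: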

$string = "peewee";
$string =~ s/e+/aa/g;
print $string;

This would print "paawaa". The 'g' following the regular expression indicates a global replacement. Omit it to replace only the first match of 'e+', which would print "paawee".

Java

Unlike many scripting languages, Java does not have built-in language support for regular expressions. It instead uses the Pattern and Matcher classes from the java.util.regex package to process regular expressions.

    Pattern patt = Pattern.compile("l.*");
    Matcher match = patt.matcher("lee");
    return match.matches();

This would return true. Since the Pattern object is created with the regular expression, it can be reused with different inputs for increased speed.


    Pattern patt = Pattern.compile("l.*");
    Matcher match = patt.matcher("eel");
    return match.matches();

This would return false since 'eel' does not start with an 'l'. Note that matches() succeeds only if the entire input matches the pattern. If a developer wants to use a regular expression only once and does not care to reuse the Pattern, he or she can use the static 'matches' method of Pattern:

   Pattern.matches("l.*", "lee");

or call matches directly on the String:

  String str = "lee";
  str.matches("l.*");

Replacements are done using:

 String str = "peewee";
 str = str.replaceAll("e+", "aa");

This would change the string 'peewee' to 'paawaa' by replacing each run of one or more 'e's with two 'a's. Because Java strings are immutable, replaceAll returns a new string, which is assigned back to str here. If you just wanted to replace the first instance, you would use replaceFirst:

 String str = "peewee";
 str = str.replaceFirst("e+", "aa");

which would change the string to 'paawee'.

Ruby

Ruby's support for regular expressions is very similar to Perl's, but with some differences. Matches are done in the exact same manner:

str = "lee"
if (str =~ /l.*/)
   print "Matches"
end

This would print "Matches".

Substitutions are one point where Ruby greatly differs from Perl. Instead of using the "s/regex/replace/" format, the methods sub, gsub, sub!, and gsub! can be called on any string. sub and gsub simply return a new string with the specified substitution, whereas sub! and gsub! do an in-place substitution. gsub differs from sub in that it does a global replacement instead of simply replacing the first instance.

str = "peewee"
print str.gsub(/e+/, "aa")

would print 'paawaa'.

Python

Python, like Java, does not have built-in language support for regular expressions. It does, however, provide support for them through its standard library, in the 're' module. A simple match test can be done as follows:

import re
if re.match("l.*", "lee"):
   print("Match")

The above would print "Match". For substitution, Python uses the "sub" function:

re.sub("e+", "aa", "peewee")

This would return "paawaa". To replace only the first match of 'e+', you would pass the optional count argument with a value of 1:

re.sub("e+", "aa", "peewee", count=1)

This would return "paawee". The count argument tells the sub function to substitute only the first match.
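The Java section notes that a compiled Pattern can be reused across inputs. A minimal sketch of the same idea in Python follows, assuming Python 3; the names LEADING_L, E_RUN, starts_with_l and contains_l are hypothetical and chosen only for illustration. It also shows how re.match, re.search and re.fullmatch differ in where they anchor the pattern, which matters when comparing against Java's matches().

```python
import re

# Hypothetical names, for illustration only.
LEADING_L = re.compile(r"l.*")   # compiled once, reused below
E_RUN = re.compile(r"e+")        # one or more consecutive 'e' characters


def starts_with_l(text):
    """match() anchors only at the start of the string."""
    return LEADING_L.match(text) is not None


def contains_l(text):
    """search() scans the whole string for a match."""
    return LEADING_L.search(text) is not None


print(starts_with_l("lee"))   # True
print(starts_with_l("eel"))   # False: 'eel' does not start with 'l'
print(contains_l("eel"))      # True: the trailing 'l' is found by search()

# fullmatch() requires the entire string to match, like Java's matches().
print(re.fullmatch(r"l", "lee") is None)   # True: 'l' alone does not cover 'lee'

# The compiled pattern exposes sub() with the same optional count argument.
print(E_RUN.sub("aa", "peewee"))            # paawaa
print(E_RUN.sub("aa", "peewee", count=1))   # paawee
```

Compiling is optional for one-off checks, since the module-level functions accept a pattern string directly, but a compiled object makes the reuse explicit.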

PHP

PHP also does not have built-in language support for regular expressions. To do a matching search, simply use the preg_match function:

if (preg_match("/l.*/", "lee")) echo "Match";

would print "Match". Note that the pattern must be enclosed in matching delimiters (here '/'). The preg_match function uses Perl-compatible regular expression syntax. Substitutions are done via the preg_replace function:

preg_replace( "/e+/" , "aa", "peewee")

This would return "paawaa". As in Python, if you provide an optional fourth argument (the limit) of 1, only the first match of the pattern is replaced, which would return "paawee".

Advanced Features

Unicode

All of the above support Unicode and internationalized strings. There are, however, some caveats:

  • Ruby did not have support until version 1.9.
  • Perl did not have support until version 5.6.
  • PHP supports it, but requires the use of a /u flag.
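To make the Unicode point concrete, here is a small sketch in Python. It assumes Python 3, where string patterns are Unicode-aware by default; this is an assumption about the reader's interpreter, not something stated above. The sample text is made up for illustration and is written with escapes so the code does not depend on the file's encoding.

```python
import re

# "naïve café", written with escapes (sample text, for illustration only).
text = "na\u00efve caf\u00e9"

# With str patterns in Python 3, \w matches accented letters as well.
unicode_words = re.findall(r"\w+", text)

# The re.ASCII flag restricts \w to [a-zA-Z0-9_], splitting at accented letters.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)   # ['naïve', 'café']
print(ascii_words)     # ['na', 've', 'caf']
```

The difference between the two results is the kind of behavior the /u flag controls in PHP: without Unicode awareness, character classes such as \w only cover ASCII letters.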

References

http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html
http://www.tutorialspoint.com/ruby/ruby_regular_expressions.htm
http://docs.python.org/library/re.html
http://yokolet.blogspot.com/2008/09/ruby-19s-unicode-regular-expression.html
http://www.regular-expressions.info/unicode.html