CSC/ECE 517 Fall 2011/ch4 4c dm


"Lecture 6"

Regular Expressions

A regular expression provides a succinct and flexible means for specifying and recognizing strings of text, such as particular characters, words, or patterns of characters.<ref name="defobject">Definition of Regular Expression</ref> Regular expressions are usually referred to by abbreviations like "regex" and "regexp". The concept of regular expressions was first popularized by utilities provided by Unix distributions. A regular expression processor is a program that interprets a regular expression written in a formal language. It either serves as a parser generator or examines text and identifies the parts that match the provided specification.

We can demonstrate the power of regex with the following list of specifications that can be expressed using regular expressions:

  • the sequence of characters "far" appearing consecutively in any context, such as in "far", "farmer", or "defarge"
  • the sequence of characters "far" occurring in that order with other characters between them, such as in "filmstar"
  • the word "far" when it appears as an isolated word
  • the word "far" when preceded by the word "very" or "little"
  • the word "far" when not preceded by the word "went"
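As a rough sketch, the five specifications above could be written in Ruby as follows. The constant names and sample strings are purely illustrative, and the last pattern relies on negative lookbehind, which assumes Ruby 1.9 or later.

```ruby
# Illustrative patterns for the five "far" specifications (names are hypothetical).
ANYWHERE       = /far/                      # "far" in any context
IN_ORDER       = /f.*a.*r/                  # f, a, r in that order, other characters allowed between
WHOLE_WORD     = /\bfar\b/                  # "far" as an isolated word
AFTER_VERY     = /\b(?:very|little) far\b/  # "far" preceded by "very" or "little"
NOT_AFTER_WENT = /(?<!went )\bfar\b/        # "far" not preceded by "went"

# =~ returns the index of the first match, or nil when there is none.
p "defarge"     =~ ANYWHERE        # => 2
p "filmstar"    =~ IN_ORDER        # => 0
p "farmer"      =~ WHOLE_WORD      # => nil
p "so far away" =~ WHOLE_WORD      # => 3
p "very far"    =~ AFTER_VERY      # => 0
p "went far"    =~ NOT_AFTER_WENT  # => nil
p "ran far"     =~ NOT_AFTER_WENT  # => 4
```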

Ruby, built as a better Perl, supports regular expressions as a language feature, and its regex syntax is borrowed from Perl. In Ruby, a regular expression is written in the form /pattern/modifiers, where "pattern" is the regular expression itself and "modifiers" are a series of characters indicating various options.<ref name="rubyregex">Regex Modifiers</ref> The "modifiers" part is optional.

Ruby supports the following modifiers:

Modifier Description
/i makes the regex match case insensitive
/m makes the dot match newlines
/x tells Ruby to ignore whitespace between regex tokens
/o causes any #{...} substitutions in a particular regex literal to be performed just once, the first time it is evaluated. Otherwise, the substitutions will be performed every time the literal generates a Regexp object.
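A minimal sketch of the /i, /m and /x modifiers in action is given here. The sample strings and the phone variable are made up for illustration.

```ruby
# /i: case-insensitive matching
p "RUBY" =~ /ruby/     # => nil
p "RUBY" =~ /ruby/i    # => 0

# /m: the dot also matches a newline
p "a\nb" =~ /a.b/      # => nil
p "a\nb" =~ /a.b/m     # => 0

# /x: whitespace and comments inside the pattern are ignored
phone = /
  \d{3}   # three-digit prefix
  -       # separator
  \d{4}   # four-digit line number
/x
p "555-1234" =~ phone  # => 0
```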

Since forward slashes delimit the regular expression, any forward slashes that appear in the regex need to be escaped. E.g. the regex 3/4 is written as /3\/4/ in Ruby.
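Ruby also offers the %r{...} literal form, which avoids the need to escape forward slashes. The short sketch below compares the two spellings; the variable names and the URL are only examples.

```ruby
with_escapes = /3\/4/     # forward slash escaped inside /.../
with_percent = %r{3/4}    # same pattern, no escaping needed

p "3/4 cup" =~ with_escapes   # => 0
p "3/4 cup" =~ with_percent   # => 0

# %r is handy for patterns that contain many slashes, such as URLs.
# String#[] with a regex and a group number returns that capture group.
p "http://example.com/a/b"[%r{//([^/]+)/}, 1]   # => "example.com"
```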

Search And Replace

The sub() and gsub() methods of the String class can be used to search for and replace the first regex match or all regex matches, respectively. To use sub and gsub, we specify the regular expression we want to search for as the first parameter and the replacement string as the second parameter.
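Beyond a plain replacement string, sub and gsub also accept backreferences to capture groups and a block, and they have in-place variants sub! and gsub!. The following sketch uses made-up sample strings to show these forms.

```ruby
# Backreferences: \1 and \2 refer to the first and second capture groups.
p "John Smith".sub(/(\w+) (\w+)/, '\2, \1')   # => "Smith, John"

# Block form: the block's return value replaces each match.
p "cheese".gsub(/e/) { |match| match.upcase }  # => "chEEsE"

# sub and gsub return a new string; sub! and gsub! modify the receiver.
word = "cheese".dup
word.sub!(/e/, "o")
p word                                         # => "choese"
```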

The following statement replaces the first occurrence of 'c' in faculty with "".

   "faculty".sub(/c/, "")
   => "faulty"

The following statement replaces all occurrences of 'e' in cheese with 'o'

   "cheese".gsub(/e/, "o")
   => "chooso"

Character Classes

A character class is delimited with square brackets ([, ]) and lists characters that may appear at that point in the match. /[dm]/ means d or m, as opposed to /dm/, which means d followed by m. The following examples use character classes.

   /Pr[aeiou]gram/.match("Program") #=> #<MatchData "Program">
   "anecdote".sub(/[aeio]/, "u")
   => "unecdote"

If the first character of a character class is a caret (^), the class is inverted: it matches any character except those named. The following example shows the use of the caret.

   "anecdote".sub(/[^aeio]/, "z")
   => "azecdote"

In a character class, we can use the hyphen (-). This is a metacharacter denoting an inclusive range of characters. [pqrs] is equivalent to [p-s]. A range can follow another range, so [abcdpqrs] is equivalent to [a-dp-s]. The order in which ranges or individual characters appear inside a character class is irrelevant.

The following line replaces the first character of the given string that falls in the range a-y with z.

   "now is the time".sub(/[a-y]/, "z")
   => "zow is the time"

The following line replaces every character of the given string that falls in the range a-y with z.

   "now is the time".gsub(/[a-y]/, "z")
   => "zzz zz zzz zzzz"

The following table explains the different types of character classes:<ref name="characterclass">Character Classes</ref>

Class Match description
[0-9] Decimal digit character
[^0-9] Not a decimal digit character
[\s\t\r\n\f] Whitespace character
[^\s\t\r\n\f] Not a whitespace character
[A-Za-z0-9_] Word character (alpha, numeric, and underscore)
[^A-Za-z0-9_] Not a word character
[:alnum:] Alpha numeric ([A-Za-z0-9])
[:alpha:] Uppercase and lowercase letters ([A-Za-z])
[:blank:] Blank or tab character
[:space:] Whitespace characters
[:digit:] Decimal digit characters
[:lower:] Lowercase letters ([a-z])
[:upper:] Uppercase characters
[:print:] Any printable character, including space
[:graph:] Printable characters excluding space
[:punct:] Punctuation characters: any printable character excluding alphanumeric or space
[:cntrl:] Control characters (0x00 to 0x1F and 0x7F)
[:xdigit:] Hexadecimal digits ([0-9a-fA-F])
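The classes in the table can be tried out as in the sketch below. Note that in Ruby a POSIX name such as [:digit:] has to appear inside an enclosing character class, so it is written [[:digit:]]. The sample strings are hypothetical.

```ruby
# Explicit range versus the POSIX bracket expression for digits
p "Room 42b".scan(/[0-9]+/)               # => ["42"]
p "Room 42b".scan(/[[:digit:]]+/)         # => ["42"]

# Inverted class: delete everything that is not a decimal digit
p "a1 b2".gsub(/[^0-9]/, "")              # => "12"

# Word characters (letters, digits and underscore)
p "snake_case! ok".scan(/[A-Za-z0-9_]+/)  # => ["snake_case", "ok"]

# Split on runs of whitespace
p "tab\there now".split(/[[:space:]]+/)   # => ["tab", "here", "now"]
```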

Repetition

So far we have seen how to match single characters. We can use repetition metacharacters to specify how many times a character or group needs to occur. Such metacharacters are called quantifiers.

  • * - Zero or more times
  • + - One or more times
  • ? - Zero or one times (optional)
  • {n} - Exactly n times
  • {n,} - n or more times
  • {,m} - m or fewer times
  • {n,m} - At least n and at most m times
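The quantifiers listed above can be sketched with a few one-liners. The sample strings are made up; String#[] with a regex returns the matched text, or nil when there is no match.

```ruby
p "aaaa"[/a{2,3}/]                          # => "aaa" (quantifiers are greedy)
p "goooal"[/o{2,}/]                         # => "ooo"
p "color colour".scan(/colou?r/)            # => ["color", "colour"]

# \A and \z anchor the pattern to the whole string
p "2011-10-05" =~ /\A\d{4}-\d{2}-\d{2}\z/   # => 0
p "10-5"       =~ /\A\d{4}-\d{2}-\d{2}\z/   # => nil

# * can match zero occurrences (an empty match), + cannot
p "chse"[/o*/]                              # => ""
p "chse"[/o+/]                              # => nil
```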

The following line replaces one or more occurrences of 'o' in choose by 'e'

   "choose".sub(/o+/, "e")
   => "chese"

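The following line replaces zero or one occurrence of 'o' in choose by 'e'. Because the pattern can match the empty string, the first match is the empty string at the very start, before the 'c'.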

   "choose".sub(/o?/, "e")
   => "echoose"

Modules

Modules provide a structure for grouping classes, methods and constants into a single, separately defined unit. This helps avoid clashes with existing classes, methods and constants, and also allows the functionality of a module to be mixed into classes. The definition of a module is very similar to that of a class, and the two are closely related: the Module class is the immediate ancestor (superclass) of the Class class. Just like a class, a module can contain constants, methods and classes.

In Ruby, a module is defined in the following way:

 module <module name>
   statement1
   statement2
   ...........
 end

Consider, for example, a module called MyModule, which defines a happy mood and a sad mood.

 module MyModule
   GOODMOOD = "happy"
   BADMOOD = "sad"
   def greet
     return "I'm #{GOODMOOD}. How are you?"
   end
   def MyModule.greet
     return "I'm #{BADMOOD}. How are you?"
   end
 end

The above represents a module MyModule with two constants, GOODMOOD and BADMOOD, and an “instance method” greet. In addition to instance methods, a module may also have module methods. Just as class methods are prefixed with the name of the class, module methods are prefixed with the name of the module, as shown above in MyModule.greet.

In spite of their similarities, there are two major features which classes possess but modules do not: instances and inheritance. Classes can have instances (objects), superclasses (parents) and subclasses (children), whereas modules can have none of these. Although modules cannot be instantiated or inherited from, they provide namespaces that prevent name clashes, and they implement the mixin facility.
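These two restrictions can be checked directly, as in the following sketch. Greeter is a hypothetical module used only for illustration.

```ruby
module Greeter            # hypothetical module
  def hello
    "hello"
  end
end

p Class.superclass             # => Module
p Greeter.class                # => Module
p Greeter.respond_to?(:new)    # => false, a module cannot be instantiated

# Trying to use a module as a superclass raises a TypeError.
inherit_error = nil
begin
  Class.new(Greeter)
rescue TypeError => e
  inherit_error = e.class
end
p inherit_error                # => TypeError
```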

Modules as Namespaces

Modules <ref>Modules </ref> can be considered a named ‘wrapper’ around a set of methods, constants and classes. The code inside the module shares the same ‘namespace’, so its parts are all visible to each other, while code outside the module can reach its constants and module methods only by qualifying them with the module name.
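As a separate illustration of how namespaces prevent name clashes, the sketch below defines the same constant and method names in two hypothetical modules, Metric and Imperial, without any conflict.

```ruby
module Metric             # hypothetical module
  UNIT = "km"
  def self.describe(n)
    "#{n} #{UNIT}"        # resolves to Metric::UNIT
  end
end

module Imperial           # hypothetical module
  UNIT = "mi"
  def self.describe(n)
    "#{n} #{UNIT}"        # resolves to Imperial::UNIT
  end
end

p Metric::UNIT            # => "km"
p Imperial::UNIT          # => "mi"
p Metric.describe(5)      # => "5 km"
p Imperial.describe(3)    # => "3 mi"
```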

Consider the MyModule example described above.

We can access the module constants just as we would access class constants using the :: scope resolution operator like this:

 puts(MyModule::GOODMOOD)   #->  happy
 puts(MyModule::BADMOOD)   #->  sad

We can access module methods using dot notation, that is, by specifying the module name followed by a period and the method name. For example:

 puts( MyModule.greet )     # ->  I'm sad. How are you?

Since modules define a closed space, we cannot access the instance method “greet” from outside the module.

 puts greet   # -> NameError: undefined local variable or method `greet' for main:Object

In the case of classes, we could have created instances of the class, all of which would have access to the instance methods of the class. However, modules cannot be instantiated. This is where mixins come into the picture.

Mixins

A mixin <ref>Mixins </ref> is a module that is mixed into a class; a class may mix in several modules. In other words, the implementations of the class and the module are combined. The real power of a mixin shows when the code in the mixin starts to interact with code in the class that uses it. In order to mix a module into a class we use the “include” method. Once a module is included, its instance methods can be used just as though they were normal instance methods defined in the class.

The process of including a module is also called ‘mixing in’, and included modules are therefore known as ‘mixins’. When we mix a module into a class, all the objects created from that class will be able to use the instance methods of the mixed-in module as if they were defined in the class.

 class MyClass
   include MyModule
   def sayHi
     puts( greet )
   end
 end

Not only can the methods of this class access the greet method from MyModule, but so too can any objects created from the class:

 ob = MyClass.new 
 ob.sayHi     # -> I'm happy. How are you?
 puts(ob.greet)   # -> I'm happy. How are you?
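The interaction between a mixin and the class that uses it can be sketched as follows. Describable and Course are hypothetical names; the module's describe method relies on a name method that the including class is assumed to provide.

```ruby
module Describable              # hypothetical mixin
  def describe
    # name is not defined here; the including class must supply it
    "#{name} (#{self.class})"
  end
end

class Course                    # hypothetical class using the mixin
  include Describable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

p Course.new("CSC 517").describe   # => "CSC 517 (Course)"
p Course.include?(Describable)     # => true
```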

Require / Load

As programs get bigger and bigger, the amount of reusable code also increases. It is best to break this code into separate files, so that these files can be shared across different programs. Typically these files are organized as class or module libraries. In order to incorporate the reusable code into new programs, Ruby provides two statements.

 load "<filename.rb>"
 require "<filename>"

The load method includes the named Ruby source file every time the method is executed, whereas require loads any given file only once. <ref>Require / Load / Include</ref>
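The difference can be observed with a small experiment, sketched below under the assumption that a temporary directory is writable. The file name counter_demo.rb and the global $demo_runs are invented for this illustration; the file simply counts how often it is executed.

```ruby
require "tmpdir"

first_require = second_require = nil
$demo_runs = 0

Dir.mktmpdir do |dir|
  # Hypothetical library file that counts how many times it is executed.
  path = File.join(dir, "counter_demo.rb")
  File.write(path, "$demo_runs += 1\n")

  load path                        # executes the file
  load path                        # executes it again
  first_require  = require(path)   # executes it once more and returns true
  second_require = require(path)   # already required: returns false, nothing runs
end

p $demo_runs        # => 3
p first_require     # => true
p second_require    # => false
```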

Consider the following Ruby module written in the file Week.rb.

 module Week
  FIRST_DAY = "Monday"
  def Week.weeks_in_month
     puts "You have four weeks in a month"
  end
  def Week.weeks_in_year
     puts "You have 52 weeks in a year"
  end
 end

This module defines a constant called FIRST_DAY and initializes it to "Monday". It also defines two methods which print the appropriate statement.

In order to include this module into a class in another file, we need to load the Week.rb file first and then include it in the class.

 require "Week"    # on Ruby 1.9.2 and later, use require "./Week" or require_relative "Week"

 class Decade
   include Week
   NO_OF_YRS = 10
   def no_of_months
     puts Week::FIRST_DAY
     number = NO_OF_YRS * 12
     puts number
   end
 end

 d1 = Decade.new
 puts Week::FIRST_DAY     # -> Monday
 Week.weeks_in_month      # -> You have four weeks in a month
 Week.weeks_in_year       # -> You have 52 weeks in a year
 d1.no_of_months          # -> Monday
                          #    120

The important points to remember are:

  • include makes a module's features available to a class, but does not load or execute any file.
  • require loads and executes the code in a file only once (much like a guarded #include in C).
  • load loads and executes the code every time it is encountered.

However, in order to mix a module into a class we always need to use include.

Comparable and Enumerable

Comparable

Comparable is a built-in mixin module that provides the comparison operators <, <=, ==, >= and >, as well as the method between?, to any class that defines the <=> method. This is done by mixing the module into the class and defining <=>, which specifies the criteria for comparing some value from the current object with some other value. The <=> method compares the receiver against another object and returns -1, 0, or +1 depending on whether the receiver is less than, equal to, or greater than the other object, respectively.

Assume that we have a square class.

 class Square
   attr_reader :side
   def initialize(side)
     @side = side
   end
 end

In order to compare the areas of two Squares, we first need to include the Comparable mixin.

 class Square
   include Comparable
   # ... attr_reader and initialize as defined above ...
   def area
     side * side
   end
   def <=>(other)
     area <=> other.area
   end
 end

Comparable uses the <=> method (the spaceship operator), defined here to compare the areas of two Squares, to implement the other comparison operators. We can therefore call the Comparable methods on Square objects.

 s1 = Square.new(3)
 s2 = Square.new(5)
 if s1 < s2
   puts "The area of Square 1 is smaller than Square 2"   # -> This is the output printed.
 elsif s1 > s2
   puts "The area of Square 1 is larger than Square 2"
 else
   puts "The area of Square 1 equals that of Square 2"
 end
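For completeness, here is a self-contained sketch that repeats the Square class in one piece and exercises a few more of the methods that become available once <=> is defined. The variable names small, mid and big are illustrative.

```ruby
class Square
  include Comparable
  attr_reader :side

  def initialize(side)
    @side = side
  end

  def area
    side * side
  end

  # Comparable builds <, <=, ==, >=, > and between? on top of this method.
  def <=>(other)
    area <=> other.area
  end
end

small, mid, big = Square.new(2), Square.new(3), Square.new(5)

p small < big                          # => true
p mid.between?(small, big)             # => true
p Square.new(3) == mid                 # => true (equal areas)
p [big, small, mid].min.side           # => 2
p [big, small, mid].sort.map(&:side)   # => [2, 3, 5]
```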

Enumerable

Enumerable is a standard mixin which can be included in any class. It is a built-in module for enumeration which provides collection classes with several traversal and searching methods, and with the ability to sort. The Enumerable module is already included in the Array class and provides arrays with a number of useful methods, such as include?, which returns true if a specific value is found in an array; min, which returns the smallest value; max, which returns the largest; and collect, which creates a new array made up of the values returned from a block.

 arr = [1,2,3,4,5]
 y = arr.collect{ |i| i } #=> y = [1, 2, 3, 4, 5]
 z = arr.collect{ |i| i * i } #=> z = [1, 4, 9, 16, 25]
 arr.include?( 3 ) #=> true
 arr.include?( 6 ) #=> false
 arr.min #=> 1
 arr.max #=> 5

Another important method which is provided by the Enumerable module is the inject method. The inject method can be used to repeatedly apply an operation to adjacent elements in a collection.

Consider the example of summing the elements of an array.

 [1, 2, 3, 4].inject  { |result, element| result + element }   # => 10

The inject method takes an optional argument and a block. The block will be executed once for each element contained in the object that inject was called on ([1,2,3,4] in our example). The argument passed to inject will be yielded as the first argument to the block the first time it is executed, and the second argument yielded to the block will be the first element of the object that we called inject on. If no initial value is passed in as an argument, then the first time the block executes the first argument will be set to the first element of the enumerable and the second argument will be set to the second element of the enumerable.

Since there is no default value passed in as an argument, the first time the block executes the first argument (result from the example) will be set to the first element of the array (1 from the example) and the second argument (element from the example) will be set to the second element of the enumerable (2 from the example).

The block will need to be executed 3 times, since the first execution will yield both the first and the second element. The first time the block executes it will add the result, 1, to the element, 2, and return a value of 3. The second time the block executes the result will be 3 and the element will also be 3 giving a return value of 6. The third and the final time the block executes, the result will be 6 and the element will be 4 , giving a return value of 10 which is the output.
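The walk-through above can be reproduced by recording the pair of arguments the block receives on each call. In this sketch the names trace and trace0 are illustrative; the second call passes an initial value of 0 for comparison.

```ruby
# Without an initial value: the block runs 3 times for 4 elements.
trace = []
sum = [1, 2, 3, 4].inject do |result, element|
  trace << [result, element]
  result + element
end
p trace   # => [[1, 2], [3, 3], [6, 4]]
p sum     # => 10

# With an initial value of 0: the block runs once per element.
trace0 = []
sum0 = [1, 2, 3, 4].inject(0) do |result, element|
  trace0 << [result, element]
  result + element
end
p trace0  # => [[0, 1], [1, 2], [3, 3], [6, 4]]
p sum0    # => 10

# A symbol naming the operation can be given instead of a block.
p [1, 2, 3, 4].inject(:+)   # => 10
```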

The inject method and the other methods mentioned above can be applied directly to built-in classes such as Array, Range and Hash, which include the Enumerable module.

It is also possible to apply Enumerable methods to classes that do not descend from existing classes which implement those methods. This is done by including the Enumerable module in the class and then writing an iterator method called “each”. Consider a class Collection, which does not have the Enumerable module among its ancestors.

 class Collection
   include Enumerable
   def initialize( someItems )
     @items = someItems
   end
   def each
     @items.each{ |i|
       yield( i ) }
   end
 end

The Enumerable module is included in the Collection class. We can hence initialize a Collection object with an array, which will be stored in the instance variable, @items. When we call one of the methods provided by the Enumerable module (such as min, max or collect), this will ‘behind the scenes’ call the each method in order to obtain each piece of data one at a time.

 things = Collection.new(['x','yz','defgh','ij','klmno'])
 puts( things.min )                     # prints defgh
 puts( things.max )                     # prints yz
 p( things.collect{ |i| i.upcase } )    #=> ["X", "YZ", "DEFGH", "IJ", "KLMNO"]
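Defining each unlocks many more Enumerable methods than min, max and collect. The sketch below uses a hypothetical Playlist class, written in the same way as Collection, to show a few of them; returning an enumerator when no block is given is a common convention rather than a requirement.

```ruby
class Playlist                  # hypothetical class, analogous to Collection
  include Enumerable

  def initialize(*titles)
    @titles = titles
  end

  # Enumerable only needs each; everything else is built on top of it.
  def each
    return to_enum(:each) unless block_given?
    @titles.each { |title| yield title }
  end
end

list = Playlist.new("yz", "x", "klmno")

p list.sort                                   # => ["klmno", "x", "yz"]
p list.select { |t| t.length > 1 }            # => ["yz", "klmno"]
p list.include?("x")                          # => true
p list.first                                  # => "yz"
p list.inject(0) { |sum, t| sum + t.length }  # => 8
```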

Multiple Inheritance

Multiple inheritance refers to a feature in some object-oriented programming languages in which a class can inherit behaviors from more than one superclass. This contrasts with single inheritance, in which a class can inherit behavior from at most one superclass. Though multiple inheritance offers improved modularity and easier code reuse in certain applications, it also has disadvantages which sometimes outweigh those advantages.

One of the ambiguities associated with multiple inheritance is the “diamond problem”. Different languages have approached this problem in different ways.

Java makes use of the concept of interfaces <ref>Interfaces </ref>. Java allows multiple inheritance of interfaces, but implementation, which consists of methods and variables, may only be singly inherited. This way, Java ensures that there is no confusion over which method to call or which instance variable to use.

Eiffel makes use of the renaming clause and ensures that the ancestor methods used in the descendants are explicitly specified. This allows the methods of the base class to be shared between all its descendants.

Ruby does not support multiple inheritance. Mixins provide an alternative that eliminates the need for it. Ruby modules are similar to abstract classes <ref>Abstract Classes</ref> in Java: they can contain implemented methods and, like abstract classes, they cannot be instantiated. Unlike classes, modules cannot inherit from other modules or classes. A Ruby class can inherit from only one superclass but can include any number of modules in order to reuse the implementations they provide. This gives functionality similar to multiple inheritance while avoiding its ambiguities.

The Ruby code below defines two modules, Horse and Bird, each of which defines a talk method. The Pegasus class includes both modules and defines its own talk method. The example shows how Ruby resolves method calls at run time.


 module Horse
     def talk
         puts "I can Neigh !!!!! "
     end
 end
 module Bird
     def talk
         puts "I can Chirp !!!!!"
     end
 end
 class Pegasus
     include Horse
     include Bird
        def talk
            puts "I am a whinnying horse who can fly and chirp ! "
        end
 end
 Aviary = Pegasus.new 
 Aviary.talk         

When the method talk is called on the Aviary object, Ruby searches the class first and then the included modules, most recently included first. In this example Pegasus defines its own talk method, so “I am a whinnying horse who can fly and chirp !” is printed. If the class did not define talk, the method from Bird would be called, since Bird was included after Horse and therefore comes next in the lookup order. In this way Ruby resolves naming conflicts deterministically: the definition found first in the lookup order is the one that is executed.
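The lookup order can be inspected with the ancestors method. The following sketch redefines Horse and Bird so that talk returns a string instead of printing, leaves talk out of Pegasus, and adds a hypothetical subclass TalkingPegasus that uses super to reach the next definition in the chain.

```ruby
module Horse
  def talk
    "Neigh"
  end
end

module Bird
  def talk
    "Chirp"
  end
end

class Pegasus           # no talk method of its own in this sketch
  include Horse
  include Bird          # included last, so it is searched before Horse
end

class TalkingPegasus < Pegasus   # hypothetical subclass
  def talk
    "Pegasus, then #{super}"     # super continues along the lookup order
  end
end

p Pegasus.ancestors.take(3)      # => [Pegasus, Bird, Horse]
p Pegasus.new.talk               # => "Chirp"
p TalkingPegasus.new.talk        # => "Pegasus, then Chirp"
```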

Advantages

  • You can separate object characteristics into nonoverlapping sets.
  • Lets you create complex classes using only the characteristics that you need, without a proliferation of base classes.

Disadvantages

  • Complicates method dispatch and imposes additional requirements on an application.
  • It is essential to be aware of dependencies on subclass-superclass ordering, particularly in method selection and slot initialization.

Extending Objects

Ruby can act as a set-based language: using include augments a class definition and hence adds functionality to all objects of that class. But Ruby can also act as a prototype-based language. The dynamic nature of Ruby can be truly harnessed in a dynamic execution environment. One of its most powerful features is the ability to extend (provide additional features to) any object at runtime. This is particularly helpful if we do not want to modify the object’s class (be it Object, for example) but only the object itself.

The following example explains this concept. We define a class Student, which represents all students.

   class Student
       attr_reader :name, :age
       def initialize(name, age)
           @name, @age = name, age
       end
       def gets_paycheck?
           false
       end
   end

We create two objects of this class, mihir and deepak. The basic idea is that students are not paid, and thus the method gets_paycheck? returns false for both objects.

   mihir = Student.new('Mihir Surani', 22)
   deepak = Student.new('Deepak Anand', 25)
   mihir.gets_paycheck?    #=> false
   deepak.gets_paycheck?   #=> false

Now suppose Deepak becomes a TA, so he should receive a paycheck. We do not want to modify the class Student, since not all students receive paychecks. Instead, we create a new module TA for students who are TAs, whose gets_paycheck? method returns true, and we extend only the deepak object with it.

   module TA
       attr_accessor :course   
       def gets_paycheck?
           true
       end
   end
   deepak.extend(TA)
   deepak.course = "OOLS"
   mihir.gets_paycheck?    #=> false
   deepak.gets_paycheck?   #=> true
   mihir.inspect     #=> "#<Student:0x2401e90 @name=\"Mihir Surani\", @age=22>"
   deepak.inspect    #=> "#<Student:0x2c8b390 @name=\"Deepak Anand\", @age=25, @course=\"OOLS\">"
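What extend does can be checked by looking at the object's singleton class, as in this sketch. It uses cut-down versions of Student and TA so that it runs on its own.

```ruby
class Student                 # cut-down version of the class above
  def gets_paycheck?
    false
  end
end

module TA                     # cut-down version of the module above
  def gets_paycheck?
    true
  end
end

deepak = Student.new
mihir  = Student.new
deepak.extend(TA)

# extend mixes the module into this one object's singleton class only.
p deepak.singleton_class.include?(TA)     # => true
p Student.include?(TA)                    # => false
p deepak.is_a?(TA)                        # => true
p mihir.is_a?(TA)                         # => false
p deepak.singleton_class.ancestors[1, 2]  # => [TA, Student]
```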


Conclusion

This wiki chapter covers the topics introduced in Lecture 6. We began by discussing regular expressions and how Ruby uses them. We then turned to one of the main topics in Ruby: modules and mixins. We discussed the implementation details of modules and showed how modules can be mixed into a class and what advantages this provides.

We then introduced the concept of multiple inheritance, along with its advantages and disadvantages, and discussed how mixins avoid the diamond problem that multiple inheritance can cause.

Finally, we concluded with the topic of extending objects: how and why it is done.

References

<references/>

See Also

1. http://ruby.about.com/od/beginningruby/a/mixin.html
2. https://www.re-motion.org/blogs/team/2008/02/20/introducing-mixins-finally/
3. http://pg-server.csc.ncsu.edu/mediawiki/index.php?title=CSC/ECE_517_Fall_2010/ch3_3b_sv&printable=yes
4. http://csis.pace.edu/~bergin/patterns/multipleinheritance.html