CSC/ECE 517 Summer 2008/wiki1 2 itr

From Expertiza_Wiki

Introduction

One of the most beloved features of Ruby is the block-based iterator. A Ruby iterator is simply a method that loops over the contents of an object without exposing its underlying representation. The verb "iterate" means "do the same thing many times", so an "iterator" is "one which does the same thing many times". It can also be thought of as an object that behaves like a generic pointer: the iterator usually refers to one particular element in the collection and then advances itself so that it points to the next element. Generators are a similar feature in Python. The name comes from the fact that a generator is the entity that generates iterators. A generator lets you write a function that can return a result and pause, resuming in the same place the next time it is called. In Ruby, generators (also called external iterators) are available through a library class called Generator. In Python, generators are a feature of the language itself.

Problem Definition

Ruby, like Java, has iterators to facilitate doing operations on each member of a set. Python has generators as well. Describe how generators differ from iterators, and find examples of code sequences using generators that would be awkward with iterators.

Iterator

The word "iterator" means different things in different contexts and programming languages, but it always has something to do with visiting each one of a set of objects, where "object" doesn't necessarily mean "class instance": just take "object" to mean "some instance of some data type, somewhere". Iterators may provide additional features or behave differently depending on the language.

Implementing Iterator

Most OOP languages provide ways to make iteration easy; for example, some languages provide a class that controls iteration. Ruby, however, allows control structures to be defined directly. In Ruby terms, such user-defined control structures are called iterators.

Iterator in Ruby

An iterator in Ruby is simply a method that invokes a block of code. The power lies in the code block between the do and end keywords, or between { and }: we can put as much, or as little, code in there as needed. Each item being iterated over is passed into the block as a parameter named between the pipes. Examples of different iterators are given below.

Using Each [1]
a = [ 1, 2, 3 ]
a.each { |x| print x }
==>123
# Internally, Array#each could be defined along these lines:
# def each
#   for i in 0...size
#     yield(self[i])
#   end
# end
  

Here x is the block-local variable that receives each value of a in turn. each is probably the simplest iterator: it yields the successive elements of its collection. The two-line example above does the work of the following four-line, C-style while loop (shown here in Ruby syntax, and printing each value inside angle brackets).

s=[1,2,3];i=0
while i<s.length
   printf "<%i>", s[i]; i+=1
end; print "\n"
==><1><2><3>

Using Find [2]
a = [ 10,20,30 ]
a.find { |n| n % 5 == 0 }
==>10

The find iterator in Ruby passes each element to the block, which typically tests it with some comparison operator (<, >, ==, etc.). It returns the first element for which the block evaluates to true, or nil if no element matches.

Using Collect [3]
a = [ 1, 2, 3, 4, 5 ]
b = a.collect { |n| n + 1 }
==>[2, 3, 4, 5, 6]

Another common iterator is collect, which returns a new array holding the result of running the block on each element of the collection. Iterators can therefore return derived values and are not limited to accessing the data already stored in arrays and hashes.

Iterator in Python

Python also supports iteration over containers. It is used implicitly in the for statement, in list comprehensions, and in generator expressions. An iteration protocol is defined so that iteration is possible over different kinds of objects. Most container objects can be looped over using the for statement. An example of a typical implicit iteration over a sequence is given below [4].

for element in [1, 2, 3]:
   print element
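
As a rough sketch of the implicit iteration described above, the Ruby each, find and collect examples from the previous section can be approximated in Python with a for statement, a generator expression and a list comprehension. The sketch uses Python 3 syntax (print is a function there), and the variable names are illustrative only.

```python
a = [10, 20, 30]

# for statement: visits every element, much like Ruby's each
seen = []
for element in a:
    seen.append(element)

# Generator expression passed to next(): first match, much like Ruby's find.
# The second argument is the value returned when nothing matches.
first_multiple_of_5 = next((n for n in a if n % 5 == 0), None)

# List comprehension: builds a new list, much like Ruby's collect
incremented = [n + 1 for n in a]

print(seen)
print(first_multiple_of_5)
print(incremented)
```

In all three forms the container is never indexed directly; Python asks it for an iterator behind the scenes.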

Internally, the for statement calls iter() on the container object. That function returns an iterator object that defines the method next(), which returns the next item each time it is called. When there are no more elements, next() raises a StopIteration exception, which tells the for loop to terminate. (In Python 3 this method is named __next__() and is usually called through the built-in next().) Iterators can also be used explicitly. An example using an explicit iterator is given below [5].

it = iter(sequence)
while True:
   try:
       value = it.next()
   except StopIteration:
       break
   print value

Python defines several iterator objects to support iteration over general and specific sequence types, dictionaries, and other more specialized forms. An iterator object just has to implement __iter__() and next(). Ruby iterators still have one advantage: they support code blocks as objects. In Python, iteration is normally driven by the for statement (or a comprehension) rather than by handing a block to a method.
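
A minimal sketch of a hand-written iterator may make the protocol concrete. It is written for Python 3, where the method is spelled __next__() and is normally invoked through the built-in next(). The Countdown class is a hypothetical example, not part of any library.

```python
class Countdown:
    """Iterator that produces start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self.current <= 0:
            # Tells the for statement (or list()) that there is nothing left
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


result = list(Countdown(3))
print(result)
```

Note how much bookkeeping (the current attribute, the explicit StopIteration) the class has to carry by hand.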

Uses

  1. The main use of an iterator is that it hides the internal details of a collection from the user who manipulates its objects.
  2. Iterator-based code is cleaner and more elegant, which makes it easier to maintain than code that accesses elements by index.
  3. Iterators offer a consistent way of traversing all kinds of data structures; as a result, code becomes more readable and reusable.
  4. With iterators, inserting a new element into the container is easy even after the iterator has advanced beyond the first element. With indexing it is more difficult, because the index numbers have to be adjusted.
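
Points 1 and 3 can be illustrated with a small Python 3 sketch: one function walks a list, a tuple, a string and a dictionary without knowing how any of them is stored. The function name collect_items is hypothetical.

```python
def collect_items(container):
    """Return the items of any iterable, in iteration order."""
    items = []
    for item in container:  # the same loop, whatever the container type
        items.append(item)
    return items


print(collect_items([1, 2, 3]))          # list
print(collect_items((1, 2, 3)))          # tuple
print(collect_items("abc"))              # string: iterates over characters
print(collect_items({"x": 1, "y": 2}))   # dict: iterates over keys
```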

Generators

A generator looks like a function but behaves like an iterator. Both Ruby and Python support generators. As mentioned in the introduction, Ruby provides them through the Generator library, while in Python generators are a fundamental part of the language. Being a language feature makes the Python version more robust and syntactically more concise and cleaner. On the other hand, supplying generators as a library class keeps the core language smaller, which can make it easier to learn and more elegant.

Generators in Python

Generators are a simple and powerful tool for creating iterators. They are written like regular functions and are called only once; that call returns the actual iterator, an object with __iter__() and next() methods. The author of a generator does not have to implement the iterator protocol (__iter__(), next()) by hand; it just works. The yield statement is used wherever the generator wants to return data. Each time next() is called, the generator resumes where it left off (it remembers all the data values and which statement was last executed). The generator function does NOT run to completion when it is first called. Instead, it runs only until it has a value available to return, at which point it yields that value and suspends operation until it is asked to resume. This is illustrated with an example below [6].

>>> def generator1():
...     yield "first"
...     yield "second"
...     yield "third"
...
>>> gen = generator1()   # No output is produced; the body has not run yet.
>>> gen.next()           # The function starts executing here.
'first'
>>> gen.next()
'second'
>>> gen.next()
'third'
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration

First, a generator called generator1 is defined which will yield three values, the strings "first", "second" and "third". Creating a new generator object (gen) does not yet execute any of the function body. Each time you call the generator's next method, it runs the function until the next yield. When the generator reaches the end of the function body, or reaches a return statement, it raises a StopIteration exception. A generator object is a one-time operation: the generated data can be iterated over only once, but one can call the generator function again to obtain a fresh generator if needed. In this way lazy evaluation can be achieved, which improves performance by avoiding the calculation of values that are never used.
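
To make the lazy-evaluation point concrete, here is a small sketch in Python 3 (where next(gen) replaces gen.next()). The squares generator and the calls list are hypothetical names, used only to record which values were actually computed.

```python
calls = []  # records every value the generator actually works on


def squares(limit):
    """Yield n * n for n in 0 .. limit - 1, one value per request."""
    for n in range(limit):
        calls.append(n)
        yield n * n


gen = squares(1000000)   # nothing is computed yet; calls is still empty

first = next(gen)        # runs the body up to the first yield
second = next(gen)       # resumes just after the first yield

print(first, second)
print(calls)             # only the two requested values were computed
```

Although the generator could produce a million squares, only the two that were requested are ever calculated.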

Generators in Ruby

Ruby's internal iterators fall short when the iteration has to be handed to another method that needs to access the values one at a time, as that iterator produces them. Internal iterators cannot iterate over more than one collection at a time, they cannot be paused or stopped in the middle, and they can only implement a single traversal strategy. This is where the Generator library comes into play, since it implements external iterators. An example that uses the library to iterate over a block is given below.

require 'generator'
gen = Generator.new do |result|
  result.yield " Start"
  3.times { |i| result.yield i }
  result.yield "done"
end
while gen.next?
  print gen.next, "--"
end
==> Start--0--1--2--done--
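
Walking more than one collection at a time is the kind of task where an external iterator or generator helps. The sketch below is in Python 3 rather than Ruby, and merge_sorted is a hypothetical helper: it advances two iterators independently, which would be awkward to express with a single each-style block.

```python
def merge_sorted(left, right):
    """Yield the items of two sorted iterables in sorted order."""
    done = object()  # unique marker meaning "this iterator is exhausted"
    it_left, it_right = iter(left), iter(right)
    a, b = next(it_left, done), next(it_right, done)
    while a is not done and b is not done:
        if a <= b:
            yield a
            a = next(it_left, done)    # advance only the left iterator
        else:
            yield b
            b = next(it_right, done)   # advance only the right iterator
    # One side is exhausted; drain whatever remains on the other side
    while a is not done:
        yield a
        a = next(it_left, done)
    while b is not done:
        yield b
        b = next(it_right, done)


merged = list(merge_sorted([1, 4, 9], [2, 3, 10, 11]))
print(merged)
```

Each call to next() pulls exactly one value from one source, so the caller decides which collection moves forward and when.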

Comparisons

An example of using a generator in Python to give the successive elements from 10 to 20:

def countfrom(n):
   while True:
       yield n
       n += 1
for i in countfrom(10):
   if i <= 20:
       print i
   else:
       break

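Note that this iteration terminates normally even though countfrom itself never stops yielding: the break in the for loop ends it once i exceeds 20. [7]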
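
The same bounded walk over an unbounded generator can also be sketched with the standard itertools module, which avoids the explicit break. This uses Python 3 syntax, and countfrom is redefined here only so that the block is self-contained.

```python
from itertools import islice, takewhile


def countfrom(n):
    """Yield n, n + 1, n + 2, ... without end."""
    while True:
        yield n
        n += 1


# Stop as soon as a value exceeds 20
up_to_20 = list(takewhile(lambda i: i <= 20, countfrom(10)))

# Alternatively, take exactly the first 11 values
first_11 = list(islice(countfrom(10), 11))

print(up_to_20)
print(first_11)
```

Both forms rely on the generator being lazy: only as many values are produced as the consumer asks for.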

An example of using iterators in Ruby to give the successive elements from 10 to 20:

a = [ 9,10,11,12,13,14,15,16,17,18,19 ]
b = a.collect { |n| n + 1 }
==>[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

This is not very elegant: the whole source array has to be written out by hand before collect can transform it.

An example of using a generator in Ruby to give the successive elements from 10 to 20:

require 'generator'
gen = Generator.new(10..20)
while gen.next?
  print gen.next, ", "
end
==>10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,

With Ruby generators, the code flows much like the Python version.

Comparisons of Ruby generators and Python generators

Python was intended to be a highly readable language. Ruby's internal iterators aren't always the best solution. Adding generators, or external iterators, makes such code easier to write and easier to read. Ruby code has tended to run slower than many compiled languages and than other major scripting languages such as Python. Omitting the parentheses around method arguments may also lead to unexpected results when a method takes multiple parameters.

Where Python and Ruby differ

Ruby has a data type called Range; an object of this type represents the set of everything between the start and the end of the range (including the end when written with .., excluding it when written with ...). Since Range includes the Enumerable module, one would intuitively expect that iterating over any valid Range object yields every object in that set, from start to end. This expectation turns out to be incorrect when the start is greater than the end: [8]

irb(main):001:0> (1..3).each { |i| puts i }
1
2
3
=> 1..3
irb(main):002:0> (3..1).each { |i| puts i }
=> 3..1
irb(main):003:0>

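In our previous example, going from 20 down to 10 in Ruby may therefore be difficult with a Range, because (20..10).each yields nothing. One workaround is Integer#downto, as in 20.downto(10).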
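
For comparison, here is a sketch of the descending case in Python 3: range accepts a negative step, and a generator can count down just as easily as it counts up. The countdown function is a hypothetical name.

```python
def countdown(start, stop):
    """Yield start, start - 1, ..., stop (inclusive)."""
    n = start
    while n >= stop:
        yield n
        n -= 1


from_generator = list(countdown(20, 10))
from_range = list(range(20, 9, -1))   # the stop value 9 is excluded

print(from_generator)
print(from_range)
```

Like the Ruby Range, a Python range(20, 10) with the default step of 1 is also empty; the direction has to be stated explicitly.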