CSC/ECE 517 Spring 2014/ch1 1w1d mm

From Expertiza_Wiki
Revision as of 23:28, 9 February 2014 by Mmehta2 (talk | contribs)

Serialization

Serialization[1] is the process of converting a data structure or an object into a stream of bytes or a string so that it can be stored in memory or in a file (persistent storage), or transmitted over a network. Serialization is also referred to as marshalling[2]. The stream must be in a format that both ends of a communication channel understand, so that the object can be marshalled on one side and reconstructed on the other.

Basic Advantages of Serialization:

1. Communication between two or more processes on the same machine. Object state can be saved and shared in a persistent or in-memory store.

2. Communication between processes on different machines. Serialization facilitates the transmission of an object through a network.

3. Creating a clone of an object.

4. Cross-platform compatibility. An object can be serialized in a common format that is understood by multiple platforms, e.g. JSON or XML.
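The cloning advantage in point 3 can be illustrated with a Marshal round trip. This is a minimal sketch, not a general-purpose cloning routine: the variable names are illustrative, and it assumes the object contains only data that Marshal can dump (no procs, IO handles and similar).

```ruby
# Deep copy via a Marshal round trip (illustrative data and names).
original = { "name" => "Tom", "tags" => ["a", "b"] }

shallow = original.dup                          # copies only the outer Hash
deep    = Marshal.load(Marshal.dump(original))  # rebuilds the whole object graph

original["tags"] << "c"

puts shallow["tags"].inspect  # shares the inner Array with original
puts deep["tags"].inspect     # independent copy, unaffected by the change
```

The shallow copy still points at the same inner array, while the serialized-and-restored copy does not, which is why serialization is sometimes used to clone nested structures.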


De-serialization is the process of converting the stream of bytes or the string back into objects in memory, that is, of reconstructing the object at a later time. De-serialization is also referred to as unmarshalling.


Practical Applications for Serialization

1. HTTP session replication: session objects are shared across web servers to handle failover scenarios.

2. Remote method invocation and remote procedure calls, where serialization carries arguments and results between processes.

3. Rails cookie handling: cookie data is marshalled when it is sent to the client machine and unmarshalled when it comes back.


Serialization in Ruby:

Let us consider a situation where two Ruby programs have to communicate with each other. One of the simplest ways to do this is to convert the Ruby objects in the first program into strings and write these strings into a file. This is serialization. The second program can then read the file and convert the strings back into Ruby objects. This is de-serialization.
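The file-based exchange described above can be sketched in a single script that plays both roles. The data, the file name and the use of a temporary directory are assumptions made for illustration; YAML is used here only because it produces a readable string, and it is covered in detail later on this page.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical data that the first program wants to hand over.
person = { "name" => "Tom", "age" => 25, "country" => "USA" }

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, "person.yml")   # illustrative file name

  # "Program one": serialize the object to a string and write it to the file.
  File.write(path, YAML.dump(person))

  # "Program two": read the string back and rebuild the object.
  YAML.safe_load(File.read(path))
end

puts restored == person   # true when the round trip preserved the data
```

In a real setup the two halves would live in separate programs that agree on the file location and format.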

Types of Serialization

Serialization in Ruby can be done in two ways. The object in memory can be converted into a human-readable format such as YAML (YAML Ain’t Markup Language) or JSON (JavaScript Object Notation), or it can be converted into a binary format.

Converting Ruby Objects in Human Readable Formats

The conversion of Ruby objects into the YAML and JSON formats is explained below.

Converting Ruby Objects to YAML format

YAML[3] is a human-friendly data serialization standard for all programming languages. YAML (YAML Ain't Markup Language) is perhaps the most common form of serialization in Ruby applications. It is used for configuration files in Rails and other projects, and is nearly ubiquitous. YAML is a plain-text format, as opposed to Marshal's[4] binary format, which immediately makes things easier: objects stored as YAML are completely transparent and can be edited with nothing more than a text editor. It also has a simple, spartan syntax that is easy to read and easy to type, and it is not encumbered by the excessive wordage and symbols seen in XML. Ruby objects can easily be serialized into YAML format. Consider the code below:

   require "yaml"

   class First
     def initialize(name, age, country)
       @name = name
       @age = age
       @country = country
     end

     def to_s
       "In First:\n#{@name}, #{@age}, #{@country}\n"
     end
   end

   class Second
     def initialize(address, details)
       @address = address
       @details = details
     end

     def to_s
       "In Second:\n#{@details.to_s}#{@address}\n"
     end
   end

   x = First.new("Tom", 25, "USA")
   y = Second.new("St. Marks Street", x)
   puts y

We get the string representation of the object tree (object hierarchy) as the output, because puts calls the to_s[5] method of the object.

Output:

In Second:
In First:
Tom, 25, USA
St. Marks Street

We use the code below to serialize our object tree.

serialized_object = YAML::dump(y)
puts serialized_object

The dump function serializes the object tree and stores the data in the YAML format in the variable serialized_object.

Data in the serialized (YAML) format looks like this:

--- !ruby/object:Second
address: St. Marks Street
details: !ruby/object:First
  name: Tom
  age: 25
  country: USA

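Now, to de-serialize the data, we use the load function. (On Ruby 3.1 and later, load is restricted to basic types by default, so YAML.unsafe_load or a permitted_classes list is needed to rebuild custom classes such as First and Second.)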

puts YAML::load(serialized_object)

The data is converted back into a Ruby object tree.

Output:

In Second:
In First:
Tom, 25, USA
St. Marks Street

Thus we get back our original Object tree.
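Because loading YAML can instantiate arbitrary classes, it is often preferable to state which classes may be rebuilt. The following is a hedged sketch using YAML.safe_load with an allow-list; the Point class is an illustrative stand-in, and the keyword form of safe_load assumes a reasonably recent Ruby (2.6 or later).

```ruby
require "yaml"

# Illustrative class, not part of the example above.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

yaml_text = YAML.dump(Point.new(1, 2))

# safe_load only rebuilds classes that are explicitly permitted.
restored = YAML.safe_load(yaml_text, permitted_classes: [Point])
puts restored.x + restored.y

# Without the allow-list, the custom class is rejected.
rejected = false
begin
  YAML.safe_load(yaml_text)
rescue Psych::DisallowedClass => e
  rejected = true
  puts "rejected: #{e.class}"
end
```

The allow-list keeps the convenience of object round trips while limiting what untrusted input can create.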

Converting Ruby Objects to JSON format:

JSON[6] is a lightweight data-interchange format. JSON is typically generated by web applications and can be quite daunting, with deep hierarchies that are difficult to navigate. Ruby objects can be serialized into JSON format. In Ruby 1.9.2 the json library is bundled with the Ruby distribution, so no installation is needed; on Ruby 1.8.7 you need to install a gem.[7] The JSON library can be installed using RubyGems[8] as shown below:

# gem install json

We can create a JSON string for serialization by using the JSON.generate method as below:

       require 'json'
       my_hash = {:Welcome => "Ruby"}
       puts JSON.generate(my_hash)

Output:

{"Welcome":"Ruby"}

We can parse a JSON string received from another program by using JSON.parse, which converts the string into a Ruby Hash.

       require 'json'
       my_hash = JSON.parse('{"Welcome": "Ruby"}')
       puts my_hash["Welcome"]   # prints: Ruby
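JSON itself only describes objects (hashes), arrays, strings, numbers, booleans and null, so a custom Ruby object is usually converted to a hash before it is generated and rebuilt from a hash after it is parsed. The sketch below shows one way to do that; the Person class and the to_h / from_h helper names are our own illustrative choices, not part of the json library.

```ruby
require "json"

# Illustrative class; to_h and from_h are hypothetical helper names.
class Person
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end

  # Reduce the object to plain data that JSON can represent.
  def to_h
    { "name" => @name, "age" => @age }
  end

  # Rebuild an object from the parsed Hash (keys come back as strings).
  def self.from_h(hash)
    new(hash["name"], hash["age"])
  end
end

json_text = JSON.generate(Person.new("Tom", 25).to_h)
puts json_text   # {"name":"Tom","age":25}

copy = Person.from_h(JSON.parse(json_text))
puts copy.name
puts copy.age
```

Note that JSON.parse returns string keys by default, so a hash written with symbol keys does not come back identical unless the keys are converted.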

Converting Ruby Objects to Binary Formats

Binary serialization is another form of serialization in Ruby; its output is not human readable. It is used in much the same way as YAML serialization (objects are dumped and later loaded), but it is done using Marshal[9]. Binary serialization is used when a high-performance serialization and de-serialization process is required and the contents do not need to be readable.
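To make the "not human readable" point concrete, here is a small sketch that inspects what Marshal.dump returns. The sample array is illustrative; the check relies on the Marshal::MAJOR_VERSION and Marshal::MINOR_VERSION constants that Ruby exposes.

```ruby
data = Marshal.dump([1, "two", :three])

# The result is a binary string rather than UTF-8 text.
puts data.encoding == Encoding::BINARY

# The first two bytes hold the Marshal format version.
puts data.bytes.first(2) == [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION]

# Loading the bytes gives back an equal object.
puts Marshal.load(data).inspect
```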

Since binary serialized data is not in human-readable form, two essential guidelines need to be followed when it is written to a file:

    1. Use print[10] instead of puts[11] when writing serialized objects to a file, so that no extra newline characters are
      written into the file.
    
    2. Use a record separator to mark where one object ends and the next begins.
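A variation on these guidelines is to let Marshal write to and read from the file object directly, which avoids both the stray newline and the need to pick a separator by hand: Marshal.dump accepts an IO as its second argument, and Marshal.load reads exactly one object per call from an IO. The sketch below uses this approach rather than an explicit separator; the records and the file name are illustrative.

```ruby
require "tmpdir"

# Illustrative records to store one after another in a single file.
records = [
  { "name" => "Kitty Kat", "age" => 5 },
  { "name" => "Doggy Dig", "age" => 4 }
]

restored = Dir.mktmpdir do |dir|
  path = File.join(dir, "animals.bin")   # illustrative file name

  # "wb" opens the file in binary mode; nothing but Marshal data is written.
  File.open(path, "wb") do |io|
    records.each { |record| Marshal.dump(record, io) }
  end

  # Read objects back one at a time until the file is exhausted.
  File.open(path, "rb") do |io|
    loaded = []
    loaded << Marshal.load(io) until io.eof?
    loaded
  end
end

puts restored == records
```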

Binary Serialization Example

   class Animal
     def initialize(name, age)
       @name = name
       @age = age
     end
   end

   class Cat < Animal
     def to_s
       "In Cat C: #{@name} \t #{@age}"
     end
   end

   class Dog < Animal
     # Return the string (rather than printing it) so puts and interpolation work
     def to_s
       "In Dog D: #{@name} \t #{@age}"
     end
   end

   d = Dog.new("Doggy Dig", 4)
   c = Cat.new("Kitty Kat", 5)
   puts "Before Serialization"
   puts c
   puts d
   serialize_cat = Marshal.dump(c)   # serializes the cat object into a binary string
   serialize_dog = Marshal.dump(d)   # serializes the dog object into a binary string
   deserialize_cat = Marshal.load(serialize_cat)   # rebuilds the cat object from the binary string
   deserialize_dog = Marshal.load(serialize_dog)   # rebuilds the dog object from the binary string
   puts "After Serialization #{deserialize_cat}"
   puts "After Dog Serialization #{deserialize_dog}"

Output

  Before Serialization
  In Cat C: Kitty Kat 	 5
  In Dog D: Doggy Dig 	 4
  After Serialization In Cat C: Kitty Kat 	 5
  After Dog Serialization In Dog D: Doggy Dig 	 4

Serialization in Object-Oriented Languages: Comparison

Sl.No | Ruby | Java | .Net Framework | C++
1 | Ruby provides a built-in module called Marshal for serialization. | Java provides an interface named Serializable for classes to implement. | .Net provides a Serializable attribute. | There is no built-in support for serialization in C++, but it can be achieved by using the Boost libraries.
2 | Ruby's built-in module (Marshal) does not support platform independence; however, it can be achieved by using libraries such as YAML and JSON. | Similarly, Java's built-in serialization is not platform independent; to use Java serialization from Ruby, the JRuby library should be used. | .Net uses Remoting technology to make it platform independent. | Serialization using the Boost libraries is not platform independent.
3 | YAML provides a method (to_yaml_properties) that can be used to select the variables whose values need to be serialized. With Marshal, we write a method named marshal_dump that defines which variables of an object are serialized. | Provides an option for serializing only the required attributes of an object: the transient keyword marks data that does not need to be serialized. | The XML serializer skips the default serialization of a field or property when its XmlIgnore property is set to true. | Serialization using the Boost libraries is custom, so the user can specify which parts of an object are serialized.
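The marshal_dump hook mentioned for Ruby in the comparison can be sketched as follows. The Report class and its @cache field are hypothetical; the sketch assumes that @cache is derived data that should be left out of the serialized form, and it pairs marshal_dump with marshal_load, which Marshal calls when the object is restored.

```ruby
# Illustrative class: @cache is treated as derived data that is not serialized.
class Report
  attr_reader :title, :cache

  def initialize(title)
    @title = title
    @cache = "expensive result for #{title}"
  end

  # Called by Marshal.dump: return only the state worth saving.
  def marshal_dump
    { "title" => @title }
  end

  # Called by Marshal.load with whatever marshal_dump returned.
  def marshal_load(state)
    @title = state["title"]
    @cache = nil
  end
end

copy = Marshal.load(Marshal.dump(Report.new("Q1")))
puts copy.title          # Q1
puts copy.cache.inspect  # nil
```

Only the title survives the round trip, which is the Ruby counterpart of marking a field transient in Java.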

See Also

1. Serialization in Ruby JSON

2. Serialization in Rails

3. Article on Rails Serialization

References

1. Serialization in General

2. YAML

3. JSON

4. Serializing and De-serializing in Ruby

5. Serialization and De-serialization

6. Marshal

7. Object Serialization Techniques

8. XML Serialization

9. Java Serialization

10. Serialization in .Net

11. JSON

12. Serialization in Ruby YAML

13. Installing JSON gem