CSC/ECE 517 Spring 2014/ch1 1w1d mm
Serialization
Serialization[1] is the process of converting a data structure or an object into a stream of bytes or a string so that it can be stored in memory or in a file (persistent storage), or transmitted over a network. Serialization is also referred to as Marshalling[2]. The stream must be in a format that both ends of a communication channel understand, so that the object can be marshalled on one side and reconstructed on the other.
Basic Advantages of Serialization:
1. Communication between two or more processes on the same machine. Object state can be saved and shared in a persistent or in-memory store.
2. Communication between processes on different machines. Serialization facilitates the transmission of an object over a network.
3. Creating a clone of an object.
4. Cross-platform compatibility. An object can be serialized in a common format that is understood by multiple platforms, e.g. JSON or XML.
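Point 3 above can be illustrated with a round trip through Marshal. This is a minimal sketch, not the only way to clone an object; `deep_copy` is a hypothetical helper name introduced here, and the approach only works for objects that Marshal can serialize.

```ruby
# Deep copy via a Marshal round trip.
# `deep_copy` is a hypothetical helper name used only for this sketch.
def deep_copy(object)
  Marshal.load(Marshal.dump(object))
end

original = { "name" => "Tom", "tags" => ["a", "b"] }
shallow  = original.dup          # copies only the top-level hash
deep     = deep_copy(original)   # copies the nested array as well

original["tags"] << "c"

puts shallow["tags"].inspect  # ["a", "b", "c"]  (still shares the nested array)
puts deep["tags"].inspect     # ["a", "b"]       (unaffected by the change)
```

The shallow copy made by `dup` still points at the same nested array, while the serialized-and-restored copy is fully independent.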
De-serialization is the process of converting the stream of bytes or the string back into objects in memory, that is, reconstructing the object at a later time. De-serialization is also referred to as Unmarshalling.
A Few Practical Applications of Serialization
1. HTTP session replication. Session objects are shared across web servers to handle failover scenarios.
2. Remote Method Invocation and Remote Procedure Calls. Serialization facilitates the communication between caller and callee.
3. Rails cookie handling. Cookie data is marshalled before it is sent to the client machine and unmarshalled when it comes back.
Serialization in Ruby:
Ruby supplies serialization capabilities through its built-in module, Marshal. Other libraries, such as YAML and JSON, can also be used in Ruby when platform independence or a human-readable format is needed.
Types of Serialization
Serialization in Ruby can be done in two ways. The object in memory can be converted into a human-readable format such as YAML (YAML Ain’t Markup Language) or JSON (JavaScript Object Notation), or it can be converted into a binary format.
Converting Ruby Objects in Human Readable Formats
The conversion of Ruby objects into the YAML and JSON formats is explained below.
Converting Ruby Objects to YAML format
YAML[3] (YAML Ain't Markup Language) is a human-friendly data serialization standard available for many programming languages, and is perhaps the most common form of serialization in Ruby applications. It is used for configuration files in Rails and other projects, and is nearly ubiquitous. YAML is a plain-text format, as opposed to Marshal's[4] binary format, so objects stored as YAML are completely transparent and can be edited with nothing more than a text editor. Its simple, spartan syntax is easy to read and easy to type, and it is not encumbered by the excess words and symbols seen in XML. Any Ruby object can easily be serialized into YAML format. Consider the code below:
require "yaml"
class First
  def initialize(name, age, country)
    @name = name
    @age = age
    @country = country
  end
  def to_s
    "In First:\n#{@name}, #{@age}, #{@country}\n"
  end
end
class Second
  def initialize(address, details)
    @address = address
    @details = details
  end
  def to_s
    "In Second:\n#{@details.to_s}#{@address}\n"
  end
end
x = First.new("Tom", 25, "USA")
y = Second.new("St. Marks Street", x)
puts y
The output is the string representation of the object tree (object hierarchy), because puts calls the to_s[5] method of each object.
Output:
In Second:
In First:
Tom, 25, USA
St. Marks Street
We use the code below to serialize our object tree.
serialized_object = YAML::dump(y)
puts serialized_object
The dump function serializes the object tree and stores the data in the YAML format in the variable serialized_object.
Data in the serialized (YAML) format looks like this:
--- !ruby/object:Second
address: St. Marks Street
details: !ruby/object:First
  name: Tom
  age: 25
  country: USA
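Now, to de-serialize the data, we use the load function. (On Ruby 3.1 and later, the bundled YAML library restricts which classes load will rebuild, so custom classes such as these need YAML.unsafe_load or an explicit list of permitted classes.)
puts YAML::load(serialized_object)
The data is converted back into a Ruby object tree.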
Output:
In Second:
In First:
Tom, 25, USA
St. Marks Street
Thus we get back our original Object tree.
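When the YAML text may come from an untrusted source, rebuilding arbitrary classes is risky. The sketch below shows one way to restrict de-serialization to known classes with `YAML.safe_load`. It assumes Ruby 2.6 or later (where `permitted_classes` is a keyword argument); `Point` is a hypothetical class used only for illustration.

```ruby
require "yaml"

# Hypothetical class used only for this illustration.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

yaml_text = YAML.dump(Point.new(1, 2))

# safe_load only rebuilds classes that are explicitly permitted.
restored = YAML.safe_load(yaml_text, permitted_classes: [Point])
puts restored.x  # 1
puts restored.y  # 2

# Without the permission list, the same input is rejected.
begin
  YAML.safe_load(yaml_text)
rescue Psych::DisallowedClass => e
  puts "rejected: #{e.class}"
end
```

Listing the permitted classes keeps the convenience of object round trips while refusing anything unexpected in the input.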
Converting Ruby Objects to JSON format:
JSON[6] is a lightweight data-interchange format. JSON is typically generated by web applications and can be quite daunting, with deep hierarchies that are difficult to navigate. Any Ruby object can easily be serialized into JSON format. In Ruby 1.9.2, the json gem is bundled with the core Ruby distribution, so if you are using 1.9.2 you are probably all set. On Ruby 1.8.7, you will need to install the gem.[7] The JSON library can be installed using RubyGems[8] as shown below:
# gem install json
We can create a JSON string for serialization by using the JSON.generate method as below:
require 'json'
my_hash = {:Welcome => "Ruby"}
puts JSON.generate(my_hash)
Output:
{"Welcome":"Ruby"}
We can parse a JSON string received from another program by using JSON.parse, which converts the String into a Hash.
require 'json'
my_hash = JSON.parse('{"Welcome": "Ruby"}')
puts my_hash["Welcome"]
Output:
Ruby
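The examples above use a plain Hash. For an object of a user-defined class, one common approach is to convert it to a Hash before generating JSON and to rebuild it from the parsed Hash afterwards. The following is a minimal sketch of that idea; the `Person` class and its `to_h` and `from_h` methods are names chosen here for illustration, not part of the JSON library.

```ruby
require "json"

# Hypothetical class for illustration; to_h and from_h are names chosen here.
class Person
  attr_reader :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end

  # Expose the state as a Hash that JSON.generate understands.
  def to_h
    { "name" => @name, "age" => @age }
  end

  # Rebuild an object from a Hash produced by JSON.parse.
  def self.from_h(hash)
    new(hash["name"], hash["age"])
  end
end

json_text = JSON.generate(Person.new("Tom", 25).to_h)
puts json_text  # {"name":"Tom","age":25}

person = Person.from_h(JSON.parse(json_text))
puts person.name  # Tom
puts person.age   # 25
```

Because the JSON text carries no class information, the receiving side has to know which class to rebuild, which is also what makes the format usable from other languages.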
Converting Ruby Objects to Binary Formats
Binary serialization is another form of serialization in Ruby; its output is not human readable. It is used in much the same way as YAML serialization (through dump and load) and is done using Marshal[9]. Binary serialization is used when a high-performance serialization and de-serialization process is required and the contents do not need to be in a readable format.
Since binary serialized data is not in human-readable form, two essential guidelines need to be followed when writing it to a file:
1. Use print[10] instead of puts[11] when serialized objects are written to a file, so that no newline characters are added to the file.
2. Use a record separator to mark the boundary between two objects.
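As an alternative to the print-plus-separator approach described above, Marshal.dump can write directly to an IO object and Marshal.load can read one object at a time from it, so the records delimit themselves. The sketch below illustrates this under that assumption; `write_records` and `read_records` are hypothetical helper names, and a temporary file stands in for a real data file.

```ruby
require "tempfile"

# Hypothetical helper: writes each object as one Marshal record.
# Binary mode ("wb") avoids any newline translation.
def write_records(path, objects)
  File.open(path, "wb") do |file|
    objects.each { |object| Marshal.dump(object, file) }
  end
end

# Hypothetical helper: reads records back until the end of the file.
def read_records(path)
  records = []
  File.open(path, "rb") do |file|
    records << Marshal.load(file) until file.eof?
  end
  records
end

Tempfile.create("records") do |tmp|
  tmp.close
  write_records(tmp.path, [{ "name" => "Kitty Kat" }, [1, 2, 3], "plain string"])
  puts read_records(tmp.path).inspect
end
```

Opening the file in binary mode serves the same purpose as the first guideline: nothing is added to or altered in the byte stream.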
Binary Serialization Example
class Animal
  def initialize(name, age)
    @name = name
    @age = age
  end
end
class Cat < Animal
  def to_s
    "In Cat C: #{@name} \t #{@age}"
  end
end
class Dog < Animal
  def to_s
    # Return the string; calling puts here would make to_s return nil
    "In Dog D: #{@name} \t #{@age}"
  end
end
d = Dog.new("Doggy Dig", 4)
c = Cat.new("Kitty Kat", 5)
puts "Before Serialization"
puts c
puts d
serialize_cat = Marshal.dump(c) # serializes the cat object into a binary string
serialize_dog = Marshal.dump(d) # serializes the dog object into a binary string
deserialize_cat = Marshal.load(serialize_cat) # rebuilds the cat object from the binary string
deserialize_dog = Marshal.load(serialize_dog) # rebuilds the dog object from the binary string
puts "After Serialization #{deserialize_cat}"
puts "After Dog Serialization #{deserialize_dog}"
Output
Before Serialization
In Cat C: Kitty Kat 5
In Dog D: Doggy Dig 4
After Serialization In Cat C: Kitty Kat 5
After Dog Serialization In Dog D: Doggy Dig 4
Serialization in OOLS Languages: Comparison
Sl.No | Ruby | Java | .NET Framework | C++ |
---|---|---|---|---|
1 | Ruby provides a built-in module called Marshal for serialization. | Java provides an interface named Serializable for classes to implement. | .NET provides a Serializable attribute. | Although there is no built-in support for serialization in C++, it can be achieved by using the Boost libraries. |
2 | The built-in module of Ruby (Marshal) does not support platform independence; however, it can be achieved by using libraries such as YAML and JSON. | Similarly, Java's built-in serialization is not platform independent; to use Java serialization together with Ruby, JRuby should be used. | .NET uses Remoting technology to make it platform independent. | Serialization using the Boost libraries is not platform independent. |
3 | YAML provides a method (to_yaml_properties) which can be used to select the variables whose values need to be serialized. With Marshal, we need to write a method named marshal_dump that defines which variables of an object are serialized. | Java provides an option to serialize only the required attributes of an object. The keyword transient marks data that does not need to be serialized. | The XML serializer ignores the default serialization of a field or a property when its XmlIgnore property is set to true. | Serialization using the Boost libraries is custom, so the user can specify which parts of an object are serialized. |
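The Marshal entry in row 3 can be sketched as follows. Marshal calls marshal_dump to obtain the data to write and marshal_load to restore it on a newly allocated object. The `Account` class and its `cache` variable are hypothetical, introduced only to show a field that is left out of the serialized form.

```ruby
# Hypothetical class: `cache` stands for data we choose not to serialize.
class Account
  attr_reader :owner, :balance, :cache

  def initialize(owner, balance)
    @owner = owner
    @balance = balance
    @cache = { "computed_at" => Time.now }
  end

  # Marshal writes this return value instead of all instance variables.
  def marshal_dump
    [@owner, @balance]
  end

  # Marshal calls this on a freshly allocated object when loading.
  def marshal_load(data)
    @owner, @balance = data
    @cache = {}
  end
end

copy = Marshal.load(Marshal.dump(Account.new("Tom", 100)))
puts copy.owner         # Tom
puts copy.balance       # 100
puts copy.cache.empty?  # true
```

Note that initialize is not called during loading, so marshal_load is responsible for putting every instance variable into a usable state.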
See Also
3. Article on Rails Serialization
References
2. YAML
3. JSON
4. Serializing and De-serializing in Ruby
5. Serialization and De-serialization
6. Marshal
7. Object Serialization Techniques
11. JSON