CSC/ECE 517 Fall 2014/ch1a 25 ks

From Expertiza_Wiki
Revision as of 04:05, 18 September 2014 by Sdrangne (talk | contribs)
Jump to navigation Jump to search

Object Relational Mapping

Introduction

Object Relational Mapping (ORM, O/RM, and O/R mapping) in computer science is a programming technique for converting data between incompatible type systems in object-oriented programming languages. This creates, in effect, a "virtual object database" that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to create their own ORM tools.

Overview

Onject Relational Mapping is a programming technique in which a metadata descriptor is used to connect object code to a relational database. ORM converts data between type systems that are unable to coexist within relational databases and OOP languages.

In object-oriented programming, data management tasks act on object-oriented (OO) objects that are almost always non-scalar values. For example, consider an address book entry that represents a single person along with zero or more phone numbers and zero or more addresses. This could be modeled in an object-oriented implementation by a "Person object" with attributes/fields to hold each data item that the entry comprises: the person's name, a list of phone numbers, and a list of addresses. The list of phone numbers would itself contain "PhoneNumber objects" and so on. The address book entry is treated as a single object by the programming language (it can be referenced by a single variable containing a pointer to the object, for instance). Various methods can be associated with the object, such as a method to return the preferred phone number, the home address, and so on.

The heart of the problem is translating the logical representation of the objects into an atomized form that is capable of being stored in the database, while preserving the properties of the objects and their relationships so that they can be reloaded as objects when needed. If this storage and retrieval functionality is implemented, the objects are said to be persistent.

Simple Explanation

A simple answer is that you wrap your tables or stored procedures in classes in your programming language, so that instead of writing SQL statements to interact with your database, you use methods and properties of objects.

In other words, instead of something like this:

String sql = "SELECT ... FROM persons WHERE id = 10"
DbCommand cmd = new DbCommand(connection, sql);
Result res = cmd.Execute();
String name = res[0]["FIRST_NAME"];

you do something like this:

Person p = repository.GetPerson(10);
String name = p.FirstName;

or similar code (lots of variations here.) Some frameworks also put a lot of the code in as static methods on the classes themselves, which means you could do something like this instead:

Person p = Person.Get(10);

Some also implement complex query systems, so you could do this:

Person p = Person.Get(Person.Properties.Id == 10);

The framework is what makes this code possible.

Features

Object-relational Mapping (ORM) frameworks unburden the designer of the complex translation between database and object space.

Typical ORM features:

  • Automatic mapping from classes to database tables
    • Class instance variables to database columns
    • Class instances to table rows
  • Aggregation and association relationships between mapped classes are managed
    • Example, :has_many, :belongs_to associations in ActiveRecord
    • Inheritance cases are mapped to tables
  • Validation of data prior to table storage
  • Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records
  • Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)

The diagram below depicts a simple mapping of an object to a database table….

ORM framework is used to persist model objects to a relational database and retrieve them, and the ORM framework will take care of converting the data between the two otherwise incompatible states. Most ORM tools rely heavily on metadata about both the database and objects, so that the objects need to know nothing about the database and the database doesn’t need to know anything about how the data is structured in the application. ORM provides a clean separation of concerns in a well-designed data application, and the database and application can each work with data in its native form. Database rows map to objects, thus making the program more easily accessible and allowing the usage of information in a way that is internally consistent and easy to understand. The ORM gives the programmer an ability to manipulate data with the programming language, instead of having to manipulate each attribute as its data type as obtained by the database management system.

When applied to Ruby, implementations of ORM often leverage the language’s metaprogramming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With the addition of Rails, the ORM becomes much more important, as it is necessary to have an ORM to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be done through changes to these models, independent of the type of database used.

With the release of Rails 3.0, the platform became ORM independent. With this change, it is much easier to use the ORM most preferred by the programmer, rather than being corralled into using one particular one. To allow this, the ORM needed to be extracted from the model and the database, to be a pure mediator between the two. Then any ORM can be used as long as it can successfully understand the model and the database used.

Advantages

In addition to the data access technique, ORM's benefits also include:

  • First of all, you hide the SQL away from your logic code. This has the benefit of allowing you to more easily support more database engines. For instance, MS SQL Server and Oracle has different names on typical functions, and different ways to do calculations with dates, so a query to "get me all persons edited the last 24 hours" might entail different SQL syntax just for those two database engines. This difference can be put away from your logic code.
  • Additionally, you can focus on writing the logic, instead of getting all the SQL right. The code will typically be more readable as well, since it doesn't contain all the "plumbing" necessary to talk to the database.
  • Simplified development because it automates object-to-table and table-to-object conversion, resulting in lower development and maintenance costs
  • Less code compared to embedded SQL and handwritten stored procedures
  • Transparent object caching in the application tier, improving system performance
  • An optimized solution making an application faster and easier to maintain

Disadvantages

  • ORM’s emergence in multiple application development has created disagreement among experts. Key concerns are that ORM does not perform well and that stored procedures might be a better solution.
  • In addition, ORM dependence may result in poorly-designed databases in certain circumstances.
  • Performance – like every "proxy" technology
  • Complexity – learning curve
  • Difficulty / inability to make complex queries

Active Records

Active Record was described by Martin Fowler in his book Patterns of Enterprise Application Architecture. Active Record is the M in MVC- the model - which is the layer of the system responsible for representing business data and logic. In Active Record, objects carry both persistent data and behavior which operates on that data. A database table or view is wrapped into a class and an object instance is tied to a single row in the table. Each Active Record object has CRUD (Create, Read, Update, and Delete) methods for database access. In addition, Active Record allows you to validate the state of a model before it gets written into the database.

Active Record gives us several mechanisms, the most important being the ability to:

  • Represent models and their data.
  • Represent associations between these models.
  • Represent inheritance hierarchies through related models.
  • Validate models before they get persisted to the database.
  • Perform database operations in an object-oriented fashion.

The naming conventions used in Active Records are :

  • Database Table - Plural with underscores separating words (e.g., book_clubs).
  • Model Class - Singular with the first letter of each word capitalized (e.g., BookClub).

To create a table, ActiveRecord makes use of a migration class rather than including table definition in the actual class being modeled. As an example, the following code creates a table ‘’users’’ to store a collection of class ‘’User’’ objects:

class CreateUsers < ActiveRecord::Migration
  def self.up
	create_table :users do |t|
  	t.string :name
  	t.string :email
  	t.string :age
            	  t.references :cheer
            	  t.references :post
 
  	t.timestamps 	# add creation and modification timestamps
	end
  end
 
  def self.down    	# undo the table creation
	drop_table :users
  end
end

The ActiveRecord’s migration scheme provides a means for backing out changes via the ‘’self.down’’ method. In addition, a table key ‘’id’’ is added to the new table without an explicit definition and record creation and modification timestamps are included via the ‘’timestamps’’ method provided by ActiveRecord.

ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition. For example, the ‘’User’’ class shown below defines a ‘’has_many’’ (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the references operator. Also note ActiveRecord’s integrated support for validation of table information when attempting an update.

class User < ActiveRecord::Base
  has_many :cheers
  has_many :posts
 
  validates_presence_of :name
  validates_presence_of :age
  validates_uniqueness_of :name
  validates_length_of :name, :within => 3..20
end

To access the data in the table one calls the class function related to that column. For example, to find the user who's name is Bob, one would use @user = User.find_by_name('Bob'). Then one could find Bob's e-mail by using @user.email. Users can be searched for by any attribute in this way, as find methods are created for every combination of attributes. While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent class hierarchy, its base scheme is a single table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.

Pros -

  • Integrated with popular Rails development framework
  • Dynamically created database search methods (eg: User.find_by_address) ease db queries and make queries database syntax independent
  • DB creation/management using the migrate scheme provides a means for backing out unwanted table changes

Cons -

  • DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with application.

ORM Adaptor

ORM Adaptors provide a single point of entry for popular ruby ORMs. Its target audience is gem authors who want to support more than one ORM. ORM Adapter's goal is to support a minimum API used by most of the plugins that needs agnosticism beyond Active Model. ORM Adapter will support only basic methods, as get, find_first, create! and so forth. It is not ORM Adapter's goal to support different query constructions, handle table joins, etc. ORM adapter provides a consistent API for these basic class or 'factory' methods. It does not attempt to unify the behaviour of model instances returned by these methods. This means that unifying the behaviour of methods such as `model.save`, and `model.valid?` is beyond the scope of orm_adapter. If you need complex queries, it is recommended to subclass ORM Adapters in your plugin and extend it expressing these query conditions as part of your domain logic.

Example :
require 'orm_adapter'
User                                   	        # is it an ActiveRecord, DM Resource, MongoMapper or MongoId Document?
User.to_adapter.find_first :name => 'Fred'     	# we don't care!
user_model = User.to_adapter
user_model.get!(1)                              # find a record by id
user_model.find_first(:name => 'fred')          # find first fred
user_model.find_first(:level => 'awesome', :id => 23)   # find user 23, only if it's level is awesome
user_model.find_all                             # find all users
user_model.find_all(:name => 'fred')            # find all freds
user_model.find_all(:order => :name)            # find all freds, ordered by name
user_model.create!(:name => 'fred')             # create a fred
user_model.destroy(object)                    	# destroy the user object

Sequel

Sequel was originally developed by Sharon Rosner and the first release was in March 2007. It is based on the active record pattern. Sequel and Active Record share a lot of common features , for example association and inheritance. But Sequel handles these features in a much more flexible manner. Currently Sequel is at version 3.44.0. Initially Sequel had three core modules - sequel, sequel_core and sequel_model. Starting from version 1.4 , sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.

class CreateUser < Sequel::Migration
  def up
    create_table(:user) {
      primary_key :id
      String :name
      String :age
      String :email}
  end
  def down
    drop_table(:user)
  end
end # CreateUser.apply(DB, :up)

Sequel supports associations and validations similar to Active Record. The following example shows how validations and associations can be enforced in the User table that has been created above. It enforces one to many relationships between the user table and the cheers and posts tables. It also validates for the presence , uniqueness and the length of the attribute name.

class User < Sequel::Model
  one_to_many :cheers
  one_to_many :posts
  
  validates_presence [:name, :age]
  validates_unique(:name)
  validates_length_range 3..20, :name
end

To access the data in Sequel, one can use where clauses. For instance, to find Bob again one would use DB[:items].where(Sequel.like(:name, 'Bob'). While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as Active Record's dynamically created find methods.

Some of the key features of sequel are,

  • Connection Pooling
  • Thread Safety
  • Eager Loading / Lazy Loading
  • Model Caching
  • Supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.

Data Mapper

DataMapper is an Object Relational Mapper written in Ruby developed by Sam Smoot with the goal to create an ORM which is fast, thread-safe and feature rich. A Data Mapper is a Data Access Layer that performs bidirectional transfer of data between a persistent data store (often a relational database) and an in memory data representation (the domain layer). The goal of the pattern is to keep the in memory representation and the persistent data store independent of each other and the data mapper itself. The layer is composed of one or more mappers (or Data Access Objects), performing the data transfer. DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and some web services. With DataMapper, you define your mappings in your model. Your data store can develop independently of your model using Migrations.

Some features of Data Mapper are as follows :

  • No need to write structural migrations
  • Scoped relations
  • Lazy loading on certain attribute types
  • Strategic eager loading to avoid (N+1) queries
  • Default support for composite and natural keys
  • Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation)
  • An API not too heavily oriented to SQL databases

DataMapper was designed to be a more abstract ORM, not strictly SQL, based on Martin Fowler's enterprise pattern. As a result, DataMapper adapters have been built for other non-SQL databases, such as CouchDB, Apache Solr, and webservices such as Salesforce.

Example of table definition in the model:

class User
  include DataMapper::Resource
 
  property :id,     	Serial	# key
  property :name,   	String, :required => true, :unique => true 	
  property :age,    	String, :required => true, :length => 3..20
  property :email,  	String
 
  has n, :posts      	# one to many association
  has n, :cheers      	# one to many association
end

Notice that in the example above , ":required = true" is an example for Auto Validation. Unlike ActiveRecord and Sequel, DataMapper supports auto validations , i.e. these in turn call the validation helpers to enforce basic validations such as length, uniqueness, format, presence etc.

MyBatis/iBatis

MyBatis/iBatis was a persistence framework which allowed easy access of the database from a Rails application without being a full ORM. It was created by the Apache Foundation in 2002, and is available for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.

The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using a XML descriptor. To use the MyBatis data mapper,we make use of our own objects, XML, and SQL.

SQL statements are stored in XML files or annotations. Following is a MyBatis mapper, that consists of a Java interface with some MyBatis annotations:

package org.mybatis.example;
 
public interface BlogMapper {
    @Select("select * from Blog where id = #{id}")
    Blog selectBlog(int id);
}

The sentence is executed as follows.

BlogMapper mapper = session.getMapper(BlogMapper.class);
Blog blog = mapper.selectBlog(101);

It can also be executed using MyBatis API.

Blog blog = session.selectOne("org.mybatis.example.BlogMapper.selectBlog", 101);

SQL statements and mappings can also be externalized to an XML file like this.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
 
<mapper namespace="org.mybatis.example.BlogMapper">
    <select id="selectBlog" parameterType="int" resultType="Blog">
        select * from Blog where id = #{id}
    </select>
</mapper>

Squeel

Squeelunlocks the power of Arel in Rails applications with a handy block-based syntax. It is supporting in Rails 3 and 4. With Squeel, you can write subqueries, access named functions provided by RDMBS and more without writing SQL strings. Squeel lets you write your Active Record queries with fewer strings, and more Ruby, by making the Arel awesomeness that lies beneath Active Record more accessible.

Squeel lets you rewrite...

Article.where ['created_at >= ?', 2.weeks.ago]
...as...
Article.where{created_at >= 2.weeks.ago}

Squeel enhances the normal Active Record query methods by enabling them to accept blocks. Inside a block, the Squeel query DSL can be used. Note the use of curly braces in the above example instead of parentheses. {} denotes a Squeel DSL query.

Stubs and keypaths are the two primary building blocks used in a Squeel DSL query.

Stubs are, for most intents and purposes, just like Symbols in a normal call to Relation#where (note the need for doubling up on the curly braces here, the first ones start the block, the second are the hash braces):

Person.where{{name => ‘Ernie’ }}

You normally wouldn't bother using the DSL in this case, as a simple hash would suffice. However, stubs serve as a building block for keypaths, and keypaths are very handy.

A Squeel keypath is essentially a more concise and readable alternative to a deeply nested hash. For instance, in standard Active Record, you might join several associations like this to perform a query:

Person.joins(:articles => { :comments => :person}).references(:all)

With a keypath, this would look like:

Person.joins{articles.comments.person}.references(:all)

The Squeel DSL works its magic using instance_eval which means that inside a Squeel DSL block, self isn't the same thing that it is outside the block. This carries with it an important implication: Instance variables and instance methods inside the block won't refer to your object's variables/methods. Use one of the following methods to get access to the object's methods and variables:

  • Assign the variable locally before the DSL block, and access it as you would normally.
  • Supply an arity to the DSL block, as in Person.where{|q| q.name == @my_name} Downside: You'll need to prefix stubs, keypaths, and functions with the DSL object.
  • Wrap the method or instance variable inside the block with my{}. Person.where{name == my{some_method_to_return_a_name}}

Merb

Merb, short for "Mongrel+Erb", is a model view controller framework written in Ruby. Merb was merged into Rails web framework on December 23, 2008 as part of the Ruby on Rails 3.0 release. Merb itself provides only the controller of an MVC model which can be extended to create a full-stack application environment. This effectively means Merb itself is not an ORM but can accommodate other ORMs. Some features of Merb are as follows :

  • Speed - Merb is ORM, JavaScript library and template language agnostic, preferring plugins that add support for features rather than producing a monolithic library with everything in the core.
  • Lightweight - Rather than trying to cram every feature into a single code, things are kept bare minimum without sacrificing anything important.
  • Powerful and extensible - For any features not covered in Merb’s core, there are plugins.

Some basic Command distinction between Ruby on Rails and Merb :

Action Ruby on Rails Merb
Create new application 'app1' rails new app1 merb-gen app app1
Start server rails server merb
Start cluster of 3 beginning at port 3000 N/A merb -p 3000 -c 3
Interactive console rails console merb -i

NonSQL databases

Instead of ORM we can also use Object-Oriented Database Management System (OODBMS) or document-oriented database like XML, where in databases are designed to store object oriented values, thus eliminating the need to convert data to and from its SQL form. Document oriented databases also eliminates the need to retrieve objects as data rows. Query languages like XQuery can be used to retrieve data sets.

The only issue is we won't be able to create application independent queries for retrieving data without restrictions to access path. Also OODBMS limits the extent of processing SQL queries. A Relational database allows concurrent access to data, locking, indexing (fast search), as well as many other features that are not available while using XML files.

Other OODBMS (such as RavenDB) provide replication to SQL databases, as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.

NonSQL databases don’t provide mechanism to maintain relationship between tables. In real life though, business objects or entities do have relationship among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scene

Comparison of ORM Features

Features ActiveRecord Sequel DataMapper MyBatis

Databases

MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2 ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3 SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2

Migrations

Yes Yes Yes, but optional Yes, provides migration by Mybatis schema migration system.

EagerLoading

Supported by scanning the SQL fragments Supported using eager (preloading) and eager_graph (joins) Strategic Eager Loading and by using :summary Yes.this can be enabled or disabled by setting the lazyLoadingEnabled flag

Flexible Overriding

No. Overriding is done using alias methods. Using methods and by calling 'super' Using methods No

Dynamic Finders

Yes. Uses 'Method Missing' No. Alternative is to use <Model>.FindOrCreate(:name=>"John") Yes. Using the dm_ar_finders plugin Similar feature is supported in the form of Dynamic SQL

Further Reading

To get more information on ORM and the different type of ORM's available for Ruby , please look into the CSC/ECE 517 Spring 2013/ch1 1d zk and Fall 2010 detailed ORM explanation wiki pages.

Future Work

More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs.

References

  1. Object Relational Mapping
  2. object-oriented programming
  3. data access
  4. metaprogramming
  5. Active Record
  6. MVC
  7. Sequel
  8. DataMapper
  9. Eager Loading
  10. Squeel
  11. ORM
  12. Merb
  13. Merb Website
  14. Object Relational Mapping Spring 2013