CSC/ECE 517 Fall 2013/ch1 1w43 sm

From Expertiza_Wiki
Jump to navigation Jump to search

Object Relational Mapping

Introduction

Object-relational mapping (ORM), or ORM, is a programming technique which associates data stored in a relational database with application objects. The ORM layer populates business objects on demand and persists them back into the relational database when updated. ORM basically creates a "virtual object database" that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to create their own ORM tools.In this wiki, we concenterate mainly on the top Ruby ORM's like Active Record,Sequel and DataMapper.Also, it includes a comparison of various ORM techniques and the advantages and disadvantages of various styles of ORM mapping.

Need For ORM

Need for ORM arose due to a major problem called the Object-Relational Impedance MisMatch Problem. Object-Relational Impedance MisMatch Problem is a set of conceptual and technical difficulties, encountered when a relational database management system (RDBMS) is being used by an object-oriented program particularly when object/class definitions are mapped directly to database tables or schema.It implies that object models don't work together with relational models, i.e RDBMS represents data in tabular format while Object oriented languages represent data in the form of interconnected graph of objects. The Mismatches that occur in Object-Oriented concepts are as follows:

  • Encapsulation:

Mapping of private object representation to database tables is difficult due to availability of fewer constraints for the design of private representation as opposed to public data in RDBMS.

  • Interface, Class, Inheritance and polymorphism:

Under an object-oriented paradigm, objects have interfaces that together provide the only access to the internals of that object. The relational model, on the other hand, utilizes derived relation variables (views) to provide varying perspectives and constraints to ensure integrity. Similarly, essential OOP concepts for classes of objects, inheritance and polymorphism are not supported by relational database systems.

  • Identity:

RDBMS defines exactly one notion of 'sameness': the primary key. However, Object-oriented languages, for example Java, defines both object identity (a==b) and object equality (a.equals(b)).

  • Associations:

Associations are represented as unidirectional references in Object Oriented languages whereas RDBMS uses the notion of foreign keys. For example, If you need bidirectional relationships in Java, you must define the association twice.Likewise, you cannot determine the multiplicity of a relationship by looking at the object domain model.

  • Data navigation:

The way you access data in object-oriented languages is fundamentally different than the way you do it in a relational database. Like in Java, you navigate from one association to another walking the object network. This is not an efficient way of retrieving data from a relational database. You typically want to minimize the number of SQL queries and thus load several entities via JOINs and select the targeted entities before you start walking the object network.

Overview

Data management tasks in object-oriented (OO) programming are typically implemented by manipulating objects that are almost always non-scalar values. Like for example, Consider a Geometric object, an entity which can have shape and color.In the object oriented implementation, this entity would be modeled as a geometric object with shape and color as its attributes.Shape itself can contain objects like for triangles(equilateral or isosceles) and so on.Here each geometric object can be treated as a single object by any programming language and its attributes can be also accessed by various object methods.However, many popular database products such as structured query language database management systems (SQL DBMS) can only store and manipulate scalar values such as integers and strings organized within tables.

Hence, the programmer must either convert the object values into groups of simpler values for storage in the database (and convert them back upon retrieval), or only use simple scalar values within the program. Object-relational mapping is used to implement the first approach.The key is to translate the logical representation of the objects into an atomized form that is capable of being stored in the database, while somehow preserving the properties of the objects and their relationships so that they can be reloaded as an object whenever required. If this storage and retrieval functionality is implemented, the objects are then said to be persistent.

Object-Relational Mapping, commonly referred to as its abbreviation ORM, is a technique that connects the rich objects of an application to tables in a relational database management system. Using ORM, the properties and relationships of the objects in an application can be easily stored and retrieved from a database without writing SQL statements directly and with less overall database access code.

Features

Typical ORM features:

  • Automatic mapping from classes to database tables
    • Class instance variables to database columns
    • Class instances to table rows
  • Aggregation and association relationships between mapped classes are managed
    • Example, :has_many, :belongs_to associations in ActiveRecord
    • Inheritance cases are mapped to tables
  • Validation of data prior to table storage
  • Class extensions to enable search, as well as creation, read, update, and deletion (CRUD) of instances/records
  • Usually abstracts the database from program space in such a way that alternate database types can be easily chosen (SQLite, Oracle, etc)

The diagram below depicts a simple mapping of an object to a database table

ORM makes access to database through the application extremely easy, as it provides connection from model of the program to the database.Treating database rows as objects, Program can easily access the information in more consistent manner.Also it gives flexibility to the programmers to manipulate the data with the language itself rather than working on the attributes retrieved from the database.

When applied to Ruby, implementations of ORM often leverage the language’s Meta-programming strengths to create intuitive application-specific methods and otherwise extend classes to support database functionality. With the addition of Rails, the ORM becomes much more important, as it is necessary to have an ORM to connect the models of the MVC (model-view-controller) stack used by Ruby on Rails with the application's database. Since the models are Ruby objects, the ORM allows modifications to the database to be done through changes to these models, independent of the type of database used. With the release of Rails 3.0, the platform became ORM independent. With this change, it is much easier to use the ORM most preferred by the programmer, rather than being corralled into using one particular one. To allow this, the ORM needs to be extracted from the model and the database, and be a pure mediator between the two. Then any ORM can be used as long as it can successfully understand the model and the database used.

Active Record

ActiveRecord insulates you from the need to use raw SQL to find database records.Active Record facilitates the creation and use of business objects whose data requires persistent storage to a database.

ActiveRecord is the default ‘model’ component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications. In both forms, it was conceived of by David Heinemeier Hansson, and has been improved upon by a number of contributors.

Other, less popular ORMs have been released since ActiveRecord first took the stage. For example, DataMapper and Sequel show major improvements over the original ActiveRecord framework.[neutrality is disputed] As a response to their release and adoption by the Rails community, Ruby on Rails v3.0 is independent of an ORM system, so Rails users can easily plug in DataMapper or Sequel to use as their ORM of choice.

To create a table, ActiveRecord makes use of a migration class rather than including table definition in the actual class being modeled. As an example, the following code creates a table ‘’users’’ to store a collection of class ‘’User’’ objects:

class CreateUsers < ActiveRecord::Migration
  def self.up
    create_table :users do |t|
      t.string :name
      t.string :email
      t.string :age
	  t.references :cheer
	  t.references :post

      t.timestamps     # add creation and modification timestamps
    end
  end

  def self.down        # undo the table creation
    drop_table :users
  end
end

Note in the preceding example, that the ActiveRecord’s migration scheme provides a means for backing out changes via the ‘’self.down’’ method. In addition, a table key ‘’id’’ is added to the new table without an explicit definition and record creation and modification timestamps are included via the ‘’timestamps’’ method provided by ActiveRecord.

ActiveRecord manages associations between table elements and provides the means to define such associations in the model definition. For example, the ‘’User’’ class shown below defines a ‘’has_many’’ (one-to-many) association with both the cheers and posts tables. These associations are represented in the migration class with the references operator. Also note ActiveRecord’s integrated support for validation of table information when attempting an update.

class User < ActiveRecord::Base
  has_many :cheers
  has_many :posts
  
  validates_presence_of :name
  validates_presence_of :age
  validates_uniqueness_of :name
  validates_length_of :name, :within => 3..20
end

To access the data in the table one calls the class function related to that column. For example, to find the user whose name is Bob, one would use @user = User.find_by_name('Bob'). Then one could find Bob's e-mail by using @user.email. Users can be searched for by any attribute in this way, as find methods are created for every combination of attributes. While ActiveRecord provides the flexibility to create more sophisticated table relationships to represent class hierarchy, its base scheme is a single table inheritance, which trades some storage efficiency for simplicity in the database design. The following figure illustrates this concept of simplicity over efficiency.

Active Record will perform queries on the database for you and is compatible with most database systems (MySQL, PostgreSQL and SQLite to name a few). Regardless of which database system you are using, the ActiveRecord method format will always be the same. To retrieve objects from the database, Active Record provides several finder methods. Each finder method allows you to pass arguments into it to perform certain queries on your database without writing raw SQL.

Pros

  • Integrated with popular Rails development framework.
  • Dynamically created database search methods (eg: User.find_by_address) ease db queries and make queries database syntax independent.
  • DB creation/management using the migrate scheme provides a means for backing out unwanted table changes.

Cons

  • DB creation/management is decoupled from the model, requiring a separate utility (rake/migrate) that must be kept in sync with application.

Sequel

Sequel is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application. Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible.The current sequel version is 4.2.0. Initially Sequel had three core modules - sequel, sequel_core and sequel_model. Starting from version 1.4 , sequel and sequel_model were merged. Sequel handles validations using a validation plug-in and helpers.

Sequel::Model is an object relational mapper built on top of Sequel core. Each model class is backed by a dataset instance, and many dataset methods can be called directly on the class. Model datasets return rows as model instances, which have fairly standard ORM instance behavior.

Sequel::Model is built completely out of plugins. Plugins can override any class, instance, or dataset method defined by a previous plugin and call super to get the default behavior

Features -

  • Sequel provides thread safety, connection pooling and a concise DSL for constructing SQL *queries and table schemas.
  • Sequel includes a comprehensive ORM layer for mapping records to Ruby objects and handling associated records.
  • Sequel supports advanced database features such as prepared statements, bound variables, stored procedures, savepoints, two-phase commit, transaction isolation, master/slave configurations, and database sharding.
  • Sequel currently has adapters for ADO, Amalgalite, CUBRID, DataObjects, DB2, DBI, Firebird, IBM_DB, Informix, JDBC, MySQL, Mysql2, ODBC, OpenBase, Oracle, PostgreSQL, SQLite3, Swift, and TinyTDS.
class CreateUser < Sequel::Migration
  def up
    create_table(:user) {
      primary_key :id
      String :name
      String :age
      String :email}
  end
  def down
    drop_table(:user)
  end
end # CreateUser.apply(DB, :up)

Sequel supports associations and validations similar to ActiveRecord. The following example shows how validations and associations can be enforced in the User table that has been created above. It enforces one to many relationships between the user table and the cheers and posts tables. It also validates for the presence , uniqueness and the length of the attribute name.

class User < Sequel::Model
  one_to_many :cheers
  one_to_many :posts
  
  validates_presence [:name, :age]
  validates_unique(:name)
  validates_length_range 3..20, :name
end

To access the data in Sequel, one can use where clauses. For instance, to find Bob again one would use DB[:items].where(Sequel.like(:name, 'Bob'). While the ability to use SQL-like where clauses is quite flexible, it is not quite as pure an object-oriented approach as Active Record's dynamically created find methods.

DataMapper

DataMapper is an Object Relational Mapper written in Ruby. The goal is to create an ORM which is fast, thread-safe and feature rich. DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular web-services.

Features-

  • Eager loading of child associations to avoid (N+1) queries.
  • Lazy loading of select properties, e.g., larger fields.
  • Query chaining, and not evaluating the query until absolutely necessary (using a lazy array implementation).
  • An API not too heavily oriented to SQL databases.

DataMapper was designed to be a more abstract ORM, not strictly SQL, based on Martin Fowler's enterprise pattern. As a result, DataMapper adapters have been built for other non-SQL databases, such as CouchDB,Apache Solr and web services such as Salesforce.

Example of table definition in the model:

class User
  include DataMapper::Resource

  property :id,         Serial    # key
  property :name,       String, :required => true, :unique => true     
  property :age,        String, :required => true, :length => 3..20 
  property :email,      String 
  
  has n, :posts          # one to many association
  has n, :cheers          # one to many association
end

Notice that in the example above , ":required = true" is an example for Auto Validation. Unlike ActiveRecord and Sequel, DataMapper supports auto validations , i.e. these in turn call the validation helpers to enforce basic validations such as length, uniqueness, format, presence etc.

Squeel

Squeel unlocks the power of Arel in your Rails 3 application with a handy block-based syntax. You can write subqueries, access named functions provided by your RDBMS, and more, everything without writing any SQL strings. Squeel acts as an add-on to ActiveRecord and makes it easier to write queries with fewer strings. A simple example is as follows: Squeel lets you rewrite

Article.where ['created_at >= ?', 2.weeks.ago]

as

Article.where{created_at >= 2.weeks.ago}

MyBatis/iBatis

MyBatis/iBatis was a persistence framework which allowed easy access of the database from a Rails application without being a full ORM. It was created by the Apache Foundation in 2002, and is available for several platforms, including Ruby (the Ruby release is known as RBatis). On 6/16/2010, after releasing iBATIS 3.0, the project team moved from Apache to Google Code, changed the project's name to MyBatis, and stopped supporting Ruby.

The MyBatis data mapper framework makes it easier to use a relational database with object-oriented applications. MyBatis couples objects with stored procedures or SQL statements using a XML descriptor. To use the MyBatis data mapper,we make use of our own objects, XML, and SQL.

SQL statements are stored in XML files or annotations. Following is a MyBatis mapper, that consists of a Java interface with some MyBatis annotations:

package org.mybatis.example;
 
public interface BlogMapper {
    @Select("select * from Blog where id = #{id}")
    Blog selectBlog(int id);
}

The sentence is executed as follows.

BlogMapper mapper = session.getMapper(BlogMapper.class);
Blog blog = mapper.selectBlog(101);

It can also be executed using MyBatis API.

Blog blog = session.selectOne("org.mybatis.example.BlogMapper.selectBlog", 101);

SQL statements and mappings can also be externalized to an XML file like this.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
 
<mapper namespace="org.mybatis.example.BlogMapper">
    <select id="selectBlog" parameterType="int" resultType="Blog">
        select * from Blog where id = #{id}
    </select>
</mapper>

Comparison of ORM Features

Features ActiveRecord Sequel DataMapper MyBatis

Databases

MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2 ADO, DataObjects, DB2, DBI, Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL and SQLite3 SQLite, MySQL, PostgreSQL, Oracle, MongoDB, SimpleDB, many others, including CouchDB, Apache Solr, Google Data API MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2

Migrations

Yes Yes Yes, but optional Yes, provides migration by Mybatis schema migration system.

EagerLoading

Supported by scanning the SQL fragments Supported using eager (preloading) and eager_graph (joins) Strategic Eager Loading and by using :summary Yes.this can be enabled or disabled by setting the lazyLoadingEnabled flag

Flexible Overriding

No. Overriding is done using alias methods. Using methods and by calling 'super' Using methods No

Dynamic Finders

Yes. Uses 'Method Missing' No. Alternative is to use <Model>.FindOrCreate(:name=>"John") Yes. Using the dm_ar_finders plugin Similar feature is supported in the form of Dynamic SQL

NonSQL databases

Instead of ORM we can also use Object-Oriented Database Management System (OODBMS) or document-oriented database like XML, where in databases are designed to store object oriented values, thus eliminating the need to convert data to and from its SQL form. Document oriented databases also eliminates the need to retrieve objects as data rows. Query languages like XQuery can be used to retrieve data sets.

The only issue is we won't be able to create application independent queries for retrieving data without restrictions to access path. Also OODBMS limits the extent of processing SQL queries. A Relational database allows concurrent access to data, locking, indexing (fast search), as well as many other features that are not available while using XML files.

Other OODBMS (such as RavenDB) provide replication to SQL databases, as a means of addressing the need for ad-hoc queries, while preserving the increased performance and reduced complexity that may be achieved with an OODBMS for an application that has well-known query patterns.

NonSQL databases don’t provide mechanism to maintain relationship between tables. In real life though, business objects or entities do have relationship among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scene

Advantages of ORM

  • Facilitates implementing the Domain Model pattern that allows you to model entities based on real business concepts rather than based on our database structure. ORM tools provide this functionality through mapping between the logical business model and the physical storage model.
  • Huge reduction in code. Ease of use, faster development, increased productivity.
  • ORM tools provide a host of services thereby allowing developers to focus on the business logic of the application rather than repetitive CRUD (Create Read Update Delete) logic.
  • Changes to the object model are made in one place. One you update your object definitions, the ORM will automatically use the updated structure for retrievals and updates. There are no SQL Update, Delete and Insert statements strewn throughout different layers of the application that need modification.
  • Rich query capability. ORM tools provide an object oriented query language. This allows application developers to focus on the object model and not to have to be concerned with the database structure or SQL semantics. The ORM tool itself will translate the query language into the appropriate syntax for the database.
  • Navigation. You can navigate object relationships transparently. Related objects are automatically loaded as needed. For example if you load a Post and you want to access it's Comment, you can simply access Post.Comment and the ORM will take care of loading the data for you without any effort on your part.
  • Data loads are completely configurable allowing you to load the data appropriate for each scenario. For example in one scenario you might want to load a list of objects without any of it's child / related objects, while in other scenarios you can specify to load an object, with all it's children etc.
  • Concurrency support. Support for multiple users updating the same data simultaneously.
  • Cache management. Entities are cached in memory thereby reducing load on the database.
  • Transaction management and Isolation. All object changes occur scoped to a transaction. The entire transaction can either be committed or rolled back. Multiple transactions can be active in memory in the same time, and each transactions changes are isolated form on another.
  • Making data access more abstract and portable. ORM implementation classes know how to write vendor-specific SQL, so you don't have to.

Disadvantages of ORM

  • High level of abstraction obscures as to what is actually happening in the code implementation.
  • Loss in developer productivity whilst they learn to program with ORM.Developers lose understanding of what the code is actually doing - the developer is more in control using SQL.
  • Heavy reliance on ORM also leads to poorly designed databases.
  • Data and behaviour are not separated.
    • Façade needs to be built if the data model is to be made available in a distributed architecture.
    • Each ORM technology/product has a different set of APIs and porting code between them is not easy.
  • The biggest advantage of ORM is also the biggest disadvantage: queries are generated automatically -
    • Queries can’t be optimized.
    • Queries select more data than needed, things get slower, more latency.
    • Compiling queries from ORM code is slow (ORM compiler written in PHP).
    • SQL is more powerful than ORM query languages.
    • Database abstraction forbids vendor specific optimizations.

Further Reading

To get more information on ORM and the different type of ORM's available for Ruby , please look into the CSC/ECE 517 Spring 2013/ch1 1d zk wiki page.

Future Work

More work can be done in the area of comparison between the different ORMs available for Ruby, especially a more detailed feature-by-feature comparison that includes performance differences between the ORMs.Also, a detailed study of alternatives to ORM like the NoSQL databases and Document-oriented approach can be included.

References

  1. Wikipedia , Object-relational mapping (ORM)
  2. Object-relational mapping
  3. Object-Relational Impedance Matching Problem
  4. The Object-Relational Impedance Mismatch
  5. Active Record
  6. ActiveRecord Query Interface
  7. DataMapper
  8. MyBatis/iBatis
  9. Advantages of ORM
  10. Disadvantages of ORM