CSC/ECE 517 Fall 2007/wiki2 2 22

Object-relational mapping. Ruby's ActiveRecord is one attempt to allow an object-oriented program to use a relational database. The Crossing Chasms pattern is another. Look up several approaches to mapping relational databases to o-o programs, include hyperlinks to all of them, and explain how they differ. Report on the strengths of various approaches (making sure to credit the authors for their insights).

Introduction

What is ORM?

Object Relational Mapping (ORM) is a technique used in software systems that use an object-oriented language in their application domain and a relational database to store persistent data from the objects they use. The mapping is required because the two sides represent data differently: the database uses tables and foreign keys to represent data and relationships, whereas the application uses objects and references and/or pointers. For some time, proponents of Object-Oriented Database Management Systems (ODBMS) have argued that this is similar to breaking a car up into its pieces before storing it in the garage and putting it together again after taking it out, whereas in an object database you store or retrieve the whole car at once.

How does ORM work?

Typical ORM systems map one row in a database table to one object in the application and vice versa. In static languages the mapping is done manually, or it is automated with a preprocessor or with external mapping metadata (e.g. the XML mapping files used by Hibernate for Java). In languages that support metaprogramming at runtime or compile time, the mapping can be done automatically from information gathered from the database schema.

ORM systems typically provide the programmer with object factories that map class method calls to SQL queries that retrieve objects from the database. They also provide methods to store the object data back in the database after updates in the application. This ability of an ORM to allow the programmer to interact with the database by manipulating objects in the programming language is referred to as transparent persistence, because the database access is seamless to the programmer.
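The row-to-object mapping and the object factory described above can be sketched in a few lines of Python using the standard sqlite3 module. This is only a hand-written illustration, not the API of any ORM listed on this page: the cars table, the Car class and the CarMapper factory are assumptions made up for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Car:
    """Plain application object; one instance corresponds to one table row."""
    id: Optional[int]
    make: str
    year: int


class CarMapper:
    """Hypothetical object factory that turns method calls into SQL."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cars "
            "(id INTEGER PRIMARY KEY, make TEXT, year INTEGER)"
        )

    def find(self, car_id: int) -> Optional[Car]:
        # A lookup call becomes a SELECT; the returned row becomes an object.
        row = self.conn.execute(
            "SELECT id, make, year FROM cars WHERE id = ?", (car_id,)
        ).fetchone()
        return Car(*row) if row else None

    def save(self, car: Car) -> Car:
        # New objects are INSERTed; objects that already have an id are UPDATEd.
        if car.id is None:
            cur = self.conn.execute(
                "INSERT INTO cars (make, year) VALUES (?, ?)", (car.make, car.year)
            )
            car.id = cur.lastrowid
        else:
            self.conn.execute(
                "UPDATE cars SET make = ?, year = ? WHERE id = ?",
                (car.make, car.year, car.id),
            )
        self.conn.commit()
        return car


conn = sqlite3.connect(":memory:")
mapper = CarMapper(conn)
car = mapper.save(Car(None, "Saab", 1999))  # stored as a new row
car.year = 2001                             # changed in the application only
mapper.save(car)                            # written back to the same row
reloaded = mapper.find(car.id)              # rebuilt from the database
```

The calling code never writes SQL itself, which is the sense in which the persistence is transparent to the programmer.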

The animation in the following link illustrates how an ORM system works:

Transparent persistence in object-relational mapping

Design Patterns addressing ORM

Active Record

The Active Record pattern describes the most basic mapping of one object per table row, or as described by its author Martin Fowler: "An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data." [1]

Active Record is part of the pattern language presented by Fowler in the book Patterns of Enterprise Application Architecture. A summary of the other patterns in the language can be found here.

A well-known implementation of the pattern is found in the Rails framework [2].
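Fowler's three ingredients (wrapping a row, encapsulating the database access, and adding domain logic) can be illustrated with a minimal Python sketch. It is not Rails code and does not reproduce any library's API; the accounts table, the Account class and the withdraw rule are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
)


class Account:
    """Wraps one row of the hypothetical accounts table."""

    def __init__(self, owner, balance=0, id=None):
        self.id = id
        self.owner = owner
        self.balance = balance

    # Database access lives on the object itself.
    @classmethod
    def find(cls, account_id):
        row = conn.execute(
            "SELECT owner, balance, id FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        return cls(*row) if row else None

    def save(self):
        if self.id is None:
            cur = conn.execute(
                "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
                (self.owner, self.balance),
            )
            self.id = cur.lastrowid
        else:
            conn.execute(
                "UPDATE accounts SET owner = ?, balance = ? WHERE id = ?",
                (self.owner, self.balance, self.id),
            )
        conn.commit()

    # Domain logic sits next to the persistence code.
    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


account = Account("alice", balance=100)
account.save()
account.withdraw(30)
account.save()
```

Unlike a design with a separate mapper object, the same class here carries both the business rule and the SQL, which is what makes the pattern simple and also what ties the class closely to its table.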

Crossing Chasms

Crossing Chasms is a large pattern language that as a whole addresses the issues faced by teams trying to build large client-server systems using Object Technology and Relational data stores.[3] The authors, Kyle Brown and Bruce Whitenack, originally presented the language at the Second Pattern Languages of Program Design conference [PLoPD]. The proceedings of the conference were published as the book Pattern Languages of Program Design 2.

The pattern language includes patterns that deal with ORM from the architectural and behavioral points of view.

Currently there is little information available online. The best articles still available are:

ORM implementations

An extensive list of ORM systems divided by language targeted can be found here. A shorter representative list follows.

Rails Active Record

The Rails Active Record library implements Active Record and related patterns as the foundation of an elegant and succinct ORM system. Using the metaprogramming facilities available in Ruby, the system gathers information from the database and generates class methods automatically from the database schema.

Hibernate

Hibernate is a Java ORM and query service [4]. The current version of the system can be used through different APIs:

  • Native. The user supplies XML metadata to help the Hibernate subsystem map table data and relationships to application classes and objects.
  • Java Persistence API. Hibernate implements the Java Persistence API [5] which defines a standard ORM interface for Java.

Hibernate supports inheritance, polymorphism, composition and the Java collections framework. It uses a dual-layer cache architecture to optimize performance.

Oracle TopLink

Oracle TopLink [6] is a proprietary ORM system developed by Oracle. It supports the Java Persistence API and provides extensions [7] for mapping, caching, logging, etc.

Apache ObJectRelationalBridge - OJB

OJB [8] allows Java programmers to store Java objects in, and retrieve them from, any JDBC-compliant RDBMS. It supports polymorphism and extents: interface types and abstract classes can be used as attribute types in persistent classes. Queries are also aware of extents: a query against a base class or interface will return matches from derived classes, even if they are mapped to different database tables. OJB uses an XML-based object/relational mapping.
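The idea of an extent-aware query can be sketched in Python as follows. This is an illustration of the concept only; it is not OJB code and does not reflect OJB's API or its XML mapping format. The Vehicle, Car and Truck classes, their tables, and the where_name_like method are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars   (id INTEGER PRIMARY KEY, name TEXT, doors INTEGER);
    CREATE TABLE trucks (id INTEGER PRIMARY KEY, name TEXT, payload_kg INTEGER);
    INSERT INTO cars   (name, doors)      VALUES ('hatchback', 5);
    INSERT INTO trucks (name, payload_kg) VALUES ('pickup', 900);
""")


class Vehicle:
    """Base class with no table of its own."""
    table = None
    columns = ()

    def __init__(self, **fields):
        self.__dict__.update(fields)

    @classmethod
    def extent(cls):
        """Yield this class (if it is mapped) and every mapped subclass."""
        if cls.table:
            yield cls
        for sub in cls.__subclasses__():
            yield from sub.extent()

    @classmethod
    def where_name_like(cls, pattern):
        # One SELECT per mapped class in the extent; each row is turned
        # into an instance of the class that owns the table.
        results = []
        for klass in cls.extent():
            cols = ", ".join(klass.columns)
            sql = f"SELECT {cols} FROM {klass.table} WHERE name LIKE ?"
            for row in conn.execute(sql, (pattern,)):
                results.append(klass(**dict(zip(klass.columns, row))))
        return results


class Car(Vehicle):
    table = "cars"
    columns = ("id", "name", "doors")


class Truck(Vehicle):
    table = "trucks"
    columns = ("id", "name", "payload_kg")


found = Vehicle.where_name_like("%")   # searches both tables
only_cars = Car.where_name_like("%")   # searches the cars table only
```

A query on the base class returns a mix of Car and Truck instances, while a query on a derived class stays within that class's own table.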

Database Template Library

The DTL [9] is a C++ ORM system that aims to provide a database abstraction similar to the Standard Template Library. The mapping is performed using macros and template metaprogramming.

SQLObject

SQLObject [10] is an ORM for Python. Like the Ruby on Rails Active Record library, it takes advantage of the reflection and dynamic characteristics of Python to generate class structures from the database schema.
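How a dynamic language can build a class from a database schema can be shown with a small sketch based on Python's standard sqlite3 module. It deliberately does not use SQLObject and should not be read as SQLObject's API; the model_from_table helper, the people table and the generated find_by_ methods are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO people (name, email) VALUES ('Ada', 'ada@example.org')")


def model_from_table(table_name):
    """Build a class whose attributes and finders come from the table's columns.

    The table name is interpolated into SQL, so it must come from trusted code.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table_name})")]

    def __init__(self, **values):
        for column in columns:
            setattr(self, column, values.get(column))

    def make_finder(column):
        def finder(cls, value):
            sql = f"SELECT * FROM {table_name} WHERE {column} = ?"
            return [cls(**dict(zip(columns, row))) for row in conn.execute(sql, (value,))]
        return classmethod(finder)

    namespace = {"__init__": __init__, "columns": columns}
    for column in columns:
        # One finder per column, e.g. find_by_name and find_by_email.
        namespace[f"find_by_{column}"] = make_finder(column)
    return type(table_name.capitalize(), (object,), namespace)


People = model_from_table("people")
ada = People.find_by_name("Ada")[0]
```

No column names appear in the class definition; adding a column to the table would give the generated class a new attribute and a new finder the next time it is built.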

Comparing ORM systems

The major difference between the pattern languages and the specific implementations is that the former attempt to address other issues besides how to map the data to objects and vice versa. For example, Crossing Chasms includes a Table Design pattern that addresses the question of when to design the database schema. This is an important consideration, especially for systems that will evolve, since it is harder to make big changes to the database schema once the system is in use. The pattern solves the problem by requiring a first-pass behavioral model of the objects before the tables are designed. This makes sense because in a new system we want to store the state of our objects, not adapt our objects to a specific database schema.

Configuration is another difference between systems. While implementations for languages with metaprogramming capabilities need only trivial configuration, others require extensive configuration. This is an important consideration with regard to maintainability and learning curve. As far as flexibility and ease of use are concerned, the ORM implementations for dynamic languages clearly win.

Despite the shortcomings of some implementations, an ORM system goes a long way toward making application programming simpler and database agnostic.

Criteria for Selection

None of the ORM implementations is perfect for all jobs, so the most important step when searching for the right tool is to define precisely which criteria are essential. The following is a summary of criteria that can be considered when selecting an ORM system:

ORM specific criteria

Basic features:

  • Be able to use inheritance, create hierarchies between entities, and use polymorphism (we are using objects!). The tools can support a variety of combinations for tables and classes to allow these mechanisms.
  • Handle any type of relations (1-1, 1-n, n-n)
  • Support for transactions
  • Aggregates (equivalent to SQL's SUM, AVG, MIN, MAX, COUNT)
  • Support for grouping (SQL's GROUP BY)
  • Supported databases. A big advantage of mapping tools is that they provide an abstraction of the underlying database engine. Most of them allow switching easily between RDBMSs

Flexibility

  • Customization of queries. HQL is a strong point of Hibernate/NHibernate. It can also be desirable for the tool to build a mapping dynamically from developer-provided SQL queries.
  • Support any type of SQL joins (inner join, outer join)
  • Be able to map a single object to data coming from multiple tables (joins, views), and be able to dispatch the data from a single table to multiple objects.

Performance

  • Lazy loading - the loading of some data is deferred until it is needed. This applies to data reached through relations and to some columns.
  • Cache dynamically generated queries, and cache some data to avoid too many calls to the data source.
  • Handle cascade updates. Deleting a master record should delete the linked details if so desired.
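Lazy loading of a relation, combined with keeping the loaded data on the object, can be sketched as follows. The example uses Python's standard sqlite3 module and is not taken from any of the tools above; the orders and items tables, the Order class, and the query_log list (present only to make the deferred query visible) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items  (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders (id, customer) VALUES (1, 'alice');
    INSERT INTO items (order_id, sku) VALUES (1, 'A-1');
    INSERT INTO items (order_id, sku) VALUES (1, 'B-2');
""")

query_log = []  # records every SQL statement issued, for demonstration only


def run(sql, params=()):
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


class Order:
    def __init__(self, order_id, customer):
        self.id = order_id
        self.customer = customer
        self._items = None  # related rows are not loaded yet

    @classmethod
    def find(cls, order_id):
        rows = run("SELECT id, customer FROM orders WHERE id = ?", (order_id,))
        return cls(*rows[0]) if rows else None

    @property
    def items(self):
        # Lazy loading: the related rows are fetched on first access only,
        # then kept on the object so later accesses do not hit the database.
        if self._items is None:
            rows = run(
                "SELECT sku FROM items WHERE order_id = ? ORDER BY id", (self.id,)
            )
            self._items = [sku for (sku,) in rows]
        return self._items


order = Order.find(1)
queries_before = len(query_log)  # only the orders query has run so far
skus = order.items               # first access triggers the items query
skus_again = order.items         # second access reuses the loaded list
```

The trade-off is that code which walks many objects and touches each relation will issue one extra query per object, so tools usually offer a way to load relations eagerly as well.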

Maintainability

  • What happens if the database schema changes? If I need to add a new collection?
  • Possibility to move to a new mapping tool
  • Serialization. Serialization can be used to persist data outside of the database. It can produce a binary format or, more importantly, XML.

Non ORM specific criteria

  • Price
  • Performance
  • Resource consumption (memory)
  • Scalability
  • Complexity (or simplicity...)
  • Ease of use, time to be up and running
  • Documentation quality, support, forums, community
  • Maturity, frequency of updates, bug fixes, evolutions
  • Vendor's reputation and stability
  • Support for multiple platforms

Further reading

An excellent introduction to ORM can be found at http://www.chimu.com/publications/objectRelational/. A PDF version can be downloaded from http://www.chimu.com/publications/objectRelational/objectRelational.pdf.

References

  1. http://ar.rubyonrails.com/
  2. http://www.ksc.com/article5.htm
  3. http://www.hibernate.org
  4. http://www.oracle.com/technology/products/ias/toplink/index.html
  5. http://java.sun.com/jdo
  6. http://java.sun.com/javaee/technologies/persistence.jsp
  7. http://db.apache.org/ojb/
  8. http://en.wikipedia.org/wiki/Object-relational_mapping
  9. http://c2.com/cgi-bin/wiki?ObjectRelationalToolComparison
  10. http://en.wikipedia.org/wiki/List_of_object-relational_mapping_software