CSC/ECE 517 Fall 2010/ch2 2a CB
A History of Persistence
One problem that has plagued software developers for as long as persistent storage has existed is how to use that storage to persist the objects that exist in code, so that if a program or machine is restarted, nothing is lost from the user’s perspective.
Initial solutions to this problem included writing the data in objects in string form to a text file, to be read back in later by the program, which would then recreate the objects. This meant, however, that every program had its own way of writing out its data, and that whenever the program’s objects changed, a developer had to ensure that previously written files could still be read. Slightly more sophisticated methods serialized the objects into binary form on disk so that the program could read the binary back and reconstruct the objects, but this suffers from some of the same problems as writing the data as text.
Even beyond their complexity, these solutions have shortfalls, such as managing extremely large and complex datasets efficiently. For instance, if a program has 100,000 objects stored on disk and needs one of them, the developer either has to read in every object one by one until the right one is found, or has to put in enough effort to make the system intelligent enough to load only that one. Either way, the effort of creating such a system far outweighs any benefit.
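To make that cost concrete, here is a minimal Python sketch of the text-file approach. The `Person` class, the one-JSON-record-per-line layout, and the helper names `save_all` and `find_by_id` are assumptions made for illustration; they do not come from any particular program.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict


@dataclass
class Person:
    id: int
    name: str


def save_all(path, people):
    """Write every object as one JSON record per line."""
    with open(path, "w", encoding="utf-8") as f:
        for person in people:
            f.write(json.dumps(asdict(person)) + "\n")


def find_by_id(path, wanted_id):
    """Linear scan: parse records one by one until the wanted one turns up."""
    records_read = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            records_read += 1
            record = json.loads(line)
            if record["id"] == wanted_id:
                return Person(**record), records_read
    return None, records_read


# Demo: store 1,000 objects, then look up the one written last.
path = os.path.join(tempfile.mkdtemp(), "people.txt")
save_all(path, [Person(i, f"person-{i}") for i in range(1000)])
found, records_read = find_by_id(path, 999)
print(found, records_read)
```

Finding the last record means parsing every record before it, and any change to the fields of `Person` changes the file format as well.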
What is Object Relational Mapping?
Object-relational mapping (ORM) is the process of representing objects as abstract groups of data members. The members of the object have specific but generic types, such as text or number. The intent is to make it easy to translate these objects between systems that may represent the objects, and the abstract types they contain, very differently.
A prime example of this is the set of restrictions imposed by storing data in a database. Because the fields of a database table must be scalar in nature and primitive in type, the complex objects found in most object-oriented software are not simple enough to place directly in a table. One example is an object that contains an array of other objects. Because the array is potentially unbounded, it cannot be represented as a fixed set of columns in a table. So something as simple as an array in a programming language may not be trivial to translate to a database. Object-relational mapping makes the objects generic enough to pass between different systems without having to know the eccentricities of each system.
The ActiveRecord Design Pattern
ActiveRecord, first put forth by Martin Fowler in his book Patterns of Enterprise Application Architecture, utilizes the concept of object-relational mapping to persist objects of a program in a database. This is accomplished by creating a table that will represent an object with the columns of that table representing the member variables of the object. Every object can then be written to and read from a single row inside of the database table allowing them to be persisted.
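The idea of one table per class and one row per object can be sketched in a few lines of Python using the built-in sqlite3 module. This is a hand-rolled illustration and not any library's API; the `people` table, the `Person` class, and the `save` and `find` method names are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)


class Person:
    """One instance corresponds to one row of the people table."""

    def __init__(self, first_name, last_name, id=None):
        self.id = id
        self.first_name = first_name
        self.last_name = last_name

    def save(self):
        """Insert the object as a new row, or update its existing row."""
        if self.id is None:
            cur = conn.execute(
                "INSERT INTO people (first_name, last_name) VALUES (?, ?)",
                (self.first_name, self.last_name),
            )
            self.id = cur.lastrowid
        else:
            conn.execute(
                "UPDATE people SET first_name = ?, last_name = ? WHERE id = ?",
                (self.first_name, self.last_name, self.id),
            )
        conn.commit()

    @classmethod
    def find(cls, id):
        """Rebuild an object from the row with the given primary key."""
        row = conn.execute(
            "SELECT first_name, last_name, id FROM people WHERE id = ?", (id,)
        ).fetchone()
        return cls(*row) if row else None


p = Person("John", "Doe")
p.save()              # first save inserts a row
p.last_name = "Smith"
p.save()              # second save updates the same row
loaded = Person.find(p.id)
print(loaded.first_name, loaded.last_name)
```

The calling code only touches `Person` objects; the SQL lives inside the class, which is the part a real implementation generates or hides.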
Because modern databases are designed to store large sets of data and access them efficiently, they offer the perfect storage medium for program objects. Also, because the code and the database used for persistence are separate, the developer can switch the database being used without changing any of the main program logic. This switch could be from a test database to a production database of the same kind, or it could even be between databases of completely different types (for example, from Oracle to Microsoft SQL Server).
The goal of the pattern is for the developer to use objects as they normally would, without having to worry about what SQL needs to be written to perform the basic create, read, update, and delete (CRUD) operations on those objects. However, as the following language extensions show, implementations of the pattern require varying levels of involvement to accomplish the same essential behavior.
Under the covers, an ActiveRecord implementation has to do many interesting things to ensure that all objects can indeed be translated into database rows. One example is the arrays mentioned earlier, which can be represented with a foreign key: the elements are stored as rows of another table, each row carrying a foreign key that points back to the owning row, which allows for an unbounded number of elements. Likewise, pointers between objects in memory are no longer valid once the objects have been stored and then retrieved after the program or machine has restarted. Thankfully, these can also be represented as foreign keys to the table that holds the object being pointed to.
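A minimal sketch of the array case, written in Python against the built-in sqlite3 module. The `albums` and `tracks` tables and the helper names `save_album` and `load_tracks` are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE albums (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tracks (
        id INTEGER PRIMARY KEY,
        album_id INTEGER REFERENCES albums(id),  -- foreign key to the owner
        title TEXT
    );
    """
)


def save_album(title, track_titles):
    """Store an album and its list of tracks; return the new album id."""
    album_id = conn.execute(
        "INSERT INTO albums (title) VALUES (?)", (title,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO tracks (album_id, title) VALUES (?, ?)",
        [(album_id, t) for t in track_titles],
    )
    conn.commit()
    return album_id


def load_tracks(album_id):
    """Rebuild the in-memory list by following the foreign key."""
    rows = conn.execute(
        "SELECT title FROM tracks WHERE album_id = ? ORDER BY id", (album_id,)
    )
    return [title for (title,) in rows]


album_id = save_album("Black and Blue", ["Hot Stuff", "Hand Of Fate", "Fool To Cry"])
print(load_tracks(album_id))
```

The album row never grows; adding another track only adds another row to `tracks`, so the list can be as long as needed.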
At first glance, it may seem that ActiveRecord is more trouble than it’s worth to develop, especially when you consider that the SQL code must be tailored for each of the possible different backend databases. However, for any developer who has had to write custom code to persist objects, it takes only a few lines of ActiveRecord code to realize what a huge benefit is gained having a third party tool managing your persistence.
ActiveRecord in Ruby on Rails
The ActiveRecord package for Ruby (which comes with the default installation of Ruby on Rails) is by far the most popular implementation of the ActiveRecord pattern available for Ruby. This implementation adds two important object-oriented concepts that aren’t covered by the base ActiveRecord pattern: inheritance and associations. These two additions allow a developer to define the true relationships between objects more easily, and save the developer from having to manage foreign keys by hand.
The following is an example of using ActiveRecord in Ruby.
First a connection to the database must be specified:
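require 'active_record'

# Older ActiveRecord releases used :dbfile for SQLite; :database is the current key
ActiveRecord::Base.establish_connection(
  :adapter  => "sqlite3",
  :database => ":memory:"
)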
Then the structure of the tables which will contain the objects must be established:
ActiveRecord::Schema.define do
  create_table :albums do |table|
    table.column :title, :string
    table.column :artist, :string
  end

  create_table :tracks do |table|
    table.column :album_id, :integer
    table.column :track_number, :integer
    table.column :title, :string
  end
end
Associations can then be defined for the classes, thanks to the extra functionality added to ActiveRecord:
class Album < ActiveRecord::Base
  has_many :tracks
end

class Track < ActiveRecord::Base
  belongs_to :album
end
After this the code can then manipulate the objects utilizing the ActiveRecord functionality:
album = Album.create(:title => 'Black and Blue', :artist => 'The Rolling Stones')
album.tracks.create(:track_number => 1, :title => 'Hot Stuff')
album.tracks.create(:track_number => 2, :title => 'Hand Of Fate')
album.tracks.create(:track_number => 3, :title => 'Cherry Oh Baby')
album.tracks.create(:track_number => 4, :title => 'Memory Motel')
album.tracks.create(:track_number => 5, :title => 'Hey Negrita')
album.tracks.create(:track_number => 6, :title => 'Fool To Cry')
album.tracks.create(:track_number => 7, :title => 'Crazy Mama')

# Number of tracks on the album, found through the has_many association
puts Album.find(1).tracks.length
# Dynamic finders look rows up by column value
puts Album.find_by_title('Black and Blue').title
puts Track.find_by_title('Fool To Cry').album_id
The main thing to notice in the code snippets above is that there is not a single line of SQL. Apart from a small amount of setup, all a developer needs to do is instantiate objects as they normally would and use them. Among ActiveRecord implementations, Ruby’s is considered to be very well integrated into the language itself, meaning that the code a developer writes with ActiveRecord isn’t considerably different from the code they would write without it. Below we’ll look at ORM solutions for Java, Python, and JavaScript and use ActiveRecord in Ruby as a benchmark for ease of use and language integration.
Java's Hibernate
Java is one of the most popular programming languages in existence today, so it follows that there are many ORM options for the language. One of the more popular is Hibernate. Compared to the other languages presented here, Java is arguably the most involved from a developer’s perspective: because it is statically typed, compiled to bytecode, and in general written at a lower level, it is not always as straightforward and forgiving as the others. Keeping this in mind, we will try to gauge whether the effort to set up and use Hibernate in a Java environment follows the same pattern.
One of the first things used in any Hibernate interaction is a Session Factory class, which is designed as a class factory that returns the current Hibernate session object and ensures that only one instance of the session is issued per thread. Without listing the actual code of the Session Factory, it essentially allows for getting the session object, opening the session for use, and closing the session after use.
In the approach followed here, unlike in some of the other languages, the tables inside the database are created by hand outside of Hibernate (Hibernate can also generate the schema from the mappings through its hbm2ddl.auto setting, which is not covered here). This can be done with a dedicated SQL tool or with Java code that executes SQL directly. It is a prime example of the additional knowledge a developer must have to correctly create a Hibernate project: each object type must have its own table, and the fields within the table must correspond to the types of the object’s member variables.
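The one-session-per-thread behavior can be sketched independently of Hibernate. The following Python sketch is an illustration of the idea only: the `Session` stand-in class, the `SessionFactoryHelper` name, and its methods are hypothetical and are not Hibernate's API.

```python
import threading


class Session:
    """Stand-in for a database session (hypothetical, not Hibernate's class)."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


class SessionFactoryHelper:
    """Issues at most one open session per thread."""

    _local = threading.local()  # each thread sees its own attributes

    @classmethod
    def get_current_session(cls):
        session = getattr(cls._local, "session", None)
        if session is None or not session.is_open:
            session = Session()
            cls._local.session = session
        return session

    @classmethod
    def close_session(cls):
        session = getattr(cls._local, "session", None)
        if session is not None:
            session.close()
            cls._local.session = None


# The same thread always gets the same session back...
main_session = SessionFactoryHelper.get_current_session()
same_thread = SessionFactoryHelper.get_current_session() is main_session

# ...while another thread gets its own.
other = []
worker = threading.Thread(
    target=lambda: other.append(SessionFactoryHelper.get_current_session())
)
worker.start()
worker.join()
print(same_thread, other[0] is main_session)
```

Keeping the session in thread-local storage means two threads never share a session, and code within one thread does not need to pass the session around explicitly.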
Based on the tables created, classes must then be created with the member variables and setters/getters associated with the field types created in the database. For example, if the database table contains a field 'name' which is of type varchar and limited to 255 characters, the setter/getter must be of type String and the coder (if they wish) may limit the entry in the setter to 255 characters.
The next step is to create a Hibernate configuration file, which essentially defines the database connection, including the URL, username, password, database type, and so on. After this, the developer must explicitly tell Hibernate how the member variables of the class relate to the fields of the database, using either an XML file or Java annotations inside the class. This is yet another example of the extra effort required of the developer, but it also gives the developer the power and flexibility to give the fields and the member variables different names.
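What such an explicit mapping buys can be sketched in Python with the built-in sqlite3 module. This is not Hibernate's mapping format; the `PERSON_MAPPING` dictionary, the `FNAME` and `LNAME` column names, and the `insert` helper are assumptions made for illustration.

```python
import sqlite3

# Explicit mapping: attribute name on the class -> column name in the table.
PERSON_MAPPING = {
    "first_name": "FNAME",
    "last_name": "LNAME",
}


class Person:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name


def insert(conn, table, mapping, obj):
    """Build an INSERT statement from the mapping and run it."""
    columns = ", ".join(mapping.values())
    placeholders = ", ".join("?" for _ in mapping)
    values = [getattr(obj, attr) for attr in mapping]
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    conn.execute(sql, values)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PERSON (FNAME VARCHAR(255), LNAME VARCHAR(255))")
insert(conn, "PERSON", PERSON_MAPPING, Person("John", "Doe"))
row = conn.execute("SELECT FNAME, LNAME FROM PERSON").fetchone()
print(row)
```

Because the names are joined only through the mapping, the class can keep readable attribute names even when the table follows a different naming scheme.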
Finally code can be written to create objects and write them to the database using the session object. The following snippet is an example of using the session object to write an object to the database.
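Person p = new Person();
p.setFirstName("John");
p.setLastName("Doe");

Transaction tx = null;
// SessionFactory is the session factory class described above
Session s1 = SessionFactory.getInstance().getCurrentSession();
try {
    tx = s1.beginTransaction();
    s1.save(p);
    tx.commit();
} catch (RuntimeException e) {
    // Undo the partial write if a transaction was started, then report the failure
    if (tx != null) {
        tx.rollback();
    }
    throw e;
}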
There is no question that the set up effort involved in getting a Hibernate project going is substantially more than in ActiveRecord for Rails. The actual code to create and manipulate the objects is approximately the same between the two languages. This is clearly a case where more power is given to the developer but it is not without cost, which in this case is a fairly complex and involved list of set up steps.
Python's Django
Python presents an interesting case because it is extremely similar to Ruby in syntax and somewhat similar in philosophy. The most popular ORM extension in use for Python is part of Django, an MVC web framework built for Python (it could be considered the equivalent of Rails for Ruby). Much like Rails includes an ActiveRecord extension, Django includes built-in ORM support as well.
Following the Django programming syntax, each model class must subclass the ‘django.db.models.Model’ class. Each class attribute then represents a field in the database. Because Python is dynamically typed, Django provides field types in the ‘models’ module for specifying the type that needs to be created in the database. The following code is an example of such a class:
from django.db import models

class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
This code would then create a table of people objects where the first and last names would be columns in the table with some kind of varchar type with a length limit of 30. In addition, Django creates a field named ‘id’ which acts as the primary key for the table and is auto-incremented with each row. It is easy to override using this column as the primary key, but careful consideration should be taken before doing so.
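For orientation, the table described above can be approximated by hand with Python's built-in sqlite3 module. The statement below is a rough SQLite equivalent and not Django's exact output; the table name `myapp_person` assumes the model lives in an application called `myapp`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE myapp_person (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- the automatically added key
        first_name VARCHAR(30) NOT NULL,
        last_name VARCHAR(30) NOT NULL
    )
    """
)

# No id is supplied; the database assigns 1, 2, ... as rows are inserted.
conn.executemany(
    "INSERT INTO myapp_person (first_name, last_name) VALUES (?, ?)",
    [("John", "Doe"), ("Jane", "Roe")],
)
rows = conn.execute(
    "SELECT id, first_name, last_name FROM myapp_person ORDER BY id"
).fetchall()
print(rows)
```

With the ORM, none of this SQL is written by the developer; it is derived from the two `CharField` declarations on the model.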
Creating a ‘Person’ object can then be done using the following code:
p = Person(first_name="John", last_name="Doe")
p.save()
p.first_name  # returns the first name field, "John"
Like ActiveRecord in Ruby, Django also provides ways to create relationships among models. Creating these relationships in Django isn’t quite as well integrated as in Ruby, because you have to specify explicitly that a field of one model is a foreign key to another model, which somewhat exposes the database nature of the persistence.
To query the database for objects, you can use the ‘objects’ attribute of the model class, which gives access to all the objects stored in that model’s table. The following example finds all the ‘Person’ objects with a first_name of John.
people = Person.objects.filter(first_name="John")
The ORM built into Django and ActiveRecord for Ruby are very similar to one another, and both make it easy for developers to persist their program objects. Django doesn’t integrate associations and inheritance as well as ActiveRecord, but everything else is extremely similar.
JazzRecord for JavaScript
JavaScript presents another good case study for the use of ORM and ActiveRecord in application development. For one, JavaScript, like Python and Ruby, is dynamically typed, which can make dealing with the strictness of database typing an interesting challenge for these languages. Secondly, JazzRecord was modeled specifically after ActiveRecord in Ruby but does have its own differences.
Declaring a model in JazzRecord is structurally very similar to ActiveRecord: the type of each field must be explicitly identified so that the correct table can be created.
var Songs = new JazzRecord.Model({
  table: "songs",
  columns: {
    title: "text",
    artist: "text"
  }
});
Like ActiveRecord, retrieving objects from the database is effortless and very transparent. Here is an example of querying a few objects and then using those objects.
var song1 = Songs.findBy("title", "Purple Haze");
var song2 = Songs.findBy("title", "Stairway to Heaven");

// Loaded records behave like ordinary objects
var sameArtist = (song1.artist == song2.artist);

// Change a field and write the record back
song2.artist = "NewArtist";
song2.save();
The JazzRecord team even decided to follow ActiveRecord in including the ability to identify associations between the different models. The following is an example of how to define associations between two models:
// new version of JazzLegend model
var Artist = new JazzRecord.Model({
  table: "artists",
  foreignKey: "artist_id",
  hasMany: {albums: "albums"},
  columns: {
    name: "text",
    genre: "text"
  }
});

// associated model
var Album = new JazzRecord.Model({
  table: "albums",
  belongsTo: {artist: "artists"},
  columns: {
    title: "text",
    year: "text",
    artist_id: "number"
  }
});
Conclusion
The good news for developers is that virtually every popular programming language has numerous ORM extensions and tools to accompany it. For the most part, the complexity of these tools and their integration into the language depend on the style of the language itself. Hibernate, for example, is more complicated to set up, but rewards the skilled Java programmer with more control and a more exposed approach to ORM. ActiveRecord for Rails, on the other hand, is intuitive and doesn’t require extensive effort to get something working, but achieves this at the cost of sometimes limited control. Ultimately, it can’t be said that any one of these extensions is ‘better’ than the others; it really depends on the developer’s style and requirements. Regardless of the developer’s final decision, however, there is no doubt that using ORM and ActiveRecord to persist the objects in an application is an easy solution to the otherwise difficult problem of object persistence.
References
- Simple Ruby ActiveRecord Example. http://snippets.dzone.com/posts/show/3097 December 6, 2006
- Active Record — Object-relation mapping put on rails. http://ar.rubyonrails.org/ June 27, 2008
- Django ORM Documentation. http://docs.djangoproject.com/en/dev/topics/db/models/ September 13, 2010
- First Hibernate Example. http://www.laliluna.de/articles/first-hibernate-example-tutorial.html February 9, 2008
- Active Record Pattern. http://en.wikipedia.org/wiki/Active_record_pattern September 18, 2010
- Hibernate(Java). http://en.wikipedia.org/wiki/Hibernate_(Java) September 13, 2010
- Django(Web Framework). http://en.wikipedia.org/wiki/Django_(web_framework) September 15, 2010
- Demo JazzRecord. http://www.jazzrecord.org/demo/ 2010