Patterns for mapping objects to relational databases

The Object-Relational Mismatch

The object-oriented model and the relational model are different paradigms.[1] The object paradigm is based on building applications out of objects that have both data and behavior, whereas the relational paradigm is based on storing data. With the object paradigm, one traverses objects via their relationships, whereas with the relational paradigm one duplicates data (key values) so that rows in different tables can be joined.[2] The difference between how object models and relational databases work is known as the “Object-Relational Impedance Mismatch”. This is just nerd-speak for “they are different”. A useful analogy is a child's shape-sorting toy: you can’t fit a square block in a triangular hole.[10]
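The contrast between traversing relationships and joining rows can be made concrete with a small sketch. The example below is illustrative only: the Patient and Address classes, the table and column names, and the sample data are assumptions chosen for the example, and Python's built-in sqlite3 module stands in for a relational database.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Address:
    city: str


@dataclass
class Patient:
    name: str
    addresses: list = field(default_factory=list)


# Object paradigm: follow references from one object to the next.
patient = Patient("Ada", [Address("London"), Address("Paris")])
cities_from_objects = [a.city for a in patient.addresses]

# Relational paradigm: the patient's key is duplicated in the address
# table, and a join matches rows on that duplicated value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE address ("
    "id INTEGER PRIMARY KEY, patient_id INTEGER, city TEXT)"
)
conn.execute("INSERT INTO patient (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO address (patient_id, city) VALUES (?, ?)",
    [(1, "London"), (1, "Paris")],
)
rows = conn.execute(
    "SELECT a.city FROM patient p "
    "JOIN address a ON a.patient_id = p.id "
    "WHERE p.name = ? ORDER BY a.id",
    ("Ada",),
).fetchall()
cities_from_rows = [row[0] for row in rows]
```

Both approaches yield the same list of cities, but the object version never mentions a key, while the relational version works only through the duplicated patient_id value.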

One way to avoid the impedance mismatch between objects and relations is to use an object-oriented database. However, systems often need to store objects in a relational database. Sometimes a system needs relational calculus or the maturity of a relational database. Other times corporate policy is to use a relational database rather than an object-oriented database. Whatever the reason, a system that stores objects in a relational database needs a design that reduces the impedance mismatch.[5] One of the secrets of success for mapping objects to relational databases is to understand both paradigms and their differences, and then to make intelligent trade-offs based on that knowledge.[2] An application that maps between the two paradigms needs to be designed with respect to performance, maintainability, and cost, to name just a few requirements.[4]

Persistence frameworks were designed to reduce the amount of work needed to develop an object-oriented system that stores data in a relational database. For this reason, this area of framework design is often referred to as object-relational mapping, or ORM (see the figure below). The goal of a persistence framework is to map objects to and from a database automatically and seamlessly, so that information can be quickly and easily retrieved in object format and subsequently stored in its relational structure. This greatly aids system development and maintenance. The mundane details of writing one's own create, read, update, and delete functions are taken care of, so one can concentrate on designing and building the application.[11]

Figure: object-relational mapping with a persistence framework.[11]

This page describes patterns for mapping objects to relations.

Persistence Layer

Also Known As: Relational Database Access Layer

Motivation: If you build a large object-oriented business system that stores objects in a relational database, you can spend a lot of time dealing with the problems of making your objects persistent. If you aren't careful, every programmer working on the system has to know SQL and the code can become tied to the database. It can be a lot of work to convert your system from using Microsoft Access to using DB2, or even to add a few variables to an object. You need to separate your domain knowledge from knowledge of how objects are stored in a database to protect developers from these types of changes.

Problem: How do you save objects in a non-object-oriented storage mechanism such as a relational database? Developers should not have to know the exact implementation.

Forces:
· Writing SQL code is easy for programmers who are familiar with databases.
· Designing a good persistence mechanism takes time and is not directly related to providing features to users.
· Database access is expensive and often needs to be optimized.
· Developers should be able to solve the domain problem of an application without worrying about how values are saved to and loaded from the database.
· Having a common interface with template methods makes for better code reuse.
· Using a single interface will force all classes to the lowest common denominator.
· The type of persistent storage may change over the life of an application.
· The domain model will probably change frequently over the life of an application.

Solution: Provide a Persistence Layer that can populate objects from a data storage source and can save their data back to the data storage source. This layer should hide the details of storing objects from the developer. This is really a special case of building a Layer [BMRSS 96] to protect yourself from changes. All persistent objects use the standard interface of the Persistence Layer.
If the data storage mechanism changes, only the Persistence Layer needs to be changed. For example, the corporate direction may be to start using Oracle rather than DB2 and then switch in midstream. The system needs to know how to store each object and how to load it. Sometimes an object is stored in multiple databases over multiple media. An object that is part of a more complex object needs to keep track of the object of which it is a part; this is called the owning object. The owning object makes it easier to write complex queries. Therefore it is important for the Persistence Layer to provide a means to uniquely identify each object and its parent. The unique identifier and parent identifier are also useful if you implement a Proxy [GHJV 95] pattern as a placeholder for component objects.

To summarize using the pattern names, the Persistence Layer provides the methods for the CRUD operations by building up the SQL Code, providing the Attribute Mapping Methods, Type Converting the object's data values, accessing the Table Manager, providing access to a Transaction Manager, and connecting to the database through the Connection Manager. The Persistence Layer also helps provide proper Change Management and collaborates with the OID Manager to provide unique object identifiers.

There are many ways to implement a Persistence Layer. Here are a few of them.

1. Use an Object Layer [Keller 98]. Subclass each domain object from an abstract PersistentObject. Each object then inherits the ability to perform the necessary CRUD operations. This is the choice the example uses, and its primary benefit is that it is easy to implement. Although it requires a little database-specific code in each domain class, this code is segregated and is easy to find and change. It allows optimization when necessary, though a heavily optimized system can be hard to understand.

2. Use a Broker that can read or write domain objects to or from the database.
The broker must know the format of each domain object, and it generates the SQL to read or write it. This choice separates the database code from the domain object class. It is the solution that scales best, though it requires a lot of infrastructure.

3. Compose each domain object from a set of data objects that have a one-to-one mapping to the database tables. Whenever a domain object changes, it changes the corresponding data object, which is then saved whenever the domain object is saved. For example, a Patient’s values could map to the Name and Address database tables. The Patient would have data objects mapping to the Name and Address database tables. ObjectShare’s ObjectLens for VisualWorks builds up database objects this way. The Persistence Layer is managed through these data objects. This alternative is simple to implement and easy to understand, though it can be slow and developers are forced to make a one-to-one mapping to database tables.

The choice among these alternatives should be based on the forces of needed flexibility, scalability, and maintainability.

Example Implementation: A lot of work has already been done on describing the details of building Brokers [BMRSS 96] correctly. Because of this, and because we have more experience with implementing a Layered Object, our examples focus on this implementation of the pattern. The other patterns described do work in the Broker implementation, and we will mention briefly how this can be done in our discussions of the other patterns. However, all of our example code describes the implementation as it relates to the PersistentObject.

Figure 1 is a UML class diagram for an implementation of the Persistence Layer that maps domain objects to a relational database. Note that in this example, domain objects that need to be persisted are subclasses of the PersistentObject. The PersistentObject provides the interface to the Persistence Layer.
The PersistentObject interacts with the Table Manager, which provides the physical table name for the SQL Code. During SQL Code generation, the PersistentObject interacts with the Connection Manager to obtain the necessary database connection. If needed, the OID Manager is queried for a new unique identifier. Thus, the PersistentObject is the central hub for any information the domain object requires but does not contain in the instance. The PersistentObject provides the standard interface to the Persistence Layer. Once the SQL Code has been prepared with the cooperation of the other patterns, the SQL statement is “fired” by the Database Components. In IBM’s VisualAge for Smalltalk these are the AbtDBM* applications.

The attributes of the PersistentObject are as follows:
· objectIdentifier – A unique identifier for the object; it can be the database key.
· isChanged – Identifies whether or not the object is “dirty”, which tells the Persistence Layer to write the change to the database.
· isPersisted – Identifies whether or not the object has ever been written to the database.
· owningObject – Identifies the parent object (in a complex object) and is used as a foreign key in the database. Note that the foreign key is stored with the object, not with the parent object.

The public methods of the PersistentObject are as follows:
· save – Writes the object's data to the database. It will update or insert rows as necessary.
· delete – Deletes the object's data from the database.
· load: – Returns a single instance of a class with data from the database.
· loadAll – Returns a collection of instances of a class with all the data from the database for that particular class. This is useful for retrieving data for selection lists.
· loadAllLike: – Returns a collection of instances of a class populated with selected data from the database.
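The attributes and public methods listed above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the authors' Smalltalk implementation: the module-level `_connection` and `_next_oid` are hypothetical stand-ins for the Connection Manager and OID Manager, the `table` and `columns` class attributes stand in for the Table Manager and attribute mapping, the `object_id` column name is invented, and the owning object is kept as an attribute but not written to the database.

```python
import itertools
import sqlite3

# Hypothetical stand-ins for the Connection Manager and OID Manager.
_connection = sqlite3.connect(":memory:")
_next_oid = itertools.count(1)


class PersistentObject:
    table = None   # physical table name (the Table Manager's role)
    columns = ()   # attribute names mapped one-to-one to columns

    def __init__(self, **values):
        self.object_identifier = None
        self.is_persisted = False
        self.is_changed = True
        self.owning_object = None  # not persisted in this sketch
        for name in self.columns:
            setattr(self, name, values.get(name))

    def __setattr__(self, name, value):
        # Any change to a mapped attribute marks the object dirty.
        if name in self.columns:
            object.__setattr__(self, "is_changed", True)
        object.__setattr__(self, name, value)

    def save(self):
        """Insert a new row or update the existing one, as needed."""
        if not self.is_changed:
            return
        values = [getattr(self, c) for c in self.columns]
        if self.is_persisted:
            sets = ", ".join(f"{c} = ?" for c in self.columns)
            _connection.execute(
                f"UPDATE {self.table} SET {sets} WHERE object_id = ?",
                values + [self.object_identifier])
        else:
            self.object_identifier = next(_next_oid)
            names = ", ".join(("object_id",) + tuple(self.columns))
            marks = ", ".join("?" * (len(self.columns) + 1))
            _connection.execute(
                f"INSERT INTO {self.table} ({names}) VALUES ({marks})",
                [self.object_identifier] + values)
            self.is_persisted = True
        self.is_changed = False

    def delete(self):
        """Remove this object's row, if it was ever written."""
        if self.is_persisted:
            _connection.execute(
                f"DELETE FROM {self.table} WHERE object_id = ?",
                (self.object_identifier,))
            self.is_persisted = False

    @classmethod
    def _select(cls, where="", params=()):
        names = ", ".join(("object_id",) + tuple(cls.columns))
        sql = f"SELECT {names} FROM {cls.table} {where} ORDER BY object_id"
        found = []
        for row in _connection.execute(sql, params):
            obj = cls(**dict(zip(cls.columns, row[1:])))
            obj.object_identifier = row[0]
            obj.is_persisted = True
            obj.is_changed = False
            found.append(obj)
        return found

    @classmethod
    def load(cls, oid):
        found = cls._select("WHERE object_id = ?", (oid,))
        return found[0] if found else None

    @classmethod
    def load_all(cls):
        return cls._select()

    @classmethod
    def load_all_like(cls, column, pattern):
        if column not in cls.columns:
            raise ValueError(f"unmapped column: {column}")
        return cls._select(f"WHERE {column} LIKE ?", (pattern,))


class Patient(PersistentObject):
    table = "patient"
    columns = ("name", "city")


_connection.execute(
    "CREATE TABLE patient (object_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
ada = Patient(name="Ada", city="London")
ada.save()          # first save inserts a row
ada.city = "Paris"  # marks the object dirty
ada.save()          # second save updates the same row
Patient(name="Grace", city="Paris").save()
```

The domain class contributes only a table name and a column list; the SQL lives in the superclass, which is the segregation of database-specific code that the Object Layer alternative relies on.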

Concurrency Patterns

Read/Write Access Pattern

Conclusion

The concurrency patterns discussed above involve coordinating concurrent operations. They address two types of problems:[5]

Shared resources - When concurrent operations access the same data or another type of shared resource, operations may interfere with each other if they access the resource at the same time. To ensure that operations on shared resources execute correctly, the operations must be sufficiently constrained to access their shared resource one at a time. However, if the operations are overly constrained, then they may deadlock and not be able to finish executing.

Sequence of operations - If operations are constrained to access a shared resource one at a time, it may be necessary to ensure that they access the shared resource in a particular order. For example, an object cannot be removed from a data structure before it is added to the data structure.
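Both problems can be illustrated with a short sketch. The example below is only one possible arrangement, written in Python with the standard threading module; the SharedBag class, its method names, and the item value are hypothetical. A condition variable (which wraps a lock) serializes access to the shared list, and the remove operation waits until the item has been added, which enforces the required order.

```python
import threading


class SharedBag:
    """A shared collection whose operations run one at a time."""

    def __init__(self):
        self._items = []
        self._changed = threading.Condition()  # wraps a lock

    def add(self, item):
        with self._changed:             # one operation at a time
            self._items.append(item)
            self._changed.notify_all()  # wake any waiting removers

    def remove(self, item, timeout=None):
        with self._changed:
            # Sequence constraint: wait until the item has been added.
            present = self._changed.wait_for(
                lambda: item in self._items, timeout)
            if not present:
                raise TimeoutError(f"{item!r} was never added")
            self._items.remove(item)

    def snapshot(self):
        with self._changed:
            return list(self._items)


bag = SharedBag()
remover = threading.Thread(target=bag.remove, args=("order-42",))
remover.start()      # may run first, but blocks until the item exists
bag.add("order-42")
remover.join()
```

The timeout parameter is there so that a remover does not wait forever for an item that is never added, which is one simple guard against the overly constrained case described above.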

References

[1] Wolfgang Keller, Mapping Objects to Tables - A Pattern Language http://www.objectarchitects.de/ObjectArchitects/papers/Published/ZippedPapers/mappings04.pdf

[2] Scott W. Ambler, President, Ronin International, Mapping Objects To Relational Databases http://www.crionics.com/products/opensource/faq/docs/mappingObjects.pdf

[3] Scott W. Ambler, Mapping Objects to Relational Databases: O/R Mapping In Detail http://www.agiledata.org/essays/mappingObjects.html

[4] Wolfgang Keller, Object/Relational Access Layers - A Roadmap, Missing Links and More Patterns http://citeseer.ist.psu.edu/4328.html

[5] Joseph W. Yoder, Ralph E. Johnson, Quince D. Wilson, Connecting Business Objects to Relational Databases http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.7703&rep=rep1&type=pdf

[6] Michael R. Blaha, William J. Premerlani and James E. Rumbaugh, Relational Database Design using an Object-Oriented Methodology http://www.sims.monash.edu.au/subjects/ims2501/seminars/oomodelling.pdf

[7] http://www.cetus-links.org/oo_db_systems_3.html

[8] http://www.ibm.com/developerworks/library/ws-mapping-to-rdb/

[9] http://www.objectarchitects.de/ObjectArchitects/orpatterns/index.htm?MappingObjects2Tables/mapping_object.htm

[10] http://cantgrokwontgrok.blogspot.com/2009/03/tech-day-1-nhibernate.html

[11] http://www.adobe.com/newsletters/edge/october2008/articles/article2/index.html?trackingid=DWZST