CSC/ECE 517 Fall 2009/wiki3 12 obj to relational: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
Line 178: Line 178:


==== Motivation: ====
==== Motivation: ====
The values in a database do not always map directly into object types. For example, a boolean value may
be saved as a “T” or “F” in a database. Sex could be stored as an attribute in which there is a class called Sex, which has certain behavior associated with male instances and different behavior with female instances. The database might store the values as “M” and “F”. When the values are read in from the database, the “M” would need to be converted to a male instance of the Sex class and the “F” would need to be converted to a female instance of the Sex class. Type Conversion allows for object values to be converted to and from database values. The problem here is how to take objects that may not have a database type and allow for them to map to a database type and vice-versa.


==== Solution: ====
The solution would be to have all values convert their respective types through a Type Conversion object. This object knows how to handle nils and other mappings of objects to and from database values. When objects are persisted from large multi-application database the data formats can vary. This pattern ensures the data retrieved from the database is appropriate for the object.
Each object attribute is passed through the appropriate Type Converter to apply the data rules necessary for proper application or DBM use.
Implementations can vary dependent on your needs. The example that follows places all the conversion
methods in a single object. This keeps the methods localized to one area and allows the conversion object
to be dynamically switched, if necessary. Thus, you could have a Strategy that applies different conversion algorithms for different databases. If the converters are not needed by the rest of the application this is a “cleaner” approach.
Another option would be to extend each base class affected or used and place the methods there. Each
object would know how to convert itself into a format necessary for the database. If you were using
multiple databases then each of these methods would have to accommodate the differences in the formats.


==== Solution: ====
Yet another approach would be to put all of your conversion routines in PersistentObject thus
isolating the place where you would have to change the code if you need to map to a new database or the
conversion evolves. These methods would then be available to any object that inherits from
PersistentObject. A similar situation occurs, as in the previous option, when you have multiple
databases.
 
All of the above mentioned solutions will work regardless of the persistence mechanism chosen.





Revision as of 04:53, 18 November 2009

Patterns for mapping objects to relational databases

The Object-Relational Mismatch

Object-orientation and relational model are different paradigms of programming.[1] The object paradigm is based on building applications out of objects that have both data and behavior, whereas the relational paradigm is based on storing data. With the object paradigm, one traverses objects via their relationships whereas with the relational paradigm one duplicates data to join the rows in tables.[2] The difference between how object models work and relational databases is known as the “Object-Relational Impedance Mismatch”. This is just nerd-speak for “they are different”. In analogy, we could consider the example of the ball and shape game. It's the same thing; you can’t fit a square in a triangle hole.[10]

One way to avoid the impedance mismatch between objects and relations is to use an object-oriented database. However, systems often need to store objects in a relational database. Sometimes a system needs relational calculus or the maturity of a relational database. Other times the corporate policy is to use a relational database rather than an object-oriented database. Whatever the reason, a system that stores objects in a relational database needs to provide a design that reduces the impedance mismatch.[5] One of the secrets of success for mapping objects to relational databases is to understand both paradigms, and their differences, and then make intelligent trade offs based on that knowledge.[2] An application that maps between the two paradigms needs to be designed with respect to performance, maintainability and cost to name just a few requirements.[4]

Object-relational mapping.
Object-relational mapping.

Persistence frameworks were designed to reduce the amount of work needed to develop an object-oriented system that stores data in a relational database. For this reason, this framework area is often also referred to as object-relational mapping or ORM (see the below Figure). The goal of a persistence framework is to automatically map objects to and from a database seamlessly so that information can be quickly and easily retrieved in object format, and then subsequently stored in its relational structure. This aids in system development and maintenance greatly. The mundane details of writing one's own create, read, update, and delete functions are taken care of, so one can concentrate on designing and building the application.[11]

Patterns

This page describes patterns for mapping objects to relations.

Persistence Layer

Motivation:

If one builds a large object-oriented business system that stores objects in a relational database, one can spend a lot of time dealing with the problems of making one's objects persistent. Also, every programmer working on the system has to know SQL and the code can become tied to the database. It can be a lot of work to convert the system from using Microsoft Access to using DB2, or even adding a few variables to an object. One needs to separate one's domain knowledge from knowledge of how objects are stored in a database to protect developers from these types of changes. The problem is how to save objects in a non object-oriented storage mechanism such as a relational database with the developers not having to know the exact implementation.

Solution:

The solution is to provide a Persistence Layer that can populate objects from a data storage source and save their data back to the data storage source. This layer should hide the developer from the details of storing objects. All persistent objects use the standard interface of the Persistence Layer. If the data storage mechanism changes, only the Persistence Layer needs to be changed. For example, the corporate direction may be to start using Oracle rather than DB2 and then switch in midstream. The system needs to know how to store each object and to load it. Sometimes an object is stored in multiple databases over multiple media. An object that is part of a more complex object needs to keep track of which object it is a part; this is called the owning object. This owning object idea makes it easier to write complex queries. Therefore it is important for the Persistence Layer to provide a means to uniquely identify each object and its parent.

Example Implementation:

The figure shown below is a UML class diagram for an implementation of Persistent Layer which map domain objects to a relational database. Note that in this example, domain objects that need to be persisted are subclasses of the PersistentObject. The PersistentObject provides for the interface to the Persistence Layer. The PersistentObject interacts with the Table Manager, the Connection Manger and the OID Manager. Thus, the PersistentObject is the central hub for any information the domain object requires but does not contain in the instance.

Persistence Class Diagram. [12]

Records can be read in three ways:

  • a single row (PersistentObject>>load:),
  • all records (PersistentObject>>loadAll),
  • or all that match a specific criteria (PersistentObject>>loadAllLike:).

The specific criteria is provided by creating a new instance of the object to load with the attributes set needed to find a match. This functions with the PersistentObject>>load: and PersistentObject>>loadAllLike: methods. The PersistentObject>>loadAll method is useful for retrieving reference data from tables where you want to populate a selection list or drop down list.

The following is example code from the PersistentObject described above.

Protocol for Public Interface PersistentObject (instance)

load
   "Answer a single instance of a subclasse of PersistentObjects that matchs self.”
   | oc |
   oc := self loadAllLike.
   ^oc isEmpty ifTrue: [nil] ifFalse: [oc first]
loadAllLike
   "Answer a collection of subclasses of PersistentObjects that match self. The selectionClause method  
   in the Domain object is called to prepare the WHERE clause for the read method in the 
   PersistentObject"
   ^self class read: ( self selectionClause )
save
   “Saves self to the database wrapped in a transaction.”
   self class beginTransaction.
   self saveAsTransaction.
   self class endTransaction.
delete
   “Deletes self from the database wrapped in a transaction.”
   self class beginTransaction.
   self deleteAsTransaction.
   self class endTransaction.

Protocol for Public Interface PersistentObject (class)

loadAll
   "Answer a collection of ALL my instances from the database."
   ^self read: nil.

Advantages & Disadvantages:

An important benefit of isolating the application developer from the details of how an object saves itself is that it makes it easier to implement domain objects. Thus, it is less work to evolve the domain model. By encapsulating the functionality of the object persistence mechanism, a developer will effectively be hidden from the details of saving the object. However, a Persistent Layer can make it complicated and sometimes difficult to do operations that might be easy to write in SQL Code. Also, optimizations can be difficult in a Persistent Layer.

CRUD

Motivation:

Consider the example of a Patient class with components of class Name and Address. When one reads a Patient, one must also read Name and Address. Writing out a Patient will probably cause writing Name and Address objects to the database. The question that arises here is should they all have the same interface for reading and writing. Maybe some object requires a different interface to the database. Can we give them all the same interface? If so, what should it be?

Any persisted object needs operations to read from and write to the database. Values being persisted may also be for newly created objects. Sometimes, objects may need to be deleted from the persistence storage. Therefore, anytime an object needs to be persistent, it is important to provide, at a minimum, create, read, update, and delete operations. The question (and the problem) is what minimal operations are needed for a persistent object.

Solution:

The solution is to provide the basic CRUD (create, read, update, and delete) operations for persistent objects. Other operations that may be needed are loadAllLike: or loadAll. The important point is to provide at least enough to instantiate objects from a database and store newly created or changed objects. If all domain objects have a common PersistentObject superclass, then this class can define the CRUD operations and all domain objects can inherit them. Subclasses can override them if necessary to increase performance. Thus, each domain object must make available a description of the SQL code necessary for accessing the database available for the CRUD operations. Thus CRUD works closely with a SQL Code Description for insuring that the operations are sufficient for persisting domain objects.


Example Implementation:

The PersistentObject described above provides the standard interface to the basic set of operations for mapping the objects to the database; save, load, etc. These methods are inherited from the PersistentObject which access the CRUD operations. Some of these CRUD methods may need to be overwritten by the domain object. The updateRowSql and the insertRowSql are described in the SQL Code Description pattern below.

Protocol for CRUD PersistentObject (class)

This method takes a WHERE clause as an agent, and returns a collectin of objects that correspond to the rows that matches the WHERE clause.

read: aSearchString
   ”Returns a collection of instances populated from the database.”
   | aCollection |
   aCollection := OrderedCollection new.
   (self resultSet: aSearchString)
      do: [:aRow | aCollection add: (self new initialize: aRow)].
   ^aCollection

Protocol for Persistence Layer PersistentObject (instance)

These methods save or delete objects from the database. These methods make the decisions of what kind of SQL statements (insert, update, or delete) based on the object’s values. Once the decisions have been made the SQL statement is then fired to the database.

saveAsTransaction
   "Save self to the database."
   self isPersisted ifTrue: [self update] ifFalse: [self create].
   self makeClean
update
   "Updates aggregate classes then updates self to the database"
   self saveComponentIfDirty.
   self basicUpdate
create
   "Inserts aggregate classes then inserts self to the database."
   self saveComponentIfDirty.
   self basicCreate
basicCreate
   "Fires the insert SQL statement to the database"
   self class executeSql: self insertRowSql.
   isPersisted := true
basicUpdate
   "Fires the update SQL statement to the database."
   (self isKindOf: AbstractProxy) ifTrue: [^nil].
   isChanged ifTrue: [self class executeSql: self updateRowSql]
deleteAsTransaction
   "Delete self from the database.."
   self isPersisted ifTrue: [self basicDelete].
   ^nil
basicDelete
   "Fires the delete SQL statement to the database."
   self class
   executeSql:('DELETE FROM ',self class table,' WHERE ID_OBJ=',
                             (self objectIdentifier printString)).

Advantages & Disadvantages:

Once the object model and data model have been analyzed, the results can be implemented in CRUD providing a performance-optimized solution, thus isolating the developer from having to worry about performance details. Flexibility to retrieve data based on how many rows or what kind of data (dynamic, static, or somewhere in-between) is needed. It has a simple implementation to SAVE the data to, or back to, the database. The application developer does not have to determine whether to insert or update the object. However, CRUD can cause sub-optimal performance if the object model and data model have not been properly analyzed. This will make the job of the developer more difficult if they have to compensate another way.

Type Conversion

Motivation:

The values in a database do not always map directly into object types. For example, a boolean value may be saved as a “T” or “F” in a database. Sex could be stored as an attribute in which there is a class called Sex, which has certain behavior associated with male instances and different behavior with female instances. The database might store the values as “M” and “F”. When the values are read in from the database, the “M” would need to be converted to a male instance of the Sex class and the “F” would need to be converted to a female instance of the Sex class. Type Conversion allows for object values to be converted to and from database values. The problem here is how to take objects that may not have a database type and allow for them to map to a database type and vice-versa.

Solution:

The solution would be to have all values convert their respective types through a Type Conversion object. This object knows how to handle nils and other mappings of objects to and from database values. When objects are persisted from large multi-application database the data formats can vary. This pattern ensures the data retrieved from the database is appropriate for the object.

Each object attribute is passed through the appropriate Type Converter to apply the data rules necessary for proper application or DBM use.

Implementations can vary dependent on your needs. The example that follows places all the conversion methods in a single object. This keeps the methods localized to one area and allows the conversion object to be dynamically switched, if necessary. Thus, you could have a Strategy that applies different conversion algorithms for different databases. If the converters are not needed by the rest of the application this is a “cleaner” approach.

Another option would be to extend each base class affected or used and place the methods there. Each object would know how to convert itself into a format necessary for the database. If you were using multiple databases then each of these methods would have to accommodate the differences in the formats.

Yet another approach would be to put all of your conversion routines in PersistentObject thus isolating the place where you would have to change the code if you need to map to a new database or the conversion evolves. These methods would then be available to any object that inherits from PersistentObject. A similar situation occurs, as in the previous option, when you have multiple databases.

All of the above mentioned solutions will work regardless of the persistence mechanism chosen.


Example Implementation:

Advantages & Disadvantages:

Change Manager

Motivation:

Solution:

Example Implementation:

Advantages & Disadvantages:

Transaction Manager

Motivation:

Solution:

Example Implementation:

Advantages & Disadvantages:

Connection Manager

Motivation:

Solution:

Example Implementation:

Advantages & Disadvantages:

Table Manager

Motivation:

Solution:

Example Implementation:

Advantages & Disadvantages:

Conclusion

The concurrency patterns discussed above involve coordinating concurrent operations. They address two types of problems:[5]

Shared resources - When concurrent operations access the same data or another type of shared resource, operations may interfere with each other if they access the resource at the same time. To ensure that operations on shared resources execute correctly, the operations must be sufficiently constrained to access their shared resource one at a time. However, if the operations are overly constrained, then they may deadlock and not be able to finish executing.

Sequence of operations - If operations are constrained to access a shared resource one at a time, it may be necessary to ensure that they access the shared resource in a particular order. For example, an object cannot be removed from a data structure before it is added to the data structure.

See Also

References

[1] Wolfgang Keller, Mapping Objects to Tables - A Pattern Language http://www.objectarchitects.de/ObjectArchitects/papers/Published/ZippedPapers/mappings04.pdf

[2] Scott W. Ambler, President, Ronin International, Mapping Objects To Relational Databases http://www.crionics.com/products/opensource/faq/docs/mappingObjects.pdf

[3] Scott W. Ambler, Mapping Objects to Relational Databases: O/R Mapping In Detail http://www.agiledata.org/essays/mappingObjects.html

[4] Wolfgang Keller, Object/Relational Access Layers - A Roadmap, Missing Links and More Patterns http://citeseer.ist.psu.edu/4328.html

[5] Joseph W. Yoder, Ralph E. Johnson, Quince D. Wilson, Connecting Business Objects to Relational Databases http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.7703&rep=rep1&type=pdf

[6] Michael R. Blaha, William J. Premerlani and James E. Rumbaugh, Relational Database Design using an Object-Oriented Methodology http://www.sims.monash.edu.au/subjects/ims2501/seminars/oomodelling.pdf

[7] http://www.cetus-links.org/oo_db_systems_3.html

[8] http://www.ibm.com/developerworks/library/ws-mapping-to-rdb/

[9] http://www.objectarchitects.de/ObjectArchitects/orpatterns/index.htm?MappingObjects2Tables/mapping_object.htm

[10] http://cantgrokwontgrok.blogspot.com/2009/03/tech-day-1-nhibernate.html

[11] http://www.adobe.com/newsletters/edge/october2008/articles/article2/index.html?trackingid=DWZST

[12] Joseph W. Yoder, Patterns for Making your Business Objects Persistent in a Relational Database http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm