CSC/ECE 517 Fall 2009/wiki3 12 obj to relational: Difference between revisions

From Expertiza_Wiki
Jump to navigation Jump to search
 
(144 intermediate revisions by the same user not shown)
Line 4: Line 4:
== The Object-Relational Mismatch ==
== The Object-Relational Mismatch ==


[[Image:Fp1059-shape-ball.JPG |thumb]] [10]
[[Image:Fp1059-shape-ball.JPG |right|300 px]]
Object-orientation and relational model are different paradigms of programming.[1] The object paradigm is based on building applications out of objects that have both data and behavior,
[http://en.wikipedia.org/wiki/Object-oriented_programming Object-orientation] and [http://en.wikipedia.org/wiki/Relational_model relational model] are different paradigms of programming.[1] The object paradigm is based on building applications out of [http://en.wikipedia.org/wiki/Object_%28computer_science%29 objects] that have both data and behavior, whereas the relational paradigm is based on storing data. With the object paradigm, one traverses objects via their relationships whereas with the relational paradigm one duplicates data to [http://en.wikipedia.org/wiki/Join_%28SQL%29 join] the rows in tables.[2] The difference between how object models work and relational databases is known as the “[http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch Object-Relational Impedance Mismatch]”. This is just nerd-speak for “they are different”. In analogy, we could consider the example of the ball and shape game. It's the same thing; you can’t fit a square in a triangle hole.[10]
whereas the relational paradigm is based on storing data. With the object paradigm, one traverses objects via their relationships whereas with the relational paradigm one duplicates data to join the rows in tables.[2] The difference between how object models work and relational databases is known as the “Object-Relational Impedance Mismatch”. This is just nerd-speak for “they are different”. In analogy, we could consider the example of the ball and shape game. It's the same thing; you can’t fit a square in a triangle hole.[10]


One way to avoid the impedance mismatch between objects and relations is to use an object-oriented
One way to avoid the impedance mismatch between objects and relations is to use an [http://en.wikipedia.org/wiki/Object_database object-oriented database]. However, systems often need to store objects in a relational database. Sometimes a system needs relational calculus or the maturity of a [http://en.wikipedia.org/wiki/Relational_database relational database]. Other times the corporate policy is to use a relational database rather than an object-oriented database. Whatever the reason, a system that stores objects in a relational database needs to provide a design that reduces the impedance mismatch.[5] One of the secrets of success for mapping objects to relational databases is to understand both paradigms, and their differences, and then make intelligent trade offs based on that knowledge.[2] An application that maps between the two paradigms needs to be designed with respect to performance, maintainability and cost to name just a few requirements.[4]
database. However, systems often need to store objects in a relational database. Sometimes a system needs
relational calculus or the maturity of a relational database. Other times the corporate policy is to use a relational database rather than an object-oriented database. Whatever the reason, a system that stores
objects in a relational database needs to provide a design that reduces the impedance mismatch.


This paper describes only a part of a pattern language for mapping objects to relations, but it describes the
[[Image:Object-relational mapping.jpg|right|200 px|Object-relational mapping.]]
patterns that we thought were not adequately described elsewhere.[5]


[http://www.roseindia.net/enterprise/persistenceframework.shtml Persistence frameworks] were designed to reduce the amount of work needed to develop an object-oriented system that stores data in a relational database. For this reason, this framework area is often also referred to as [http://en.wikipedia.org/wiki/Object-relational_mapping object-relational mapping] or ORM (see the below Figure). The goal of a persistence framework is to automatically map objects to and from a database seamlessly so that information can be quickly and easily retrieved in object format, and then subsequently stored in its relational structure. This aids in system development and maintenance greatly. The mundane details of writing one's own create, read, update, and delete functions are taken care of, so one can concentrate on designing and building the application.[11]


==Patterns==


This page describes patterns for mapping objects to relations.


The Object-Relational Mismatch
=== Persistence Layer ===


One of the secrets of success for mapping
==== Motivation: ====
objects to relational databases is to understand both paradigms, and their differences, and then make
intelligent tradeoffs based on that knowledge.[2]


If one builds a large object-oriented business system that stores objects in a relational database, one can spend a lot of time dealing with the problems of making one's objects [http://dictionary.reference.com/browse/persistent persistent]. Also, every programmer working on the system has to know [http://en.wikipedia.org/wiki/SQL SQL] and the code can become tied to the database. It can be a lot of work to convert the system from using [http://office.microsoft.com/en-us/access/default.aspx Microsoft Access] to using [http://en.wikipedia.org/wiki/IBM_DB2 DB2], or even adding a few variables to an object. One needs to separate one's domain knowledge from knowledge of how objects are stored in a database to protect developers from these types of changes. The problem is how to save objects in a non object-oriented storage mechanism such as a relational database with the developers not having to know the exact implementation.[5]


Unfortunately we need to deal with the object relational (O/R) impedance mismatch, and to do so you need to understand two things: the
==== Solution: ====
process of mapping objects to relational databases and how to implement those mappings.[3]


Designing software to connect an object-oriented business system with a relational database is
The solution is to provide a Persistence Layer that can populate objects from a data storage source and save their data back to the data storage source. This layer [http://en.wikipedia.org/wiki/Encapsulation_%28computer_science%29#Encapsulation encapsulates] the behavior needed to make objects persistent, in other words to read, write, and delete objects to/from permanent storage. Persistence layer allows application developers to concentrate on what they do best, develop applications, without having to worry about how their objects will be stored. Furthermore, persistence layers should also allow [http://en.wikipedia.org/wiki/Database_administrator database administrators (DBAs)] to do what they do best, administer databases, without having to worry about accidentally introducing bugs into existing applications. With a well-built persistence layer DBAs should be able to move tables, rename tables, rename columns, and reorganize tables without affecting the applications that access them.[16]
a tedious task. Object-orientation and the relational paradigm differ quite a bit. An application
that maps between the two paradigms needs to be designed with respect to performance,
maintainability and cost to name just a few requirements. Luckily there are numerous patterns
of object/relational access layers, but looking at the body of pattern literature you will find that
some patterns are still to be mined, while there's no generative "one stop" pattern language for
the problem domain. This paper provides a systematic roadmap of the patterns in the field,
and fills some pot holes on the road towards a full pattern language for object/relational access
layers by providing some missing patterns and links.[4]


==== Example Implementation: ====


The figure below presents a high-level design (Ambler, 1998b) of a robust persistence layer. An interesting feature of the design is that an application programmer only needs to know about the following classes to make their objects persistent: PersistentObject, the PersistentCriteria class hierarchy, PersistentTransaction, and Cursor. The other classes are not directly accessed by application development code but will still need to be developed and maintained to support the [http://en.wikipedia.org/wiki/Class_%28computer_science%29 "public" classes.][16]


[[Image:Overview of the design for a persistence layer.jpg|700 px|Overview of the design for a persistence layer.]] [16]


The PersistentObject Class encapsulates the behavior needed to make a single object persistent and is the class from which all classes in your problem/business domain inherit from. The PersistentObject has three attributes, isProxy, isPersistent, and timeStamp which respectively indicate whether or not an object is a proxy, if it was retrieved from a persistence mechanism, and the timeStamp assigned by the persistence mechanism for when it was last accessed by your application.  The attribute
isPersistent is important because an object needs to know if it already exists in the persistence mechanism or if it was newly created, information that is used to determine if an insert or update SQL statement needs to be generated when saving the object. The timeStamp attribute is used to support optimistic locking in the persistence mechanism. When the object is read into memory its timeStamp is updated in the persistence mechanism. When the object is subsequently written back the timeStamp is first read in and compared with the initial value – if the value of timeStamp has changed then another user has worked with the object and there is effectively a collision which needs to be rectified (typically via the display of a message to the user).


PersistentObject implements three methods – save(), delete(), and retrieve() – messages which are sent to objects to make them persistent. The implication is that application programmers don’t need to have any knowledge of the persistence strategy to make objects persistent, instead they merely send objects messages and they do the right thing. This is what encapsulation is all about. PersistentObject potentially maintains a relationship to an instance of OID, which is done whenever object IDs are used for the unique keys for objects in the persistence mechanism. This is optional because you don’t always have the choice to use object IDs for keys, very often you are forced to map objects to a legacy schema. The need to map to legacy schemas is an unfortunate reality in the object-oriented development world, something that we’ll discuss later in this white paper we look at how the map classes are implemented. Anyway, you can easily have PersistentObject automatically assign object IDs to your objects when they are created if you have control over your persistence schema. [16]


==== Advantages & Disadvantages: ====


An important benefit of isolating the application developer from the details of how an object
saves itself is that it makes it easier to implement domain objects. Thus, it is less work to evolve the
[http://en.wikipedia.org/wiki/Domain_model domain model]. By [http://en.wikipedia.org/wiki/Information_hiding encapsulating] the functionality of the object persistence mechanism, a developer will effectively be hidden from the details of saving the object. However, a Persistent Layer can make it complicated and sometimes difficult to do operations that might be easy to write in SQL Code. Also, optimizations can be difficult in a Persistent Layer.[5]


=== CRUD ===


==== Motivation: ====
Consider the example of a Patient [http://en.wikipedia.org/wiki/Class_%28computer_science%29 class] with components of class Name and Address. When one reads a Patient, one must also read Name and Address. Writing out a Patient will probably cause writing Name and Address objects to the database. The question that arises here is should they all have the same interface for reading and writing. Maybe some object requires a different interface to the database. Can we give them all the same interface? If so, what should it be?


[[Image:Object-relational mapping.jpg|Object-relational mapping.]]
Any persisted object needs operations to read from and write to the database. Values being persisted may
also be for newly created objects. Sometimes, objects may need to be deleted from the persistence storage. Therefore, anytime an object needs to be persistent, it is important to provide, at a minimum, create, read, update, and delete operations. The question (and the problem) is what minimal operations are needed for a persistent object.[5]


==Concurrency Patterns==
==== Solution: ====
The solution is to provide the basic [http://en.wikipedia.org/wiki/Create,_read,_update_and_delete CRUD] (create, read, update, and delete) operations for persistent objects. The important point is to provide at least enough to instantiate objects from a database and store newly created or changed objects. If all domain objects have a common PersistentObject [http://en.wikipedia.org/wiki/Superclass_%28computer_science%29 superclass], then this class can define the CRUD operations and all domain objects can [http://en.wikipedia.org/wiki/Inheritance_%28computer_science%29 inherit] them. Subclasses can [http://en.wikipedia.org/wiki/Method_overriding override] them if necessary to increase performance. Thus, each domain object must make available a description of the SQL code necessary for accessing the database available for the CRUD operations. Thus CRUD works closely with a SQL Code Description for insuring that the operations are sufficient for persisting domain objects.[5]


=== Read/Write Access Pattern ===
==== Example Implementation: ====


[[Image:Create.jpg|700 px|Create.]] [17]
[[Image:Read Update.jpg|700 px|Read Update.jpg.]] [17]
[[Image:Delete.jpg|700 px|Delete.]] [17]


==== Advantages & Disadvantages: ====


== Conclusion ==
Once the object model and data model have been analyzed, the results can be implemented in
The concurrency patterns discussed above involve coordinating concurrent operations. They address two types of problems:[5]
CRUD providing a performance-optimized solution, thus isolating the developer from having to worry
about performance details. Flexibility to retrieve data based on how many rows or what kind of data (dynamic, static, or somewhere in-between) is needed. It has a simple implementation to SAVE the data to, or back to, the database. The application developer does not have to determine whether to insert or update the object. However, CRUD can cause sub-optimal performance if the object model and data model have not been properly analyzed. This will make the job of the developer more difficult if they have to compensate another way.[5]


'''Shared resources''' - When concurrent operations access the same data or another type of shared resource, operations may
=== Type Conversion ===
interfere with each other if they access the resource at the same time.  To ensure that operations on shared resources
execute correctly, the operations must be sufficiently constrained to access their shared resource one at a time.  However,
if the operations are overly constrained, then they may [http://en.wikipedia.org/wiki/Deadlock deadlock] and not be able to finish executing.


'''Sequence of operations''' - If operations are constrained to access a shared resource one at a time, it may be necessary to  
==== Motivation: ====
ensure that they access the shared resource in a particular order.  For example, an object cannot be removed from a data
The values in a database do not always map directly into [http://download.oracle.com/docs/cd/B12037_01/appdev.101/b10807/13_elems031.htm object types]. For example, a boolean value may
structure before it is added to the data structure.
be saved as a “T” or “F” in a database.[13] Type Conversion allows for object values to be converted to and from database values.[5] The problem here is it should be possible to store in database object properties that have data types with no direct mapping to database types.[13]
 
==== Solution: ====
The solution would be to have all values convert their respective types through a Type Conversion object.  This pattern ensures the data retrieved from the database is appropriate for the object. Each object attribute is passed through the appropriate Type Converter to apply the data rules necessary for proper application or DBM use.[5]
 
==== Example Implementation: ====
 
To consider an example of the above pattern, we consider the following approach based on the Microsoft .Net platform. Each data type has its own converter class. Base abstract class for each converter is
SpfDataType class. It implements the following functionality: [13]
 
public class SpfDataType
{
    public virtual bool IsValidValue(Object p_Value);
    /// validate if value is correct
    public virtual string GetErrorMessage(
    object p_FieldValue);
    /// error message for incorrect value
    public bool IsNullable;
    /// is NULL allowed?
    public virtual object GetStorageValue(object p_Value);
    /// type conversion for database storage
    public abstract object ConvertValue(object p_Value);
    /// type conversion for object
    public abstract Type BaseType;
    /// basic .Net type for this type
    public virtual int StorageLength;
    /// storage data length
    public virtual string GetFormattedValue(object p_Value);
    /// formatted string value
}
 
Converters are associated with object fields through attributes as shown in the above example. To use a custom type in business object with persistence framework a custom class derived from SpfDataType should be defined and appropriate fields should be marked with this new converter. Converters for all .Net
build-in types are part of the framework.[13]
 
==== Advantages & Disadvantages: ====
 
Type Conversion can help insure [http://en.wikipedia.org/wiki/Data_consistency data consistency], object types can vary regardless of the database types and default values can be assigned for empty values from the database. It prevents “Undefined object does not understand” errors and provides increased RMA for the applications. (Reliability, Maintainability, and Accessibility). However, it can be time consuming to convert types, especially when reading in many values from a database.[5]
 
 
=== Change Manager ===
 
==== Motivation: ====
Many objects need access to shared values, but the values are not unique throughout the system. The problem here is how to tell that an object has changed and needs to be saved to the data source; how to prevent unnecessary access to the data source.[12] In general, a Persistence Layer should keep track of all PersistentObjects that have changed state, and should make sure that they are all written back to the database.[5]
 
==== Solution: ====
The solution is to set up a Change Manager that keeps track of any PersistentObject that changes one of its persistent attributes. This Change Manager will be used whenever any requests to saving objects is needed. One way to do this is by inheriting from a PersistentObject, which has a dirty bit that gets set
whenever one of the attributes that maps to the database changes. This dirty bit is usually an instance
variable with a boolean value which indicates when an objects values have changed. When the boolean
value is set, the PersistentObject will save the new values to the database when a request to save it
is invoked. If the boolean value is not set, the PersistentObject will bypass the write to the
database.
 
[http://en.wikipedia.org/wiki/Data_access Data access] is generally very expensive and should be used sparingly. By being able to identify that an object does or does not need to be written to the database can increase performance drastically. In addition to performance considerations, the user interface can benefit by being able to prompt the user to save before exiting. This ability adds to the user accepting your application by knowing that if they forget to save the changed information, the system will prompt them to save it. The user can come to the conclusion that the application is useless very quickly if they have to reenter the same data more than once.
 
Another feature that the Change Manager can provide is that of remembering original state or changed state of an object. If the user of your system needs to be able to revert back to original state, the Change Manager could keep around the original values so that an undo call could be invoked. Also, you might want to provide your system with the ability to have multiple undos.[5]
 
==== Example Implementation: ====
 
The example will show how an [http://www.telecom.ece.ntua.gr/HTML.Tutorials/java/strings/accessors.html accessor method] for the first name attribute in the Name class sets the [http://www.pcmag.com/encyclopedia_term/0,2542,t=dirty+bit&i=41458,00.asp dirty bit] whenever the attributes changes its value. The accessor is the default generated by VisualAge with the
addition of the makeDirty call, which sets the inherited attribute of isChanged to be true.
first: aString
    "Save the value of first."
    self makeDirty.
    first := aString.
    self signalEvent: #first
          with: aString.
 
Protocol for Change Manager PersistentObject (instance)
These methods provide the Persistence Layer with the ability to change the dirty flag. This saves the
Persistence Layer from having to write data to the database that has not changed. It also provides the [http://en.wikipedia.org/wiki/Graphical_user_interface GUI]
programmer a way to test the object in order to provide the user a message to save their data or not.
 
  makeDirty
    "Indicates an object needs to be saved to the db. This method can be called from the ‘setter’ method as
    in the example above.”
    isChanged:=true.
 
makeClean
    "Indicates that the object does not need to be saved to the db or that no changes have affected the
    object"
    isChanged:=false.
[5]
 
==== Advantages: ====
 
The user will more readily accept the application. It will result in a better performance by not writing object that has not changed back to the database. When merging data between databases the flag can be set so the record will be inserted to the new database as necessary.[5]
 
=== Transaction Manager ===
 
==== Motivation: ====
 
When saving objects, it is important to allow for objects to be stored in such a way that if some values are not stored properly, previously stored objects can be rolled back.[5] Also, it is important to have an ability to lock a set of objects that a particular user is updating or deleting if a business system is to support multiple users accessing data simultaneously.[14]
 
==== Solution: ====
 
Reese (1997) introduced the concept of a Lockholder who owns a lock object that is passed to every business object that the Lockholder might change. A business application for an academic records system might instantiate a Person object for each user that logs in to the system. In order to “become” a Lockholder, the Person object would need to “implement” a Lockholder Interface. In Java, an Interface is simply a set of empty methods and any object that wishes to implement an Interface has to provide method bodies for all of them (it implements the methods). A unique lock object would be created for the Lockholder, which would be passed as a parameter to the business object every time the Lockholder tries to take possession of an object. If another Lockholder attempts to modify an object held by that Lockholder, a lock error would be thrown since a comparison of the lock already associated with the object and that owned by the new Lockholder would reveal that they were not the same. Most operations that are performed on an object oriented business system involve making changes or creating more than one object at a time.  Reese (1997) used the concept of a Transaction object that is held by the Lock object. The Transaction object is aware of all the business objects that have been locked by the Lock object. When changes are made to business objects, the Transaction object only commits these changes to the data store if all of them were successful.[14]
 
==== Example Implementation: ====
 
The following code is from excerpts from the PersistentObject. These show the wrapper effect for
implementing transaction management. The code (self class beginTranaction) tells the
database to start and stop a transaction will be dependant on which database you are using and what
language you are developing in. You may need to extend transactions to handle your specific
implementation. Also, if the database you are saving to (maybe a flat file) doesn’t support transactions,
you are going to have to develop a complete Transaction Manager to support the needs of your users. This
example wraps all saves and deletes to an object with a beginTransaction and an
endTransaction. The class methods make the call to the database connection class.
 
Protocol for Public Interface PersistentObject (instance)
 
save
    self class beginTransaction.
    self saveAsTransaction.
    self class endTransaction.
 
delete
    self class beginTransaction.
    self deleteAsTransaction.
    self class endTransaction.
 
Protocol for Public Interface PersistentObject (class)
 
beginTransaction
    self databaseConnection beginUnitOfWorkIfError:
                        [ self databaseConnection rollbackUnitOfWork ]
 
endTransaction
    self databaseConnection commitUnitOfWork
 
rollBackTransaction
    self databaseConnection rollbackUnitOfWork
[5]
 
==== Advantages & Disadvantages: ====
 
The advantages of the Transaction Manager are that the complex objects are saved completely or not at all and [http://en.wikipedia.org/wiki/Referential_integrity referential integrity] is maintained. However, all other applications writing to the database have to use the Transaction Manager. You may have to add checks to see if anyone else has changed the database values that you are saving. Unless there is support built in from the database, writing a complete transaction manager is hard.[5]
 
=== Other Patterns ===
 
The other patterns include SQL Code Description, Attribute Mapping Methods, OID Manager and Connection Manager. We provide a brief description of these patterns here in order to understand the whole picture of how all the patterns collaborate with each other. (See "Putting It All Together").
 
The SQL Code Description defines the actual SQL code that takes the values from the [http://en.wikipedia.org/wiki/Relational_database_management_system RDBMS] or other database and retrieves them for the object’s use and vice-versa. It is used to generate the SQL for performing the CRUD operations. The Attribute Mapping Methods pattern maps the values between the values from columns in the database and values being stored in an object's attributes. This pattern also handles complex object mappings. It populates the object(s) with the row values. Since every object is unique in any object oriented system, it is important to create unique identifiers for new objects with an OID Manager. Any system accessing a RDMBS will need to provide connections to the desired databases through some sort of Connection Manager.[5]
 
== Putting It All Together (Comparison of all patterns) ==
 
All of these patterns collaborate together to provide a mechanism for mapping persistent objects to a database. The figure below shows how the patterns interact with one another. The Persistence Layer provides the standard interface for the CRUD (create, read, update, and delete) operations needed to persist domain objects. The Persistence Layer builds the calls to the database by using the SQL Code Description provided by the domain objects. During the generation of the SQL code, the Persistence Layer interacts with the Table
Manager to get the correct database table and column names. When values are being returned from the
database or written back to the database, attributes values must be mapped to database column names and
vice-versa. This is done with Attribute Mapping Methods. Attribute Mapping Methods do some Type
Conversions while the SQL code is being generated. Attribute Mappings and Type Conversions can also
happen while the Persistence Layer instantiates a new object. The Persistence Layer saves objects to the
database through the Connection Manager only when an object’s value has changed. This change is
managed by the Change Manager. The Connection Manager could interact with the Table Manager to
decide on which database to use. The Persistence Layer provides access to a Transaction Manager
whenever transaction processing is needed.[5]
 
[[Image:Pattern Interaction Diagram.jpg|700 px|Pattern Interaction Diagram.]] [5]
 
== Mapping Terminology ==
 
Mapping (v).
The act of determining how objects and their relationships are persisted in permanent data storage, in this case relational databases.[3]
 
Mapping (n).
The definition of how an object’s property or a relationship is persisted in permanent storage.[3]
 
Property.
A data attribute, either implemented as a physical attribute such as the string firstName or as a virtual attribute implemented via an operation such as getTotal() which returns the total of an order.[3]
 
Property mapping.
A mapping that describes how to persist an object’s property.[3]
 
Relationship mapping.
A mapping that describes how to persist a relationship ([http://publib.boulder.ibm.com/infocenter/rtnlhelp/v6r0m0/index.jsp?topic=/com.ibm.xtools.viz.cpp.doc/topics/cassociation.html association], [http://en.wikipedia.org/wiki/Object_composition#Aggregation aggregation], or [http://publib.boulder.ibm.com/infocenter/rtnlhelp/v6r0m0/index.jsp?topic=/com.ibm.xtools.modeler.doc/topics/ccompasn.html composition]) between two or more objects.[3]


==See Also==
==See Also==
* [http://en.wikipedia.org/wiki/Monitor_(synchronization) Monitor Object]
 
* [http://en.wikipedia.org/wiki/Active_Object Active Object]
* [http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm SQL Code Description]
* [http://en.wikipedia.org/wiki/Thread_pool_pattern Thread pool pattern]
* [http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm Attribute Mapping Methods]
* [http://en.wikipedia.org/wiki/Reactor_pattern Reactor patter]
* [http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm OID Manager(Key Manager)]
* [http://en.wikipedia.org/wiki/Double_buffering Double Buffering]
* [http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm Connection Manager]
* [http://en.wikipedia.org/wiki/Scheduler_pattern Scheduler]
* [http://msdn.microsoft.com/en-us/library/ms724368(VS.85).aspx Asynchronous Processing]


==References==
==References==
Line 100: Line 276:


[11] http://www.adobe.com/newsletters/edge/october2008/articles/article2/index.html?trackingid=DWZST
[11] http://www.adobe.com/newsletters/edge/october2008/articles/article2/index.html?trackingid=DWZST
[12] Joseph W. Yoder, Patterns for Making your Business Objects Persistent in a Relational Database http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm
[13] Anatoliy Doroshenko, Vlad Romanenko, Object-Relational Mapping Techniques for .Net Framework http://subs.emis.de/LNI/Proceedings/Proceedings48/GI.Band.48-6.pdf
[14] Paul Charsley, Persistent Object Storage in the File System: How it works and its use as a
database teaching tool www.in-site.co.nz/misc_links/papers/charsley51.pdf
[15] Reese, G. (1997), Database Programming with JDBC and Java. O’Reilly. pp 80 – 82, 96 – 97.
[16] Scott W. Ambler, The Design of a Robust Persistence Layer For Relational Databases.
[17] Wolfgang Keller, Persistence Options for Object-Oriented Programs.

Latest revision as of 05:45, 24 November 2009

Patterns for mapping objects to relational databases

The Object-Relational Mismatch

Object-orientation and relational model are different paradigms of programming.[1] The object paradigm is based on building applications out of objects that have both data and behavior, whereas the relational paradigm is based on storing data. With the object paradigm, one traverses objects via their relationships whereas with the relational paradigm one duplicates data to join the rows in tables.[2] The difference between how object models work and relational databases is known as the “Object-Relational Impedance Mismatch”. This is just nerd-speak for “they are different”. In analogy, we could consider the example of the ball and shape game. It's the same thing; you can’t fit a square in a triangle hole.[10]

One way to avoid the impedance mismatch between objects and relations is to use an object-oriented database. However, systems often need to store objects in a relational database. Sometimes a system needs relational calculus or the maturity of a relational database. Other times the corporate policy is to use a relational database rather than an object-oriented database. Whatever the reason, a system that stores objects in a relational database needs to provide a design that reduces the impedance mismatch.[5] One of the secrets of success for mapping objects to relational databases is to understand both paradigms, and their differences, and then make intelligent trade offs based on that knowledge.[2] An application that maps between the two paradigms needs to be designed with respect to performance, maintainability and cost to name just a few requirements.[4]

Object-relational mapping.
Object-relational mapping.

Persistence frameworks were designed to reduce the amount of work needed to develop an object-oriented system that stores data in a relational database. For this reason, this framework area is often also referred to as object-relational mapping or ORM (see the below Figure). The goal of a persistence framework is to automatically map objects to and from a database seamlessly so that information can be quickly and easily retrieved in object format, and then subsequently stored in its relational structure. This aids in system development and maintenance greatly. The mundane details of writing one's own create, read, update, and delete functions are taken care of, so one can concentrate on designing and building the application.[11]

Patterns

This page describes patterns for mapping objects to relations.

Persistence Layer

Motivation:

If one builds a large object-oriented business system that stores objects in a relational database, one can spend a lot of time dealing with the problems of making one's objects persistent. Also, every programmer working on the system has to know SQL and the code can become tied to the database. It can be a lot of work to convert the system from using Microsoft Access to using DB2, or even adding a few variables to an object. One needs to separate one's domain knowledge from knowledge of how objects are stored in a database to protect developers from these types of changes. The problem is how to save objects in a non object-oriented storage mechanism such as a relational database with the developers not having to know the exact implementation.[5]

Solution:

The solution is to provide a Persistence Layer that can populate objects from a data storage source and save their data back to the data storage source. This layer encapsulates the behavior needed to make objects persistent, in other words to read, write, and delete objects to/from permanent storage. Persistence layer allows application developers to concentrate on what they do best, develop applications, without having to worry about how their objects will be stored. Furthermore, persistence layers should also allow database administrators (DBAs) to do what they do best, administer databases, without having to worry about accidentally introducing bugs into existing applications. With a well-built persistence layer DBAs should be able to move tables, rename tables, rename columns, and reorganize tables without affecting the applications that access them.[16]

Example Implementation:

The figure below presents a high-level design (Ambler, 1998b) of a robust persistence layer. An interesting feature of the design is that an application programmer only needs to know about the following classes to make their objects persistent: PersistentObject, the PersistentCriteria class hierarchy, PersistentTransaction, and Cursor. The other classes are not directly accessed by application development code but will still need to be developed and maintained to support the "public" classes.[16]

Overview of the design for a persistence layer. [16]

The PersistentObject Class encapsulates the behavior needed to make a single object persistent and is the class from which all classes in your problem/business domain inherit from. The PersistentObject has three attributes, isProxy, isPersistent, and timeStamp which respectively indicate whether or not an object is a proxy, if it was retrieved from a persistence mechanism, and the timeStamp assigned by the persistence mechanism for when it was last accessed by your application. The attribute isPersistent is important because an object needs to know if it already exists in the persistence mechanism or if it was newly created, information that is used to determine if an insert or update SQL statement needs to be generated when saving the object. The timeStamp attribute is used to support optimistic locking in the persistence mechanism. When the object is read into memory its timeStamp is updated in the persistence mechanism. When the object is subsequently written back the timeStamp is first read in and compared with the initial value – if the value of timeStamp has changed then another user has worked with the object and there is effectively a collision which needs to be rectified (typically via the display of a message to the user).

PersistentObject implements three methods – save(), delete(), and retrieve() – messages which are sent to objects to make them persistent. The implication is that application programmers don’t need to have any knowledge of the persistence strategy to make objects persistent, instead they merely send objects messages and they do the right thing. This is what encapsulation is all about. PersistentObject potentially maintains a relationship to an instance of OID, which is done whenever object IDs are used for the unique keys for objects in the persistence mechanism. This is optional because you don’t always have the choice to use object IDs for keys, very often you are forced to map objects to a legacy schema. The need to map to legacy schemas is an unfortunate reality in the object-oriented development world, something that we’ll discuss later in this white paper we look at how the map classes are implemented. Anyway, you can easily have PersistentObject automatically assign object IDs to your objects when they are created if you have control over your persistence schema. [16]

Advantages & Disadvantages:

An important benefit of isolating the application developer from the details of how an object saves itself is that it makes it easier to implement domain objects. Thus, it is less work to evolve the domain model. By encapsulating the functionality of the object persistence mechanism, a developer will effectively be hidden from the details of saving the object. However, a Persistent Layer can make it complicated and sometimes difficult to do operations that might be easy to write in SQL Code. Also, optimizations can be difficult in a Persistent Layer.[5]

CRUD

Motivation:

Consider the example of a Patient class with components of class Name and Address. When one reads a Patient, one must also read Name and Address. Writing out a Patient will probably cause writing Name and Address objects to the database. The question that arises here is should they all have the same interface for reading and writing. Maybe some object requires a different interface to the database. Can we give them all the same interface? If so, what should it be?

Any persisted object needs operations to read from and write to the database. Values being persisted may also be for newly created objects. Sometimes, objects may need to be deleted from the persistence storage. Therefore, anytime an object needs to be persistent, it is important to provide, at a minimum, create, read, update, and delete operations. The question (and the problem) is what minimal operations are needed for a persistent object.[5]

Solution:

The solution is to provide the basic CRUD (create, read, update, and delete) operations for persistent objects. The important point is to provide at least enough to instantiate objects from a database and store newly created or changed objects. If all domain objects have a common PersistentObject superclass, then this class can define the CRUD operations and all domain objects can inherit them. Subclasses can override them if necessary to increase performance. Thus, each domain object must make available a description of the SQL code necessary for accessing the database available for the CRUD operations. Thus CRUD works closely with a SQL Code Description for insuring that the operations are sufficient for persisting domain objects.[5]

Example Implementation:

Create. [17] Read Update.jpg. [17] Delete. [17]

Advantages & Disadvantages:

Once the object model and data model have been analyzed, the results can be implemented in CRUD providing a performance-optimized solution, thus isolating the developer from having to worry about performance details. Flexibility to retrieve data based on how many rows or what kind of data (dynamic, static, or somewhere in-between) is needed. It has a simple implementation to SAVE the data to, or back to, the database. The application developer does not have to determine whether to insert or update the object. However, CRUD can cause sub-optimal performance if the object model and data model have not been properly analyzed. This will make the job of the developer more difficult if they have to compensate another way.[5]

Type Conversion

Motivation:

The values in a database do not always map directly into object types. For example, a boolean value may be saved as a “T” or “F” in a database.[13] Type Conversion allows for object values to be converted to and from database values.[5] The problem here is it should be possible to store in database object properties that have data types with no direct mapping to database types.[13]

Solution:

The solution would be to have all values convert their respective types through a Type Conversion object. This pattern ensures the data retrieved from the database is appropriate for the object. Each object attribute is passed through the appropriate Type Converter to apply the data rules necessary for proper application or DBM use.[5]

Example Implementation:

To consider an example of the above pattern, we consider the following approach based on the Microsoft .Net platform. Each data type has its own converter class. Base abstract class for each converter is SpfDataType class. It implements the following functionality: [13]

public class SpfDataType
{
   public virtual bool IsValidValue(Object p_Value);
   /// validate if value is correct
   public virtual string GetErrorMessage(
   object p_FieldValue);
   /// error message for incorrect value
   public bool IsNullable;
   /// is NULL allowed?
   public virtual object GetStorageValue(object p_Value);
   /// type conversion for database storage
   public abstract object ConvertValue(object p_Value);
   /// type conversion for object
   public abstract Type BaseType;
   /// basic .Net type for this type
   public virtual int StorageLength;
   /// storage data length
   public virtual string GetFormattedValue(object p_Value);
   /// formatted string value
}

Converters are associated with object fields through attributes as shown in the above example. To use a custom type in business object with persistence framework a custom class derived from SpfDataType should be defined and appropriate fields should be marked with this new converter. Converters for all .Net build-in types are part of the framework.[13]

Advantages & Disadvantages:

Type Conversion can help insure data consistency, object types can vary regardless of the database types and default values can be assigned for empty values from the database. It prevents “Undefined object does not understand” errors and provides increased RMA for the applications. (Reliability, Maintainability, and Accessibility). However, it can be time consuming to convert types, especially when reading in many values from a database.[5]


Change Manager

Motivation:

Many objects need access to shared values, but the values are not unique throughout the system. The problem here is how to tell that an object has changed and needs to be saved to the data source; how to prevent unnecessary access to the data source.[12] In general, a Persistence Layer should keep track of all PersistentObjects that have changed state, and should make sure that they are all written back to the database.[5]

Solution:

The solution is to set up a Change Manager that keeps track of any PersistentObject that changes one of its persistent attributes. This Change Manager will be used whenever any requests to saving objects is needed. One way to do this is by inheriting from a PersistentObject, which has a dirty bit that gets set whenever one of the attributes that maps to the database changes. This dirty bit is usually an instance variable with a boolean value which indicates when an objects values have changed. When the boolean value is set, the PersistentObject will save the new values to the database when a request to save it is invoked. If the boolean value is not set, the PersistentObject will bypass the write to the database.

Data access is generally very expensive and should be used sparingly. By being able to identify that an object does or does not need to be written to the database can increase performance drastically. In addition to performance considerations, the user interface can benefit by being able to prompt the user to save before exiting. This ability adds to the user accepting your application by knowing that if they forget to save the changed information, the system will prompt them to save it. The user can come to the conclusion that the application is useless very quickly if they have to reenter the same data more than once.

Another feature that the Change Manager can provide is that of remembering original state or changed state of an object. If the user of your system needs to be able to revert back to original state, the Change Manager could keep around the original values so that an undo call could be invoked. Also, you might want to provide your system with the ability to have multiple undos.[5]

Example Implementation:

The example will show how an accessor method for the first name attribute in the Name class sets the dirty bit whenever the attributes changes its value. The accessor is the default generated by VisualAge with the addition of the makeDirty call, which sets the inherited attribute of isChanged to be true.

first: aString
   "Save the value of first."
   self makeDirty.
   first := aString.
   self signalEvent: #first
          with: aString.

Protocol for Change Manager PersistentObject (instance) These methods provide the Persistence Layer with the ability to change the dirty flag. This saves the Persistence Layer from having to write data to the database that has not changed. It also provides the GUI programmer a way to test the object in order to provide the user a message to save their data or not.

makeDirty
   "Indicates an object needs to be saved to the db. This method can be called from the ‘setter’ method as 
   in the example above.”
   isChanged:=true.
makeClean
   "Indicates that the object does not need to be saved to the db or that no changes have affected the 
   object"
   isChanged:=false.

[5]

Advantages:

The user will more readily accept the application. It will result in a better performance by not writing object that has not changed back to the database. When merging data between databases the flag can be set so the record will be inserted to the new database as necessary.[5]

Transaction Manager

Motivation:

When saving objects, it is important to allow for objects to be stored in such a way that if some values are not stored properly, previously stored objects can be rolled back.[5] Also, it is important to have an ability to lock a set of objects that a particular user is updating or deleting if a business system is to support multiple users accessing data simultaneously.[14]

Solution:

Reese (1997) introduced the concept of a Lockholder who owns a lock object that is passed to every business object that the Lockholder might change. A business application for an academic records system might instantiate a Person object for each user that logs in to the system. In order to “become” a Lockholder, the Person object would need to “implement” a Lockholder Interface. In Java, an Interface is simply a set of empty methods and any object that wishes to implement an Interface has to provide method bodies for all of them (it implements the methods). A unique lock object would be created for the Lockholder, which would be passed as a parameter to the business object every time the Lockholder tries to take possession of an object. If another Lockholder attempts to modify an object held by that Lockholder, a lock error would be thrown since a comparison of the lock already associated with the object and that owned by the new Lockholder would reveal that they were not the same. Most operations that are performed on an object oriented business system involve making changes or creating more than one object at a time. Reese (1997) used the concept of a Transaction object that is held by the Lock object. The Transaction object is aware of all the business objects that have been locked by the Lock object. When changes are made to business objects, the Transaction object only commits these changes to the data store if all of them were successful.[14]

Example Implementation:

The following code is from excerpts from the PersistentObject. These show the wrapper effect for implementing transaction management. The code (self class beginTranaction) tells the database to start and stop a transaction will be dependant on which database you are using and what language you are developing in. You may need to extend transactions to handle your specific implementation. Also, if the database you are saving to (maybe a flat file) doesn’t support transactions, you are going to have to develop a complete Transaction Manager to support the needs of your users. This example wraps all saves and deletes to an object with a beginTransaction and an endTransaction. The class methods make the call to the database connection class.

Protocol for Public Interface PersistentObject (instance)

save
   self class beginTransaction.
   self saveAsTransaction.
   self class endTransaction.
delete
   self class beginTransaction.
   self deleteAsTransaction.
   self class endTransaction.

Protocol for Public Interface PersistentObject (class)

beginTransaction
   self databaseConnection beginUnitOfWorkIfError:
                        [ self databaseConnection rollbackUnitOfWork ]
endTransaction
   self databaseConnection commitUnitOfWork
rollBackTransaction
   self databaseConnection rollbackUnitOfWork

[5]

Advantages & Disadvantages:

The advantages of the Transaction Manager are that the complex objects are saved completely or not at all and referential integrity is maintained. However, all other applications writing to the database have to use the Transaction Manager. You may have to add checks to see if anyone else has changed the database values that you are saving. Unless there is support built in from the database, writing a complete transaction manager is hard.[5]

Other Patterns

The other patterns include SQL Code Description, Attribute Mapping Methods, OID Manager and Connection Manager. We provide a brief description of these patterns here in order to understand the whole picture of how all the patterns collaborate with each other. (See "Putting It All Together").

The SQL Code Description defines the actual SQL code that takes the values from the RDBMS or other database and retrieves them for the object’s use and vice-versa. It is used to generate the SQL for performing the CRUD operations. The Attribute Mapping Methods pattern maps the values between the values from columns in the database and values being stored in an object's attributes. This pattern also handles complex object mappings. It populates the object(s) with the row values. Since every object is unique in any object oriented system, it is important to create unique identifiers for new objects with an OID Manager. Any system accessing a RDMBS will need to provide connections to the desired databases through some sort of Connection Manager.[5]

Putting It All Together (Comparison of all patterns)

All of these patterns collaborate together to provide a mechanism for mapping persistent objects to a database. The figure below shows how the patterns interact with one another. The Persistence Layer provides the standard interface for the CRUD (create, read, update, and delete) operations needed to persist domain objects. The Persistence Layer builds the calls to the database by using the SQL Code Description provided by the domain objects. During the generation of the SQL code, the Persistence Layer interacts with the Table Manager to get the correct database table and column names. When values are being returned from the database or written back to the database, attributes values must be mapped to database column names and vice-versa. This is done with Attribute Mapping Methods. Attribute Mapping Methods do some Type Conversions while the SQL code is being generated. Attribute Mappings and Type Conversions can also happen while the Persistence Layer instantiates a new object. The Persistence Layer saves objects to the database through the Connection Manager only when an object’s value has changed. This change is managed by the Change Manager. The Connection Manager could interact with the Table Manager to decide on which database to use. The Persistence Layer provides access to a Transaction Manager whenever transaction processing is needed.[5]

Pattern Interaction Diagram. [5]

Mapping Terminology

Mapping (v). The act of determining how objects and their relationships are persisted in permanent data storage, in this case relational databases.[3]

Mapping (n). The definition of how an object’s property or a relationship is persisted in permanent storage.[3]

Property. A data attribute, either implemented as a physical attribute such as the string firstName or as a virtual attribute implemented via an operation such as getTotal() which returns the total of an order.[3]

Property mapping. A mapping that describes how to persist an object’s property.[3]

Relationship mapping. A mapping that describes how to persist a relationship (association, aggregation, or composition) between two or more objects.[3]

See Also

References

[1] Wolfgang Keller, Mapping Objects to Tables - A Pattern Language http://www.objectarchitects.de/ObjectArchitects/papers/Published/ZippedPapers/mappings04.pdf

[2] Scott W. Ambler, President, Ronin International, Mapping Objects To Relational Databases http://www.crionics.com/products/opensource/faq/docs/mappingObjects.pdf

[3] Scott W. Ambler, Mapping Objects to Relational Databases: O/R Mapping In Detail http://www.agiledata.org/essays/mappingObjects.html

[4] Wolfgang Keller, Object/Relational Access Layers - A Roadmap, Missing Links and More Patterns http://citeseer.ist.psu.edu/4328.html

[5] Joseph W. Yoder, Ralph E. Johnson, Quince D. Wilson, Connecting Business Objects to Relational Databases http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.7703&rep=rep1&type=pdf

[6] Michael R. Blaha, William J. Premerlani and James E. Rumbaugh, Relational Database Design using an Object-Oriented Methodology http://www.sims.monash.edu.au/subjects/ims2501/seminars/oomodelling.pdf

[7] http://www.cetus-links.org/oo_db_systems_3.html

[8] http://www.ibm.com/developerworks/library/ws-mapping-to-rdb/

[9] http://www.objectarchitects.de/ObjectArchitects/orpatterns/index.htm?MappingObjects2Tables/mapping_object.htm

[10] http://cantgrokwontgrok.blogspot.com/2009/03/tech-day-1-nhibernate.html

[11] http://www.adobe.com/newsletters/edge/october2008/articles/article2/index.html?trackingid=DWZST

[12] Joseph W. Yoder, Patterns for Making your Business Objects Persistent in a Relational Database http://www.joeyoder.com/Research/objectmappings/mappingobjectstordbms/ppframe.htm

[13] Anatoliy Doroshenko, Vlad Romanenko, Object-Relational Mapping Techniques for .Net Framework http://subs.emis.de/LNI/Proceedings/Proceedings48/GI.Band.48-6.pdf

[14] Paul Charsley, Persistent Object Storage in the File System: How it works and its use as a database teaching tool www.in-site.co.nz/misc_links/papers/charsley51.pdf

[15] Reese, G. (1997), Database Programming with JDBC and Java. O’Reilly. pp 80 – 82, 96 – 97.

[16] Scott W. Ambler, The Design of a Robust Persistence Layer For Relational Databases.

[17] Wolfgang Keller, Persistence Options for Object-Oriented Programs.