Document Object Model

The Document Object Model (DOM) is a language-independent and platform-independent interface that allows programs to interact with a document. Typically the interaction is with the objects of HTML, XHTML and XML documents. Through the DOM, a program can change the content, style and structure of such documents. DOM APIs are available in many programming languages, so documents can be changed dynamically from any of them.^[1]

DOM structure

The DOM is a programming API for documents. A DOM implementation builds a tree for each document, and programs traverse that tree to read its nodes. The tree mirrors the HTML/XML structure: each node represents an element, an attribute, text content or some other object. Every node implements the Node interface, which declares the methods and properties used to traverse the tree and to read or modify node contents. Below is a sample node tree structure.^[2]

Nodes in the above tree structure can be accessed using properties provided by the Node interface, e.g.

NodeA.firstChild = NodeA1
NodeA.childNodes[1] = NodeA2
NodeA.firstChild.firstChild = NodeA1a
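The same navigation can be tried outside a browser. The following is a minimal sketch using Python's standard xml.dom.minidom module, which implements the Node properties used above. The XML string is an assumption: it is the smallest tree consistent with those three expressions, not necessarily the full tree of the original figure.

```python
from xml.dom.minidom import parseString

# Assumed sample tree; written without whitespace so that no extra
# text nodes appear between the elements.
doc = parseString("<NodeA><NodeA1><NodeA1a/></NodeA1><NodeA2/></NodeA>")
node_a = doc.documentElement

first = node_a.firstChild                    # first child of NodeA
second = node_a.childNodes[1]                # second child of NodeA
grandchild = node_a.firstChild.firstChild    # first child of the first child

print(first.nodeName, second.nodeName, grandchild.nodeName)

# Navigation also works upwards through parentNode.
print(grandchild.parentNode.nodeName)
```

Note that a pretty-printed document would contain whitespace text nodes between the elements, so firstChild would then return a text node rather than NodeA1.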

The Node interface also provides methods to change the tree structure or node content dynamically. For example, the insertBefore() method inserts a new node before a specified child node. The document is the root of the Document Object Model tree, and it also implements the Node interface. Using methods such as getElementById() and getElementsByName(), one can access elements anywhere in the tree directly, without traversing from the root.^[3]
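As a rough illustration of these methods, the sketch below inserts a node with insertBefore() and then looks elements up from the document root. It uses Python's xml.dom.minidom; the element names list and item are hypothetical, and getElementsByTagName() stands in for the lookup methods named above, since getElementsByName() belongs to HTML documents.

```python
from xml.dom.minidom import parseString

# Hypothetical one-item list, used only for illustration.
doc = parseString("<list><item>second</item></list>")
root = doc.documentElement
existing = root.firstChild

# Build a new <item> element with a text child.
new_item = doc.createElement("item")
new_item.appendChild(doc.createTextNode("first"))

# insertBefore(newChild, refChild) places newChild ahead of refChild.
root.insertBefore(new_item, existing)

# Look the elements up from the document root, in document order.
items = doc.getElementsByTagName("item")
texts = [item.firstChild.data for item in items]
print(texts)
print(root.toxml())
```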

DOM in HTML, XML, JavaScript

The DOM represents the objects in HTML or XML documents. In JavaScript, the DOM API is used to access or change these objects dynamically.

DOM in HTML

The DOM represents an HTML page as a tree in which the top-level <html> tag corresponds to the root element. Consider the following HTML markup for a table:^[4]

<table>
 <tr>
   <td>CSC517</td>
   <td>Dr. Gehringer</td>
 </tr>
 <tr>
   <td>CSC540</td>
   <td>Dr. Ogan</td>
 </tr>
</table>

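In the corresponding DOM tree, the table element contains two tr (row) elements, each of which contains two td (cell) elements whose text nodes hold the cell contents.

Thus, for every HTML page the DOM builds a tree that can later be accessed using the methods provided by the Node interface.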
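The tree for the table above can also be printed programmatically. The following sketch parses the markup with Python's xml.dom.minidom, which works here only because the snippet happens to be well-formed XML, and prints one indented line per node. The helper name outline is hypothetical. A browser's HTML parser may add nodes of its own (for example a tbody element), so this is an approximation of what a browser builds.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

HTML = """<table>
 <tr><td>CSC517</td><td>Dr. Gehringer</td></tr>
 <tr><td>CSC540</td><td>Dr. Ogan</td></tr>
</table>"""

def outline(node, depth=0):
    """Return one indented line per element or non-blank text node."""
    lines = []
    if node.nodeType == Node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == Node.TEXT_NODE and node.data.strip():
        # Whitespace-only text nodes (indentation) are skipped.
        lines.append("  " * depth + repr(node.data.strip()))
    return lines

tree = outline(parseString(HTML).documentElement)
print("\n".join(tree))
```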

DOM in XML

The DOM also provides an interface for working with XML. Using the DOM, one can access any XML node directly, and its methods allow the contents or structure of the XML document to be changed dynamically. Consider the following XML example:^[5]

<bookstore>
 <book>
   <title>Everyday Italian</title>
   <author>Giada De Laurentiis</author>
 </book>
 <book>
   <title>Harry Potter</title>
   <author>J.K. Rowling</author>
 </book>
</bookstore>

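The corresponding DOM tree has bookstore as its root element, with two book children, each of which holds a title element and an author element.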
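As a hedged illustration of reading and changing this document through the DOM, the sketch below uses Python's xml.dom.minidom. The helper text_of is hypothetical, and removing the second book is just one example of a structural change.

```python
from xml.dom.minidom import parseString

XML = """<bookstore>
 <book>
   <title>Everyday Italian</title>
   <author>Giada De Laurentiis</author>
 </book>
 <book>
   <title>Harry Potter</title>
   <author>J.K. Rowling</author>
 </book>
</bookstore>"""

doc = parseString(XML)

def text_of(parent, tag):
    """Text of the first <tag> element found under parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data

# Direct access: jump straight to every <book> element.
books = [(text_of(b, "title"), text_of(b, "author"))
         for b in doc.getElementsByTagName("book")]
print(books)

# Structural change: remove the second <book> from its parent.
bookstore = doc.documentElement
bookstore.removeChild(doc.getElementsByTagName("book")[1])
remaining = len(doc.getElementsByTagName("book"))
print(remaining)
```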

DOM in JavaScript

JavaScript uses the DOM to access or change HTML/XML objects dynamically. Consider the following example of a table in HTML:

CSC517	Dr. Gehringer
CSC540	Dr. Ogan

To change the content of the first cell from CSC517 to ECE517 in JavaScript, the DOM is used. The cell can be reached with getElementById(), here assuming the cell carries the id course1, and the innerHTML property allows the content of the corresponding node to be changed. So document.getElementById("course1").innerHTML="ECE517"; changes the cell content from CSC517 to ECE517. Other methods, such as getElementsByTagName(), can also be used to access the node.
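The same change can be sketched outside a browser with Python's xml.dom.minidom. This is not the JavaScript API: minidom has no innerHTML property, so the sketch finds the cell with getElementsByTagName() and rewrites its text node instead. The id attribute on the first cell is an assumption taken from the getElementById("course1") call above.

```python
from xml.dom.minidom import parseString

# Assumed markup: the first cell is given id="course1".
doc = parseString(
    "<table>"
    "<tr><td id='course1'>CSC517</td><td>Dr. Gehringer</td></tr>"
    "<tr><td>CSC540</td><td>Dr. Ogan</td></tr>"
    "</table>"
)

# Locate the cell by checking the id attribute of every <td>.
cell = next(td for td in doc.getElementsByTagName("td")
            if td.getAttribute("id") == "course1")

# The cell's content lives in its child text node.
cell.firstChild.data = "ECE517"
print(cell.toxml())
```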

Relation between DOM and other object models

The DOM is one kind of object model. At first it may seem out of place to call the DOM an object model, but it is based on the object concept, although that concept differs from the one associated with object-oriented languages such as Java and C++. The term object model is used in two different contexts^[6]. The first is associated with object-oriented languages like Java and C++, where object models are described in terms of inheritance, encapsulation, abstraction and so on. The second refers to a collection of classes or objects through which programs can access and manipulate specific parts of their context. The Document Object Model fits this second sense: it is a collection of objects that represents an HTML/XHTML/XML document.

Any object model has three key concepts:

data structures that can be used to represent the object state
ways to associate behaviour with the object state
ways for the object methods to access and operate on that state

The name "Document Object Model" was chosen because it is an "object model" in the traditional object oriented design sense: documents are modeled using objects, and the model encompasses not only the structure of a document, but also the behavior of a document and the objects of which it is composed. In other words, the nodes in the above diagram do not represent a data structure, they represent objects, which have functions and identity. As an object model, the DOM identifies:

the interfaces and objects used to represent and manipulate a document
the semantics of these interfaces and objects - including both behavior and attributes
the relationships and collaborations among the interfaces and objects

Thus the DOM is based on the object concept, as it uses the methods and properties of objects to access or modify documents.
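A small sketch may make the object view concrete. It uses Python's xml.dom.minidom and a hypothetical lang attribute to show a node's state, its behaviour, its identity and its place in the interface hierarchy.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# The lang attribute is a made-up example value.
doc = parseString("<book lang='it'><title>Everyday Italian</title></book>")
book = doc.documentElement

# State: data held by the node object.
print(book.tagName, book.getAttribute("lang"))

# Behaviour: methods that operate on that state.
book.setAttribute("lang", "en")
print(book.getAttribute("lang"))

# Identity: going down to the child and back up reaches the same object.
title = book.firstChild
same_object = title.parentNode is book
print(same_object)

# Relationships among interfaces: an element is also a Node.
is_node = isinstance(book, Node)
print(is_node)
```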

Other Competing Solutions

The DOM is a popular way of processing XML/HTML documents, but there are competing solutions. The Simple API for XML (SAX) and the Java Architecture for XML Binding (JAXB) are two other popular ones.

Simple API for XML (SAX)

The Simple API for XML (SAX) is an event-driven, serial-access mechanism that processes a document element by element. SAX allows a document to be processed as it is being read, which avoids the need to wait for all of it to be stored before taking action. Being an event-based interface, the parser reports an event whenever it sees a tag, attribute, text node, unresolved external entity or other construct, and the programmer attaches “event handlers” to handle these events. SAX is a streaming interface: applications receive information from XML documents in a continuous stream, with no backtracking or navigation allowed. This approach makes SAX extremely efficient, handling XML documents of nearly any size in linear time and near-constant memory, but it also places greater demands on the software developer's skills.

SAX is memory efficient because the application can discard the portions of the document it does not need and keep only the small portion that is of interest. A SAX parser can achieve constant memory usage and thus easily handle very large documents. SAX is therefore best suited to processing local information, that is, information coming from nodes that are close to each other. Since SAX delivers the document to the application as a series of events, it is difficult for the application to perform global operations across the document. For such complex operations, the application has to build its own data structure to store the document information.
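The event-handler style described above can be sketched with Python's standard xml.sax module. The handler class TitleHandler is hypothetical; it keeps only the title text and lets everything else stream past, so no tree is ever built.

```python
import xml.sax

class TitleHandler(xml.sax.ContentHandler):
    """Collects the text of every <title> element as events arrive."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is buffered.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

XML = b"""<bookstore>
 <book><title>Everyday Italian</title><author>Giada De Laurentiis</author></book>
 <book><title>Harry Potter</title><author>J.K. Rowling</author></book>
</bookstore>"""

handler = TitleHandler()
xml.sax.parseString(XML, handler)
print(handler.titles)
```

The handler has to track its own position in the document (the _in_title flag), which is the kind of bookkeeping a DOM tree would otherwise provide.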

DOM vs SAX

Comparison between DOM and SAX

Document Object Model (DOM)	Simple API for XML (SAX)
DOM is a tree-based interface	SAX is an event-driven interface
Takes a significant amount of memory	More memory efficient
Convenient for random access	Appropriate for processing local information
Can read and modify the document	Can only read the XML document
Has to wait until the entire document tree is loaded into memory before doing any operation	Good for streaming applications, since the application can start processing from the beginning of the document

Java Architecture for XML Binding (JAXB)

The Java Architecture for XML Binding (JAXB) provides a fast and convenient way of working with XML content within Java applications. It binds XML data to Java representations. The major functions JAXB provides are marshalling and unmarshalling. Marshalling is the process of building up an XML document from Java objects. Unmarshalling refers to constructing Java objects from an XML document. With the help of these, Java developers can focus on the business logic rather than on XML processing.
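JAXB itself is a Java API, so the following is only an analogy, not JAXB code. It is a hand-written Python sketch of the two directions of data binding: unmarshal turns XML into plain objects and marshal turns those objects back into XML. The Book class and both function names are hypothetical, and real JAXB derives the mapping from a schema or annotations rather than from hand-written code.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Book:
    title: str
    author: str

def unmarshal(xml_text):
    """XML document -> list of Book objects."""
    root = ET.fromstring(xml_text)
    return [Book(b.findtext("title"), b.findtext("author"))
            for b in root.findall("book")]

def marshal(books):
    """List of Book objects -> XML document text."""
    root = ET.Element("bookstore")
    for book in books:
        node = ET.SubElement(root, "book")
        ET.SubElement(node, "title").text = book.title
        ET.SubElement(node, "author").text = book.author
    return ET.tostring(root, encoding="unicode")

source = ("<bookstore><book><title>Harry Potter</title>"
          "<author>J.K. Rowling</author></book></bookstore>")
books = unmarshal(source)
print(books[0].title)        # application code works with plain objects
xml_text = marshal(books)
print(xml_text)
```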

DOM vs JAXB

Comparison between DOM and JAXB

Document Object Model (DOM)	Java Architecture for XML Binding (JAXB)
Transformation of a DOM tree to XML	Marshalling of Java objects to XML
Transformation of an XML document to a DOM tree	Unmarshalling of XML data to Java objects
Not driven by a schema; transformation is done through the XML serialization process	Driven by a schema, as data binding is used for the mapping between XML documents and Java classes
Application needs to know XML processing	Lets the application focus on business logic rather than the details of XML
Needs to navigate through a tree to access data	Allows data to be accessed in non-sequential order without serial navigation
Memory intensive, as the entire document tree is held in memory, making it unsuitable for very large documents	The tree of content objects produced through JAXB tends to use memory more efficiently than DOM-based trees
Used for parsing an XML document	Defines a binding between an XML schema and the corresponding object hierarchy

CSC/ECE 517 Fall 2010/ch6 6g SL
