CSC/ECE 517 Fall 2010/ch6 6g SL
Latest revision as of 03:14, 23 November 2010
Document Object Model
The Document Object Model (DOM) is a language-independent and platform-independent interface that allows programs to interact with a document. Typically this interaction is with the objects of HTML, XHTML and XML documents. Through the DOM, a program can change the content, style and structure of such documents. DOM APIs are provided in different languages to change documents dynamically.[1]
DOM structure
DOM is an API for HTML, XHTML and XML documents. It represents each document as a tree, which a program can traverse to read the individual nodes. The tree mirrors the HTML/XML structure: each node represents an element, an attribute, text content or some other object. Each of these nodes implements the Node interface, which declares the methods and properties used to access or modify the tree. Below is a sample node tree structure.[2]
[Image: Node_Structure.png]
Different nodes in the above tree structure can be accessed using the properties provided by the Node interface. For example:
NodeA.firstChild is NodeA1
NodeA.childNodes[1] is NodeA2
NodeA.firstChild.firstChild is NodeA1a
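The same navigation can be tried outside a browser. The following is a minimal sketch using Python's standard-library xml.dom.minidom, one of several DOM implementations. The element names and the shape of the tree are assumptions chosen to match the NodeA example, not a copy of the figure.

```python
from xml.dom.minidom import parseString

# Hypothetical document mirroring the NodeA example. There is no whitespace
# between tags, so no extra text nodes appear among the children.
doc = parseString("<NodeA><NodeA1><NodeA1a/><NodeA1b/></NodeA1><NodeA2/></NodeA>")
node_a = doc.documentElement

first = node_a.firstChild                   # first child of NodeA
second = node_a.childNodes[1]               # second child of NodeA
grandchild = node_a.firstChild.firstChild   # first child of the first child

print(first.nodeName, second.nodeName, grandchild.nodeName)
```

Note that if the source text contained line breaks or indentation between the tags, those would show up as text nodes, and firstChild would then return a text node rather than an element.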
DOM provides methods to dynamically change the tree structure or node content. For example, the insertBefore(newElement, referenceElement) method, called on the parent node, inserts a new node newElement before referenceElement.
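As a rough illustration of insertBefore, the sketch below again uses Python's xml.dom.minidom. The document content and the variable names are assumptions made up for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical one-item list; the new item will be placed in front of it.
doc = parseString("<list><item>second</item></list>")
root = doc.documentElement
reference_element = root.firstChild

# Create a new element with a text child, then insert it before the reference.
new_element = doc.createElement("item")
new_element.appendChild(doc.createTextNode("first"))
root.insertBefore(new_element, reference_element)

# Collect the item texts in document order to check the result.
order = [item.firstChild.data for item in root.getElementsByTagName("item")]
print(order)
```

The method is called on the parent of the reference node (here root), which is why the parent has to be known before inserting.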
The DOM tree for an HTML document has document as its root. This root node implements the Node interface through the more specific Document interface, which also offers lookup methods such as document.getElementById() and document.getElementsByName() for accessing the elements in the tree.[3]
DOM in HTML, XML, JavaScript
The DOM treats HTML and XML documents as trees of objects. JavaScript uses the DOM to access or change these objects dynamically.
DOM in HTML
The DOM creates a tree structure for the HTML document. In the DOM tree, the <HTML> element is the root element, directly below the document node. Consider the following HTML snippet for a table:[4]
<table>
<tr>
<td>CSC517</td>
<td>Dr. Gehringer</td>
</tr>
<tr>
<td>CSC540</td>
<td>Dr. Ogan</td>
</tr>
</table>
The corresponding DOM tree for this is:
[Image: HTML_Tree_Structure.png]
Thus, for every HTML document, the DOM builds a tree which can later be accessed or modified easily.
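The shape of that tree can be printed with a short recursive walk. This is a sketch under two assumptions: the snippet is parsed with Python's xml.dom.minidom, which works here only because the table happens to be well-formed XML (real-world HTML generally needs an HTML parser), and outline is a hypothetical helper written for this example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

HTML_TABLE = """<table>
<tr><td>CSC517</td><td>Dr. Gehringer</td></tr>
<tr><td>CSC540</td><td>Dr. Ogan</td></tr>
</table>"""

def outline(node, depth=0):
    """Return indented lines for the element and non-blank text nodes under node."""
    lines = []
    if node.nodeType == Node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
    elif node.nodeType == Node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data.strip()))
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(parseString(HTML_TABLE).documentElement)
print("\n".join(tree_lines))
```

The walk skips text nodes that contain only whitespace, since the line breaks between the tags are also part of the tree.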
DOM in XML
The DOM also provides an interface which works with XML documents. Using the DOM, XML nodes can be accessed in any order (random access), and the contents or structure of the XML document can be changed dynamically. Consider the following XML example:[5]
<bookstore>
<book>
<title>Everyday Italian</title>
<author>Giada De Laurentiis</author>
</book>
<book>
<title>Harry Potter</title>
<author>J.K. Rowling</author>
</book>
</bookstore>
The corresponding DOM tree for this is:
[Image: XML_Tree_Structure.png]
Thus, the tree built by the DOM is used to access or modify the XML document.
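Both random access and modification can be sketched on the bookstore document with Python's xml.dom.minidom. The shortened author name is an arbitrary value chosen for the illustration.

```python
from xml.dom.minidom import parseString

BOOKSTORE = """<bookstore>
  <book><title>Everyday Italian</title><author>Giada De Laurentiis</author></book>
  <book><title>Harry Potter</title><author>J.K. Rowling</author></book>
</bookstore>"""

doc = parseString(BOOKSTORE)
books = doc.getElementsByTagName("book")

# Random access: go straight to the second book without visiting the first.
second_title = books[1].getElementsByTagName("title")[0].firstChild.data

# Modification: change the text of the first book's author in place.
books[0].getElementsByTagName("author")[0].firstChild.data = "G. De Laurentiis"

authors = [a.firstChild.data for a in doc.getElementsByTagName("author")]
print(second_title, authors)
```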
DOM in JavaScript
JavaScript uses the DOM to access or change HTML/XML objects dynamically. Consider the following example of a table in HTML:
<TABLE>
<TR>
<TD id='course1'>CSC517</TD>
<TD id='inst1'>Dr. Gehringer</TD>
</TR>
<TR>
<TD id='course2'>CSC540</TD>
<TD id='inst2'>Dr. Ogan</TD>
</TR>
</TABLE>
In JavaScript, the DOM can be used to change the content of the first cell from CSC517 to ECE517. The cell is accessed with the document.getElementById("course1") method. The innerHTML property allows the content of the corresponding node to be changed dynamically, so document.getElementById("course1").innerHTML = "ECE517"; changes the cell content from CSC517 to ECE517. Other methods, such as getElementsByTagName(), can also be used to access the node.
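The JavaScript statements above need a browser to run. As a stand-alone approximation, the sketch below performs the same change with Python's xml.dom.minidom, using getElementsByTagName to reach the cell. Two points are assumptions of this sketch rather than part of the browser API: find_by_id is a hypothetical helper that compares the id attribute by hand, and the cell text is replaced through the text node's data because minidom has no innerHTML property.

```python
from xml.dom.minidom import parseString

TABLE = """<TABLE>
<TR><TD id='course1'>CSC517</TD><TD id='inst1'>Dr. Gehringer</TD></TR>
<TR><TD id='course2'>CSC540</TD><TD id='inst2'>Dr. Ogan</TD></TR>
</TABLE>"""

def find_by_id(doc, tag, element_id):
    """Hypothetical helper: first <tag> element whose id attribute matches."""
    for element in doc.getElementsByTagName(tag):
        if element.getAttribute("id") == element_id:
            return element
    return None

doc = parseString(TABLE)
cell = find_by_id(doc, "TD", "course1")

# Rough counterpart of assigning plain text to innerHTML in a browser.
cell.firstChild.data = "ECE517"

print(cell.firstChild.data)
```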
Relation between DOM and other object models
At first, the DOM looks out of place when considered as an object model, because it is based on an object concept that differs from the traditional one associated with object-oriented languages like Java and C++. The term object model can be used in two different contexts.[6]
The first is associated with object-oriented languages like Java and C++. These object models are described in terms of inheritance, encapsulation, abstraction, etc. The second is one in which programs access and manipulate specific parts of their context through a collection of classes or objects. The Document Object Model fits this second meaning: the DOM is a collection of objects which represents an HTML/XHTML/XML document.[8]
Any object model has three key concepts:
- data structures that can be used to represent the object state
- behavior associated with the object state
- methods to access and operate on the object state
The name "Document Object Model" was chosen because documents are modeled using objects, and the model encompasses not only the structure of a document but also its behavior. In other words, the nodes in the DOM tree are not merely a data structure; they are objects which have functions and properties. As an object model, the DOM identifies:
- the interfaces and the objects used to represent and manipulate a document
- the semantics of these interfaces and objects - including both behavior and attributes
- the relationships and collaborations among the interfaces and objects
Thus, the DOM facilitates accessing and modifying the document through the properties and behavior of its objects.
Other Competing Solutions
DOM is a popular way of processing XML/HTML documents, but there are competing solutions such as the Simple API for XML (SAX) and the Java Architecture for XML Binding (JAXB).
Simple API for XML (SAX)
The Simple API for XML (SAX) is an event-driven mechanism which gives serial access to the elements of a document. SAX allows a document to be processed as it is being read, which avoids waiting for the whole document to be stored before taking any action. Being an event-based interface, the parser reports an event whenever it sees a tag, attribute, text node, unresolved external entity or other construct. Applications using SAX receive information from XML documents in a continuous stream, with no backtracking or navigation allowed. As a result, applications need not store the whole document. This makes SAX extremely efficient, allowing it to handle XML documents of nearly any size in linear time and near-constant memory. In return, the programmer has to write and attach the event handlers that process these events.
Applications using SAX can discard the portions of the document they do not need and keep only the small portion that is of interest. The near-constant memory usage enables SAX to handle very large documents. SAX is better suited to processing local information coming from nodes that are close to each other. Since SAX delivers the document to the application as a series of events, handling the global context of the document is difficult. For such operations, the application has to build its own data structure while parsing.
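The event-handler style can be sketched with Python's standard-library xml.sax module. The bookstore input and the TitleCollector class are assumptions made up for this illustration. The handler keeps only the titles and lets every other part of the document pass by.

```python
import xml.sax

BOOKSTORE = b"""<bookstore>
  <book><title>Everyday Italian</title><author>Giada De Laurentiis</author></book>
  <book><title>Harry Potter</title><author>J.K. Rowling</author></book>
</bookstore>"""

class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler that keeps only the text of <title> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Event: an opening tag was read.
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Event: character data was read; it may arrive in several chunks.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Event: a closing tag was read.
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

handler = TitleCollector()
xml.sax.parseString(BOOKSTORE, handler)
print(handler.titles)
```

The flag and buffer inside the handler are the "own data structure" an application has to maintain, because the parser itself does not remember which element it is currently inside.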
DOM vs SAX
Comparison between DOM and SAX[9].
Document Object Model (DOM) | Simple API for XML (SAX) |
---|---|
DOM is a tree based interface | SAX is an event driven interface |
Takes a significant amount of memory | More memory efficient |
Convenient for random access | Appropriate for addressing local information |
Can read and modify the document | Can only read the XML document |
Has to wait until the entire document tree is loaded into memory before doing any operation | Good for streaming applications, since the application can start processing from the beginning |
Java Architecture for XML Binding (JAXB)
The Java Architecture for XML Binding (JAXB) provides a fast and convenient way of working with XML content within Java applications by binding XML data to Java representations. The major functionalities JAXB provides are marshalling and unmarshalling. Marshalling is the process of building an XML document from Java objects. Unmarshalling refers to constructing Java objects from an XML document. With the help of these, Java developers can focus on the business logic rather than on XML processing.
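The idea of binding XML data to application objects can be illustrated without JAXB itself. The sketch below is a hand-written analogue in Python, not the JAXB API: the Book class and the two conversion functions are hypothetical names, and the mapping is coded by hand, whereas JAXB derives it from a schema or from annotated classes.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Book:
    """Hypothetical application class bound to a <book> element."""
    title: str
    author: str

def xml_to_object(xml_text):
    """Build a Book object from a <book> XML document."""
    root = ET.fromstring(xml_text)
    return Book(title=root.findtext("title"), author=root.findtext("author"))

def object_to_xml(book):
    """Build a <book> XML document from a Book object."""
    root = ET.Element("book")
    ET.SubElement(root, "title").text = book.title
    ET.SubElement(root, "author").text = book.author
    return ET.tostring(root, encoding="unicode")

book = xml_to_object(
    "<book><title>Harry Potter</title><author>J.K. Rowling</author></book>"
)
print(book.title)
print(object_to_xml(book))
```

Once the conversion functions exist, the rest of the application works with Book objects and their fields, and never walks an XML tree.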
DOM vs JAXB
Comparison between DOM and JAXB parser[7]
Document Object Model (DOM) | Java Architecture for XML Binding (JAXB) |
---|---|
Transformation of DOM tree to XML | Marshalling of Java Objects to XML |
Transformation of XML Document to DOM tree | Unmarshalling of XML data to Java Objects |
Not driven by a schema and transformation is done through XML serialization process | Driven by a schema as data-binding is used for the mapping between XML Documents and Java classes |
Application needs to know XML processing | Makes application focus on business logic rather than details of XML |
Needs to navigate through a tree to access data | Allows data to be accessed in non-sequential order without requiring serial navigation |
Memory intensive, as the entire document tree is held in memory, which makes it poorly suited to handling very large documents | Efficient memory usage, as the tree of content objects produced through JAXB tends to use less memory than DOM-based trees |
Used for parsing an XML document | Defines a binding between XML schema and corresponding object hierarchy |
References
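1. Document Object Model Introduction.
2. DOM Structure, http://www.brainjar.com/dhtml/intro/default.asp
3. DOM tree with document as root object, http://www.brainjar.com/dhtml/intro/default2.asp
4. Philippe Le Hégaret, Lauren Wood, Jonathan Robie, DOM in HTML, World Wide Web Consortium, 2000, http://www.w3.org/TR/DOM-Level-2-Core/introduction.html
5. DOM in XML, http://www.w3schools.com/dom/dom_nodes.asp
6. Object Models, http://en.wikipedia.org/wiki/Object_model
7. Comparison of JAXB with DOM, http://www.coderanch.com/t/220413/Web-Services/java/JAXB-DOM-SAX
8. Frank Manola, "Technologies for Web Object Model", IEEE Internet Computing, 1999
9. Chengkai Li, "XML Parsing, SAX/DOM", Encyclopedia of Database Systems, 2009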