CSC/ECE 517 Fall 2010/ch6 6g ss: Difference between revisions

From Expertiza_Wiki




==Introduction==


Browser [http://en.wikipedia.org/wiki/Scripting_language scripting languages] are widely used today; JavaScript is one example. A script lets the user interact with a web page by accessing the page's component parts. The interface used to access these components is called the Document Object Model (DOM). It creates objects corresponding to the components of the web page, and scripting languages interact with these objects to manipulate the page.
Prototype based programming is an [http://en.wikipedia.org/wiki/Object-oriented_programming object oriented programming] methodology. It differs from the other classes of object oriented languages due to its main characteristic feature that no classes are present in the language constructs. Behavior reuse or inheritance is performed by cloning some instance or prototype objects. Thus, it is also known as 'instance-based programming' or 'class-less based programming' and such languages are known as 'prototype based languages'.
===What is DOM===
Information in [http://en.wikipedia.org/wiki/XML XML] and [http://en.wikipedia.org/wiki/Html HTML] documents is presented in a structured format. The interface used to access these documents is called the Document Object Model (DOM). DOM is a platform- and language-independent interface that allows programs to create and modify the content, structure and style of documents. The changes can then be incorporated back into the document [2][4][5]. DOM is based on object-oriented concepts:


Objects used in prototype based programming are referred to as ''prototypes'', and they resemble instances of a class in [http://en.wikipedia.org/wiki/Class-based_programming class-based programming]. A prototype differs from a class instance in that one can add or remove variables and methods at any level of a single object, without worrying whether the object still fits in the system. Typically, new objects are created by copying existing objects, which is called ''cloning''.
Objects are the central building blocks of this representation. An object encapsulates a set of methods and interfaces: methods are used to access or change the object's state, and an interface is the declaration of a set of methods.


Object oriented programming languages are classified based on the following criteria:
===What DOM isn't===
* Class based : Object instances are created from the skeleton or blueprint of the class.
1) It does not describe how to persist objects into XML or HTML, but it represents documents as a collection of objects [2].
* Prototype based : Object instances are created from concrete examples and other objects are created with such a prototype reference.
* [http://en.wikipedia.org/wiki/Automata-based_programming Automata based] : The program is thought of as a model of a finite state machine or any other formal automata.
* Based on separation of concerns e.g. aspect oriented languages


The figure below shows where prototype based languages fit in the broad classification of object oriented languages.
2) DOM does not describe any internal representation of objects or data. It is not a specific set of data structures. For example, even though there are parent-child relationships between the nodes, they are logical relationships between objects.
<center>
[[Image:oops.jpg]]
</center>


Since prototype based languages do not distinguish between classes and instances, there is only the concept of objects. There are three types of objects: normal objects, prototypes and traits. Prototypes and traits are not real objects, but the term object is loosely used to signify their specialization feature. Normal objects are used to represent the local state. Prototypes are blueprints used to generate clones. Traits, if used in the programming language, are abstract descriptions of objects that do not possess a state, i.e. they only provide a set of methods. Traits are never cloned or used directly and are intended only to be used as the shared parents of normal objects. [1]
3) In addition, DOM does not describe what information in a document is relevant or how that information is structured. The structuring of the document is described by the XML Infoset. DOM is just an API to access this information [3].


==History==
* Eleanor Rosch first introduced the 'Prototype Theory' in the mid-1970s. Rosch proposed that to classify objects, one can match them against a "prototype", i.e. an ideal example which exhibits the most representative features of the category of objects. This led to the concept of frame based languages, which applied the prototype theory to knowledge representation. Here, frames were used to represent knowledge such as typical values, default values, or exceptions.
Before DOM became an official W3C specification [5], the web community had started to develop ways of creating page-based script programs. This early work led to the development of [http://en.wikipedia.org/wiki/Dynamic_HTML Dynamic HTML], which was primarily used for richer user interfaces than plain HTML allowed. At the time, Netscape Navigator was a very popular browser, and it was also the first to bring out a programming language that would allow web pages to become interactive. The language, called LiveScript, was designed at Netscape Communications and came integrated into the Netscape Navigator browser. LiveScript, now known as JavaScript, is the most widely used browser scripting language today. It played a significant role in the development of the present DOM. The timeline for DOM evolution is as follows:
* The concept of prototypes was later picked up in programming languages in the 1980s when Borning proposed the description of a classless language. The underlying feature of this language was that new objects were essentially being produced by copying and modifying prototypes. The classless model of object oriented programming thus brought forward several issues related to class based programming and proposed various solutions based on the use of prototypes. [1][7]


==Characteristics==
*JavaScript:[6] In December 1995, LiveScript was renamed [http://en.wikipedia.org/wiki/JavaScript JavaScript] and released as part of Netscape Navigator 2.0.
In prototype based languages, a prototype is chosen from concrete examples rather than by abstracting out common attributes. This offers a programming model with fewer primitives and a less restrictive knowledge representation model in which objects are not coupled too tightly. Most popular object-oriented programming languages are class based, where all objects of a certain class share the same properties. At first the prototype paradigm can be harder to comprehend than class-based languages such as Java and C++, which base their structure on two distinct entities: classes and instances. We compare the features of prototype based languages with those of class-based ones to understand their programming model and advantages.
*JScript and VBScript: Internet Explorer was updated to support two integrated languages, VBScript and JScript. JScript was very similar to JavaScript. In July 1996, Microsoft released Internet Explorer 3.0 supporting these languages.
*ECMAScript:[7] Netscape delivered JavaScript to Ecma International for standardization and the work on the specification, [http://www.ecma-international.org/publications/standards/Ecma-262.htm ECMA-262], began in November 1996.
*In June 1997, ECMA adopted a hybrid version of the scripting languages called ECMAScript. However, ECMAScript arrived late for the 4.0 releases of Netscape Navigator and Internet Explorer. Each introduced their own document object model, DHTML and dHTML, that came to be called Dynamic HTML.
*DOM: At the beginning of 1997, the companies involved in this consortium (including Netscape Communications and Microsoft) decided to find a consensus around their object models to access and manipulate documents. While trying to stay as backward compatible as possible with the original browser object models, the W3C's Document Object Model (DOM) provided a better object representation of HTML documents.
*DOM on Client:[3]  The Internet Explorer 6 and Netscape Navigator 6 browsers have achieved complete support for DOM Level 1. Other browser vendors followed them.
*DOM on Server:[3] Companies soon realized that the benefits of DOM apply on the server side as well as the client side, and DOM implementations began appearing in server-side products. One well-known example is the [http://xerces.apache.org/xerces-c/ Xerces] parser from the Apache Software Foundation.
* HTML and XML:[1] In 1996, a new markup language, the Extensible Markup Language (XML), was developed in the W3C as well. Meant to remove the HTML language's extensibility restrictions, the idea of developing an object model for XML quickly became another goal of the DOM effort.
*In June 2004,[5] Ecma International published ECMA-357 standard, defining an extension to ECMAScript, known as [http://en.wikipedia.org/wiki/E4X E4X] (ECMAScript for XML).


'''Comparison between prototype-based languages and class-based languages'''
==Levels of DOM==
{| class="wikitable" border="1" style="width: 900px; margin-left:60px;"
DOM has three levels, each with its own specifications. Generally, developers prefer DOM Level 2 because it is widely supported by Internet Explorer (IE) and Firefox. Here is a brief description of each level.
|- style="text-align: center; "
 
! Feature
===Level 1 ===
! Class Based Programming e.g. Java, C#
 
! Prototype-Based Programming e.g. JavaScript, Self
This was the first version of the standard, published in 1998 and later updated in 2000.
|-
 
| Object Model
It consists of two major parts [3]:
| Based on the class and instance entity model i.e. structure of object is defined by classes.
 
| Every object is an instance, there are no classes.
'''DOM Core''' : Deals with the internal structure of the nodes of the tree. It also provides properties and methods to create, edit and manipulate the tree.
|-
 
| Object definition and creation
'''DOM HTML''' : Objects, properties and methods for specific elements associated with HTML documents and tags.
| Class defined with explicit class definition; class instantiated with help of constructors.
 
| Creates objects by assigning an object as the prototype where each prototype is associated with a constructor function.
This Level allowed basic document operations, such as creating
|-
and deleting nodes, working with their contents, and traversing the document tree.  
| Abstraction
The rest of the operations, such as loading and saving documents, were left to the developer
| Uses abstract definitions/representation.
community. So a lot of this code was implementation dependent.
| Does not use abstract definitions.
 
|-
===Level 2 ===
| Parental Dependency
 
| The parent of an object cannot be altered
This level of DOM adds interfaces to the existing functionality of DOM1. It was first proposed in 2000 and the HTML specification was updated in 2003. It has six different specifications [1][3]:
| The parent of an object can be changed at runtime
 
|-
'''DOM2 Core''' : Adds namespace-specific methods to DOM Core and gives control over the structure of DOM elements.
| Inheritance
 
| Class definitions are used to define subclasses and the subclasses inherit properties by following the class chain.
'''DOM2 HTML''' : Similar to DOM HTML
| Objects inherit properties based on the hierarchy of the prototype chain.
 
|-
'''DOM2 Events''' : Control of mouse-related events such as targeting, capturing and canceling. This does not handle keyboard-related events.
| Dynamic Structure
| The class structure is static and contains all the properties that an instance of a class can have. One cannot add properties dynamically at run time.
| The prototype or constructor function specifies the initial set of properties, and one can dynamically add or remove properties of a particular object or of the entire set of objects.
|-
|}
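
The "Dynamic Structure" and "Parental Dependency" rows of the table can be sketched in JavaScript as follows. This is only an illustration: the objects <code>vehicle</code>, <code>boat</code> and <code>car</code> are hypothetical names chosen for this example, and the sketch assumes an environment that provides <code>Object.create</code> and <code>Object.setPrototypeOf</code>.

```javascript
// Hypothetical objects, used only to illustrate the table rows above.
const vehicle = {
  wheels: 4,
  describe() { return "vehicle with " + this.wheels + " wheels"; }
};
const boat = {
  describe() { return "boat"; }
};

// Create an object whose parent (prototype) is `vehicle`.
const car = Object.create(vehicle);

// Dynamic structure: a property is added and removed at run time.
car.licencePlate = "NC765";
const hadPlate = "licencePlate" in car;
delete car.licencePlate;
const hasPlate = "licencePlate" in car;

// Parental dependency: the parent of `car` is replaced at run time,
// so the same call now resolves to a different method.
const before = car.describe();
Object.setPrototypeOf(car, boat);
const after = car.describe();

console.log(hadPlate, hasPlate, before, after);
```

A class-based language such as Java would need a new class definition and a recompile to achieve either change.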


Thus, prototype based languages can represent knowledge in ways that differ from class based languages. Some of these representations are difficult to express in class based languages. Following are some examples:
'''DOM2 STYLE''' : Gives access to CSS related styling and rules. Also known as DOM2 CSS


# One can have different objects of the same family but having different structures and behaviors
'''DOM2 Traversal and Range''' : Iterative access to DOM so that navigating and manipulating the document is easy.
# Objects with totally exceptional behavior
# Objects with view points and sharing at the object level
# Incomplete state objects


Also, in the real world, entities or objects can be viewed as sets or prototypes. Sometimes modeling these architectural features in class-based languages exhibits certain limitations as they are rigid in terms of defining concepts with the idea of shared properties. Some real world examples where prototype based approach is preferred to class based one are traffic jam representation systems, greenhouse effect representation systems. [2]
'''DOM2 VIEWS''' : Ability to access and update a representation of a document.


==Classification of Prototype Based Languages==
This level added support for more types of document operations and content,
such as events and styles. As a result, developers had standard ways to handle user events
and also work with Cascading Style Sheets (CSS) information in the documents.


It is very challenging to classify the set of prototype based languages due to the variations in implementation models for each language. However, these are needed at some level to adjudge which programming language model best suits the given needs. We classify these based on their ''primitive semantics'':
===Level 3 ===


===Slots or methods and variables===
This level was drafted in 2003 and 2004. It consists of five different specifications [3]
A property is a binding of a name to a value within an object and each object is defined by a set of properties. Properties can be of two kinds: attributes or methods. In general programming terminology one can represent these properties of objects in two ways:
1. By separating attributes and methods .
2. By amalgamating attributes and methods into something known as "slots".


Thus, slots hold data values as well as methods. Some languages differentiate between methods and variables, while others treat them in the same way. [http://en.wikipedia.org/wiki/Self_%28programming_language%29 ''Self''] treats methods and variables the same, which allows an attribute to be overridden with a method and vice versa. In prototype based languages one can add or change not only data but also methods. For this reason, most prototype-based languages refer to both data and methods as "slots". Slots are defined to answer ''messages''. Either data or code can be found in a slot as the result of sending a message. Data is just returned, and code is actually executed. If a prototype does not contain a requested slot or does not know how to answer a specific message, it may delegate the request to another prototype through an inheritance relation, which is discussed next. [8]
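
JavaScript is not Self, but an accessor property gives a rough approximation of a slot that can hold either data or code behind the same access syntax. The sketch below is an assumption-laden illustration, with the hypothetical objects <code>point</code> and <code>livePoint</code>; it overrides a stored attribute with computed code in a child object.

```javascript
// `length` is stored as plain data in the parent object.
const point = { x: 3, y: 4, length: 5 };

// The child overrides the data slot with code (a getter).
// Callers still write `obj.length` in both cases.
const livePoint = Object.create(point);
Object.defineProperty(livePoint, "length", {
  get() { return Math.sqrt(this.x * this.x + this.y * this.y); }
});

// These assignments create local x and y on the child only.
livePoint.x = 6;
livePoint.y = 8;

console.log(point.length);     // data is just returned
console.log(livePoint.length); // code is executed on access
```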
'''DOM3 Core''' : Adds more methods and changes to existing core


'''DOM3 Load and Save''' : Provides two major functions:
1) loading the content of an XML document into a DOM document
2) serializing a DOM document into an XML document


<center>
'''DOM3 Validation''' : Checks whether a dynamic document is valid and conforms to its DOCTYPE.
[[Image:classification_ms.jpg]]
</center>


===Object Creation===
'''DOM3 Events''' : handles keyboard and mouse related events.
Objects can be created in two ways:
<br/>1. Ex nihilo (from scratch)
<br/>The systems using this model provide a special syntax for specifying the properties and behaviors of the newly created objects. These do not reference the existing objects for object creation. To implement this most programming languages use the ''Object'' prototype which contains all the common attributes and behaviors. It acts like a master blueprint for all other objects.
<br/>2. From existing object (by cloning or extending)
<br/>In this approach the new object carries all the qualities of the original. Some languages require the child object to maintain an explicit link to its prototype, so that changes in the prototype are reflected in its clones. Other systems ensure that changes in cloned objects do not automatically propagate across descendants.
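
Both creation styles can be sketched in JavaScript. The example below is a minimal illustration, not a definitive pattern: <code>baseWindow</code> and <code>linkedWindow</code> are hypothetical names, the object literal stands in for ex nihilo creation, and <code>Object.create</code> stands in for creation from an existing object with an explicit link to the prototype.

```javascript
// 1. Ex nihilo: the literal syntax describes the object from scratch.
const baseWindow = { title: "Main", width: 640 };

// 2. From an existing object: the new object keeps a link to its prototype.
const linkedWindow = Object.create(baseWindow);
linkedWindow.title = "Dialog"; // local change, stays in the child

// A later change in the prototype is visible through the linked child.
baseWindow.width = 800;

console.log(linkedWindow.title, linkedWindow.width);
console.log(Object.getPrototypeOf(linkedWindow) === baseWindow);
```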


===Inheritance and Sharing===
'''DOM3 XPath''' : Uses XPath features to traverse the document.
Since the prototype is used to create other objects by cloning, its properties can be read through all objects cloned from it. These properties are actually a single shared copy, so a read retrieves the shared value from the prototype object. A cloned object can then set the value of one of these properties. This action creates a new property on that object, which 'shadows' or 'hides' the property of the prototype object. In some languages clones are not affected by changes in their parent's behavior, which protects objects from unexpected changes.
Languages are classified by how the primitives for writing to a shared property, creating a new property and passing messages affect the prototypes.
e.g. JavaScript and Kevo handle object sharing differently [3].
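
The shadowing behavior described above can be observed directly in JavaScript. The following is a small sketch; the objects <code>vehicleProto</code>, <code>car</code> and <code>truck</code> and their values are hypothetical.

```javascript
const vehicleProto = { maxSpeed: 50 };
const car = Object.create(vehicleProto);
const truck = Object.create(vehicleProto);

// Reading goes to the single shared copy in the prototype.
const sharedRead = car.maxSpeed;

// Writing creates a new property on `car` that shadows the shared one.
car.maxSpeed = 75;
const ownAfterWrite = Object.prototype.hasOwnProperty.call(car, "maxSpeed");

// Removing the local property reveals the shared value again.
const shadowedRead = car.maxSpeed;
delete car.maxSpeed;
const revealedRead = car.maxSpeed;

console.log(sharedRead, shadowedRead, revealedRead, truck.maxSpeed);
```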


===Delegation===
This level focused on the functionality left out in Level 2 such as Load and Save, etc. Also, support for XPath provides easier navigation through the document. DOM Level 3 introduced two new data types
Delegation is a language feature that ‘delegates’ or ‘hands over’ a task to another object, based on method lookups in the hierarchy at runtime. Thus, an object can use another object's behavior. This sounds a lot like inheritance, but inheritance dispatches methods based on the ‘type’ of an object at compile time, whereas delegation works on instances. Thus, delegation can be used to support inheritance of methods in a prototype based system.
1) DOMObject :  Represents an arbitrary memory object
2) DOMUserData : Represents an interface to an arbitrary block of data in a DOM application.


Delegation is of two types:
It includes abstract schema modules which support Document Type Definition (DTD) files, XML
<br/>1. Value Sharing: A type of sharing between representation of different entities. It is the ability of an object to share the values of a parent when an object is made using clone.
Schemas, and other nonspecific schema types. Another feature of DOM Level 3 is its error handling interfaces.
<br/>2. Property Sharing: A type of sharing between viewpoints on the same entity. It is the ability of an object to share all or some of its properties with another object.


===Concatenation===
==Navigation==
Concatenation based prototype inheritance or concatenative-prototyping is another style to achieve inheritance. Instead of using traditional inheritance and delegation methods it uses incremental modification i.e. it allows duplicating existing objects and also flexibly editing them. This involves 2 operations 'new' and 'clone'
DOM represents a document using a tree structure. This tree can be thought of as a collection of individual subtrees.
e.g. For creation of an object ColorWindow, the object creation would involve copying the existing 'Window' object and then adding new properties to the created copy using another module function say 'add'. Concatenation is achieved by late binding i.e. inherited properties that are defined earlier can be modified in an incremental fashion at run-time. There are also no links from the object to the prototype from which it is cloned. The advantage of this is that there are no side-effects of modifications on an object across other clones from the parent. The drawback is that if one needs to propagate these changes it involves more overhead. Besides, there is also memory wastage as there is no shared implementation. To avoid the issue of memory an additional field that recognizes if a parent is cloneable or not can be introduced. An example of a prototype language that uses concatenation is Kevo. [9]
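
Kevo itself is not shown here; the following JavaScript sketch only imitates the concatenation style under stated assumptions. The helpers <code>clone</code> and <code>add</code> are hypothetical names modeled on the operations mentioned above, and <code>baseWindow</code> and <code>colorWindow</code> are illustrative objects.

```javascript
// Hypothetical helpers imitating concatenation-based prototyping.
// clone: copy every property into a fresh object, keeping no link to the source.
function clone(source) {
  return Object.assign({}, source);
}
// add: incrementally extend an object with further properties.
function add(target, extra) {
  return Object.assign(target, extra);
}

const baseWindow = { width: 640, height: 480 };
const colorWindow = add(clone(baseWindow), { color: "blue" });

// No side effects in either direction, because nothing is shared.
baseWindow.width = 800;
colorWindow.height = 600;

console.log(colorWindow.width, baseWindow.height, "color" in baseWindow);
```

The cost noted above is visible here too: every clone carries its own copy of <code>width</code> and <code>height</code>, and propagating a change means updating each copy.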
The following example shows a simple HTML page and the corresponding DOM.


'''Concatenation Vs Delegation'''


<div >
[[Image:domExample.jpg]]
<div style="float:left;margin-left=15%">
[[Image:Delegation.jpg]]</div>
<div style="margin-left=15%">
<table border="1px" style="border-style:solid; border-width:thin; border-width:1px;">
<tr> <td> </td>
<td> Concatenation </td>
<td> Delegation</td>
</tr>
<tr>
  <td>read Car.model</td>
    <td>Mitsubishi</td>
      <td>Mitsubishi</td>
</tr>
<tr>
  <td>read Car.maxSpeed</td>
    <td>110</td>
      <td>110</td>
</tr>
<tr>
  <td>read Car.licencePlate</td>
    <td>NC765</td>
      <td>NC765</td>
</tr>
<tr>
  <td colspan="3" align="center">write Car.maxSpeed = 75</td>
  </tr>
<tr>
  <td>read Vehicle.maxSpeed</td>
    <td>75</td>
      <td>50</td>
</tr>
<tr>
  <td>read Truck.maxSpeed</td>
    <td>90</td>
      <td>90</td>
</tr>
<tr>
  <td>read Truck.model</td>
    <td>4 Wheel Drive</td>
      <td>4 Wheel Drive</td>
</tr>
<tr>
  <td>read Truck.licencePlate</td>
    <td>NCXXX</td>
      <td>NCXXX</td>
</tr>
<tr>
  <td colspan="3" align="center">write Truck.model = Volvo</td>
  </tr>
<tr>
  <td>read Truck.model</td>
  <td>Volvo</td>
  <td>Volvo</td>
</tr>
</table>
</div>


The <BODY> branch can be thought of as its own tree within the larger document tree. Each box in Figure 1 is called a node in DOM terminology. The DOM API provides several interfaces in the Core module for manipulating nodes and for inserting information into and extracting it from nodes. There are also methods for determining and changing the relationships between nodes and for discovering the type of a particular node. Let's first have a look at the different object types supported by DOM.


===Primitives of virtual machine===
====Core Node Object====
The semantics of the virtual machine primitives underlying each language also distinguish the various types of programming languages.
A node is an object representation of a particular element in the document's context. Nodes have special names, depending on where they are located in the document tree and what their position is relative to other nodes.


Another parallel and overlapping classification of prototype based languages is based on group oriented constructions i.e. classification according to the level of abstractness of the constructions. Following are the high level groups:
All documents have a root node.
<br/>1. Purely prototype based: Self, Kevo, Taivalsaari, [http://en.wikipedia.org/wiki/Agora_%28programming_language%29 Agora], Omega, [http://en.wikipedia.org/wiki/Obliq Obliq], [http://en.wikipedia.org/wiki/NewtonScript NewtonScript]
<br/>2. Not Strictly prototype based: [http://en.wikipedia.org/wiki/Object_Lisp Object-Lisp], Yafool
<br/>3. Languages mixing prototypes and classes.


==Programming Constructs==
Each element in Figure 1 extends the Node object. A few properties of the Node object are therefore available to all DOM objects that extend it [3][8].
Prototype based languages are varied in terms of their implementation patterns. We will consider the features of [http://en.wikipedia.org/wiki/JavaScript JavaScript], one of the most popular prototype based languages, for deeper insight into the typical implementation of programming constructs in prototyping.


===Creating Objects===
'''Types of Node Objects'''
As mentioned earlier, object properties can be either methods or attributes. When creating a new object, a constructor function is used, which typically assigns the initial values to the properties of the object.
{| class="wikitable" border="1" style=" margin-left:60px; border-style:solid;border:thin;border-spacing:0x;"
e.g.
|- style="text-align: center;border-style:solid;border:thin;border-spacing:0x; "
  function car(model, licencePlate, maxSpeed) {
! Node Type
    this.model = model
! Description
    this.licencePlate = licencePlate
! nodeName
    this.maxSpeed = maxSpeed
! nodeValue
  }
! attributes
  var car1 = new car("Mitsubishi", "NC765", "135") // JavaScript creates the object with the new keyword
|-
| Element
On creation the new object will have the properties:
| Represents an element in an HTML or XML document, which is represented by tags. In Figure 1, &lt;BODY&gt; node is an element type
  car1.model        // value="Mitsubishi"
| Tag name
  car1.licencePlate // value="NC765"
| null
  car1.maxSpeed    // value="135"
| NamedNodeMap
In prototype based languages one can add customized properties to any object [10]. No special declaration is needed: choosing a name for the property and assigning to it is enough to add the property. We can illustrate this using the above example. Suppose we add a new property to the constructor's prototype property as shown below:
|-
| Attribute
| Represents an element's attribute. A tag with the syntax &lt;img src = "picture.jpg"&gt; has an Attribute node src.
| Attribute name
| Attribute value
| null
|-
| Text
| Represents the textual content of an element. In Figure 1, the &lt;B&gt; node has a Text child node which represents the text "Bold Text"
| #text
| Text content
| null
|-
| CDATASection
| Represents a character data (CDATA) section in an XML document
| #cdata-section
| CDATASection content
| null
|-
| Comment
| Represents a comment. Comments are of the form &lt;!-- commented text --&gt;
| #comment
| Comment text
| null
|-
| Document
| Represents the root node of a document
| #document
| null
| null
|-
| DocumentType
| Each Document node has  a DocumentType node that provides a list of entities defined in the document
| Name of document type
| null
| null
|-
| DocumentFragment
| They are lightweight or minimal Document nodes. They help when we want to extract just a portion of a document for processing
| #document-fragment
| null
| null
|-
| ProcessingInstruction
| Represents instructions to be used by the document processor
| Target name
| Content excluding the target
| null
|-
| Entity
| Represents an entity in XML document
| Entity name
| null
| null
|-
| EntityReference
| Represents an entity reference in the document
| Name of referenced entity
| null
| null
|-
| Notation
| Represents a notation in an XML document. Notations have no parent nodes
| Notation name
| null
| null
|-
|}
 
'''Node name, values, types'''
The above table gives a brief idea about different types of node names, their values and a short description about nodes.
 
'''Node parents, children, siblings'''
 
To make it easy to navigate the document tree, each Node object has a number of predefined properties that reference various parts of the tree. Each of these properties references an actual DOM object, with the exception of childNodes, which references a NodeList of DOM objects (similar to an array) [2][3].
*parentNode: references the single direct parent of the node.
*childNodes: references all child nodes of the node.
*firstChild: references the first node in childNodes.
*lastChild: references the last node in childNodes.
*previousSibling: references the sibling node immediately before the node.
*nextSibling: references the sibling node immediately after the node.
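
The navigation properties listed above are enough to walk a whole tree. The sketch below is illustrative only: <code>collectNodeNames</code> and <code>makeNode</code> are hypothetical helpers, and <code>makeNode</code> builds plain stand-in objects so the walk can run outside a browser. The tree is loosely modeled on a small HTML page and is not taken from Figure 1.

```javascript
// Depth-first walk that relies only on firstChild and nextSibling.
function collectNodeNames(node, names = []) {
  names.push(node.nodeName);
  for (let child = node.firstChild; child !== null; child = child.nextSibling) {
    collectNodeNames(child, names);
  }
  return names;
}

// Hypothetical stand-in for real DOM nodes: wires up the same
// navigation properties on plain objects.
function makeNode(nodeName, children = []) {
  const node = {
    nodeName,
    parentNode: null,
    childNodes: children,
    firstChild: children[0] || null,
    lastChild: children[children.length - 1] || null,
    previousSibling: null,
    nextSibling: null
  };
  children.forEach((child, i) => {
    child.parentNode = node;
    child.previousSibling = children[i - 1] || null;
    child.nextSibling = children[i + 1] || null;
  });
  return node;
}

const body = makeNode("BODY", [
  makeNode("H1", [makeNode("#text")]),
  makeNode("B", [makeNode("#text")])
]);
const html = makeNode("HTML", [makeNode("HEAD"), body]);

const names = collectNodeNames(html);
console.log(names.join(" "));
```

In a browser the same function should accept a real node, for example <code>document.documentElement</code>, since it reads only standard Node properties.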
 
'''Node Attributes'''


  car.prototype.companyOwned = true
Like the rest of the DOM document, attributes are based on the Node object, but they are not part of the general parent/child relationship tree. Attributes are instances of the Attr object and are contained in a NamedNodeMap in the node's attributes property.


This affects every object created with the constructor, both existing objects and ones created later, because each of them reads the property and its value through the shared prototype. However, one can still override the value of the companyOwned property for an individual car object.
Accessing Node Attributes [2][8]:
There are different variations to this rule and different programming languages can restrict this kind of change propagation.
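
For JavaScript specifically, the effect of adding a property to the constructor's prototype can be sketched as follows. The constructor mirrors the <code>car</code> example above; the second car and its values are made up for illustration.

```javascript
function car(model, licencePlate, maxSpeed) {
  this.model = model;
  this.licencePlate = licencePlate;
  this.maxSpeed = maxSpeed;
}

const car1 = new car("Mitsubishi", "NC765", 135);

// Add a property to the shared prototype after car1 already exists.
car.prototype.companyOwned = true;

// Illustrative second object; it overrides the shared value locally.
const car2 = new car("Volvo", "NCXXX", 90);
car2.companyOwned = false;

console.log(car1.companyOwned);          // read through the prototype
console.log(car2.companyOwned);          // local override
console.log(car.prototype.companyOwned); // shared value is untouched
```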
x=element.attributeName                   
if attributeName is a W3C defined attribute and an Attribute Node for the element, (eg id)  x gets assigned the value of that Attribute Node
if attributeName isn't a W3C defined attribute or an attribute node for the element, x gets assigned the value of the attributeName property for element (JavaScript object)


=====Cloning=====
x=element.attributes.attributeName.value  or  x= element.attributes['attributeName'].value
There are 2 levels at which an object can be cloned. These are [http://en.wikipedia.org/wiki/Object_copy ''deep cloning''] and [http://en.wikipedia.org/wiki/Object_copy ''shallow cloning'']. In shallow cloning, a copy of an object is created wherein just the references of the sub-objects are copied. In deep cloning, a complete duplicate of the original is created, i.e. it creates not only the primitive values of the original object but also copies all its sub objects as well.
if attributeName is an Attribute Node for the element, x gets assigned that node's value. If attributeName isn't an Attribute Node, an exception is raised


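JavaScript supports shallow cloning. The copy() and clone() functions shown below are called on a helper object named owl; they appear to come from a utility library rather than from the core language.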
x = element.attributes[indexNumber].value
if an Attribute Node exists at position indexNumber for the element, x gets assigned that node's value. If no Attribute Node exists at that position, an exception is raised
x = element.getAttribute('attributeName')
if attributeName is an Attribute Node for the element, x gets assigned its value. If attributeName isn't an Attribute Node for the element, x gets assigned the null value


i) clone() uses JavaScript's built-in prototype mechanism to create a cheap, shallow copy of a single Object.
e.g.
john2 = owl.clone(john);


ii) copy() makes a shallow, non-recursive copy of a single object.
'''Node OwnerDocument Property'''
e.g.
john4 = owl.copy(john);


One can define a deep copy algorithm that recursively copies every member explicitly.
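
One possible deep copy algorithm is sketched below. It is a minimal illustration under stated assumptions, not the implementation used by any particular library: <code>deepCopy</code> is a hypothetical function that handles plain objects, arrays and cycles, and does not specially handle built-in types such as Date or Map. The sample data is made up.

```javascript
// Recursively copy plain objects and arrays; `seen` maps originals to
// their copies so that cyclic references do not recurse forever.
function deepCopy(value, seen = new Map()) {
  if (value === null || typeof value !== "object") {
    return value; // primitives and functions are returned as-is
  }
  if (seen.has(value)) {
    return seen.get(value);
  }
  const copy = Array.isArray(value)
    ? []
    : Object.create(Object.getPrototypeOf(value));
  seen.set(value, copy);
  for (const key of Object.keys(value)) {
    copy[key] = deepCopy(value[key], seen);
  }
  return copy;
}

const john = { name: "John", address: { city: "Raleigh" }, tags: ["a", "b"] };
const shallow = Object.assign({}, john); // sub-objects are shared
const deep = deepCopy(john);             // sub-objects are duplicated

john.address.city = "Durham";
console.log(shallow.address.city); // follows the original
console.log(deep.address.city);    // independent of the original
```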
This property can be used to reference the root document to which a node belongs.


===Delegation===
====Core Element Object====
Elements inherit from Node object. All element objects have properties and methods of the Node object as well as a few others that facilitate manipulating the attributes of node and locating child element objects.


Instead of the class, subclass and inheritance schemes of class-based object oriented languages, JavaScript has prototype inheritance. When a script reads a property of an object, the property name is searched for in the following sequence:
Methods for manipulating 'attributes' property of base node [2][3]:
# The instance's own (local) value of the property is used if it is defined
*getAttribute(name) retrieves the value of an attribute, given the name of the attribute as a string.
# If no local value exists then check value of prototype property in object’s constructor
*setAttribute(name, value) sets the value of an attribute, given the name of the attribute as a string.
# The search continues up the prototype chain until a match is found or the native ''Object'' prototype has been checked.
*removeAttribute(name) removes an attribute, given the name of the attribute as a string.
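
The property lookup sequence used by JavaScript's prototype inheritance (own value first, then the constructor's prototype, then further up the chain to the native Object prototype) can be made explicit with a small helper. The function <code>lookup</code> below is a hypothetical sketch written for illustration; the engine performs this search internally.

```javascript
// Walk the prototype chain by hand and report where a property was found.
function lookup(object, name) {
  for (let current = object; current !== null; current = Object.getPrototypeOf(current)) {
    if (Object.prototype.hasOwnProperty.call(current, name)) {
      return { found: true, owner: current, value: current[name] };
    }
  }
  return { found: false, owner: null, value: undefined };
}

function car(model) {
  this.model = model;
}
car.prototype.wheels = 4;
const car1 = new car("Mitsubishi");

console.log(lookup(car1, "model").owner === car1);               // step 1: local value
console.log(lookup(car1, "wheels").owner === car.prototype);     // step 2: constructor's prototype
console.log(lookup(car1, "toString").owner === Object.prototype); // step 3: end of the chain
console.log(lookup(car1, "missing").found);
```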


The best example of delegation is event delegation in JavaScript. Events are central to JavaScript: they are the triggers that invoke a script. Event delegation relies on the principle that if an event is triggered on an element, then the same event is also triggered on all of the element's ancestors in the tree. Delegation is achieved by attaching a handler to the parent DOM [document object model] element. [4]
Methods for manipulating attributes bsed on actual DOM Attr node object [2][3]:
*getAttributeNode(name): Allows to retrieve the Attr node of the specified attribute.
*setAttributeNode(newAttr): Allows to set attributes based on new instances of the Attr object
*removeAttributeNode(oldAttr): Allows to remove the attribute node the same way one can remove a child node using removeChild() method.


Locating Element objects within Element objects [1]:
*The getElementsByTagName() method returns a NodeList referencing all descendant elements with the given tag name. The list holds the elements in the order they appear in the DOM document when read left to right and top to bottom.

Consider a DOM tree with a window containing a form, and a button inside the form. A click on the button delegates responsibility in the following order:
<br/>1. Button.onClick()
<br/>2. Form.onClick()
<br/>3. Window.onClick()


====Core Document Object====
The Document interface represents the entire document. It serves as the primary gateway to the document's data and represents the root of the document tree. It also contains the methods needed to create new document objects such as elements, attributes, Text nodes and comments.
The document.documentElement property is a shortcut to the root element of the document.

The propagation order of the click event (button, then form, then window) is illustrated in the diagram below.
<center>
[[Image:formButton.jpg]]
</center>
This event propagation can be stopped explicitly by calling event.stopPropagation(). The advantage of delegation is that a single event handler manages a particular type of event for the entire page, which also consumes less memory. Event delegation thus avoids repeating the same event assignment for every element within a parent element. This is particularly useful when elements are added dynamically to the page.
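Event delegation with a single handler on a parent element can be sketched as follows. The helper name delegate is hypothetical; it only relies on the standard event.target, nodeName and parentNode properties, so it also works on any objects that expose those properties.

```javascript
// Hypothetical helper: returns one handler for `parent` that reacts only when
// the event originated in (or inside) a descendant with the given tag name.
function delegate(parent, tagName, handler) {
  return function (event) {
    let node = event.target;
    // Walk from the element that triggered the event up towards the parent.
    while (node && node !== parent) {
      if (node.nodeName === tagName) {
        handler.call(node, event); // `this` is the matching descendant
        return;
      }
      node = node.parentNode;
    }
  };
}

// In a browser this might be attached as follows (form and onButtonClick are
// assumed to exist):
//   form.addEventListener('click', delegate(form, 'BUTTON', onButtonClick));
```

Because the handler lives on the parent, buttons added to the form later are covered without registering anything new.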


===Message Passing===
One can use both function calls and a message passing mechanism in JavaScript. The need for message passing arises in complex JavaScript based user interfaces, where a single variable may refer to different kinds of objects, each with different functions. On every function call through that variable, one has to make sure that the target object actually implements the function. Message passing is a more practical solution in such cases: if the requested function is not implemented, no exception is thrown and null is returned as an indication instead.

Creating nodes with document methods [2][3]:
*createAttribute(name): Creates Attr nodes of type Node.ATTRIBUTE_NODE
*createComment(data): Creates Comment nodes of type Node.COMMENT_NODE
*createElement(tagName): Creates element nodes of type Node.ELEMENT_NODE
*createTextNode(data): Creates Text nodes of type Node.TEXT_NODE
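The node-creating methods are typically combined to build a small subtree before it is attached to the document. The sketch below is an illustration only: the function name buildLink and its parameters are assumptions, and the document is passed in as an argument so that the function does not depend on a global.

```javascript
// Hypothetical helper: builds <a href="...">text</a> using document methods.
// `doc` is expected to behave like a DOM Document.
function buildLink(doc, href, text) {
  const anchor = doc.createElement('a');      // element node
  anchor.setAttribute('href', href);          // attribute set by name
  anchor.appendChild(doc.createTextNode(text)); // text node as a child
  return anchor;
}

// In a browser (assuming document.body exists):
//   document.body.appendChild(buildLink(document, 'http://www.w3.org/DOM/', 'W3C DOM'));
```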


Locating Elements with Document methods [2][3]:
*The getElementById('ID') method returns the single element whose id is ID.
*The getElementsByTagName('TAG') method returns a NodeList containing all the elements in the document whose tag name is TAG. The NodeList is ordered by the order in which the elements are encountered in a preorder traversal of the document tree. If TAG is *, all tags are matched.
*The importNode() method imports a node into this document from another document. The returned node has no parent node. A copy of the source node is created, so the source document is not affected in any way. Document and DocumentType nodes cannot be imported.

e.g.
 var newCar = new Car();
 var returnValue = message(newCar, 'carColor', 1999);


A message is passed to the newCar object instance, which then invokes the function carColor that ideally returns some color object.
The return value is sent back and assigned to returnValue just like in a traditional function call. If the object doesn't implement carColor, no exception is thrown; the return value is simply null. Message passing can be extended to forward messages between objects. In JavaScript this is achieved when the receiving object implements a function called forwardInvocation.

==Example DOM==
The diagram [1] below shows the DOM representation of an HTML page. The examples below are some of the ways in which elements can be accessed or modified using DOM [8].
<center>
[[Image:domExample2.jpg]]
</center>


To list all the nodes that are children of the body element [3]:
  ADS.addEvent(window, 'load', function() {
     ADS.log.header('List child nodes of the document body');
     for (var i = 0; i < document.body.childNodes.length; i++) {
         ADS.log.write(document.body.childNodes.item(i).nodeName);
     }
  });


The attributes of the 'ListItem' element can be listed using the anchor's attributes property [3]:
  ADS.addEvent(window, 'load', function() {
     ADS.log.header('Attributes');
     var anchor = document.getElementById('ListItem');
     for (var i = 0; i < anchor.attributes.length; i++) {
         ADS.log.write(anchor.attributes.item(i).nodeName + '=' + anchor.attributes.item(i).nodeValue);
     }
  });


==Algorithms==

There are certain types of document processing algorithms [2] that can be used to traverse DOM nodes. These algorithms help to
* Determine whether a node is contained within another node
* Determine whether a node has a sibling of a certain type
* Find a node based on the value of an attribute
* Process the children of a node


There are two main types of DOM algorithms: position based and content based.


'''Position based'''

Position based DOM algorithms are best when the position of nodes within a document matters more than the contents of the nodes. For example, determining whether a node occupies a particular place in the DOM tree is an application of a position based algorithm.


For example, a message can be forwarded through a chain of objects:
 var classOne = new Class({
     forwardInvocation: function(){
         return objectTwo;
     }
 });

 var classTwo = new Class({
     forwardInvocation: function(){
         return objectThree;
     }
 });

 var classThree = new Class({
     receivingFunction: function(){
         return 'New Message!';
     }
 });

 objectOne = new classOne();
 objectTwo = new classTwo();
 objectThree = new classThree();

 alert(message(objectOne, 'receivingFunction')); // ==> New Message!

Here, the message handler recursively forwards the message from objectOne to objectTwo until it finally reaches objectThree and the message is displayed as an alert message.
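The message function used in the message passing examples is not a JavaScript built-in. One possible way to write it is sketched below, under the assumption that forwardInvocation returns the next object to try; plain object literals stand in for the Class-based objects, and a loop is used in place of recursion.

```javascript
// Hypothetical message helper (an assumption, not part of JavaScript itself).
// Returns null instead of throwing when nobody implements the function.
function message(target, name, ...args) {
  let receiver = target;
  const visited = new Set(); // guards against forwarding cycles
  while (receiver != null && !visited.has(receiver)) {
    visited.add(receiver);
    if (typeof receiver[name] === 'function') {
      return receiver[name](...args); // the receiver handles the message
    }
    if (typeof receiver.forwardInvocation !== 'function') {
      return null; // not implemented and nowhere to forward: no exception
    }
    receiver = receiver.forwardInvocation(name, args); // try the next object
  }
  return null;
}

// Plain objects standing in for the three classes of the forwarding example.
const objectThree = { receivingFunction() { return 'New Message!'; } };
const objectTwo = { forwardInvocation() { return objectThree; } };
const objectOne = { forwardInvocation() { return objectTwo; } };

const forwarded = message(objectOne, 'receivingFunction');
const unhandled = message(objectOne, 'noSuchFunction');
```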


==Other Prototype Based Languages==
These algorithms work by examining the locations of nodes within a given DOM document. The contents of the node are secondary (if considered at all). Two common position-based algorithms are
* Determining whether a node has an ancestor node of a particular type
* Determining whether a node has a sibling of particular type.
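The two position-based checks can be sketched with the standard parentNode, childNodes and nodeName properties. The function names are assumptions made for this illustration.

```javascript
// Hypothetical helper: true if any ancestor of `node` has the given nodeName.
function hasAncestorOfType(node, nodeName) {
  for (let p = node.parentNode; p; p = p.parentNode) {
    if (p.nodeName === nodeName) {
      return true;
    }
  }
  return false;
}

// Hypothetical helper: true if `node` has a sibling with the given nodeName.
function hasSiblingOfType(node, nodeName) {
  const parent = node.parentNode;
  if (!parent) {
    return false; // the root has no siblings
  }
  const siblings = parent.childNodes;
  for (let i = 0; i < siblings.length; i++) {
    if (siblings[i] !== node && siblings[i].nodeName === nodeName) {
      return true;
    }
  }
  return false;
}
```

Neither function looks at node contents; only the position of the node in the tree matters.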


===Self===
Self is the canonical example of a prototype based language. It was designed by David Ungar and Randall Smith; the later Klein project built a virtual machine for Self written entirely in Self. Self is similar to Smalltalk in syntax and semantics, but uses prototypes where Smalltalk uses the class based paradigm. Objects are composed of slots, where each slot stores a name and a reference to an object. Interaction with objects is done by sending messages. Slots are divided into ''method slots'' and ''data slots''. Method slots return a result when a message is received, while data slots work like class variables in class based programming. A new object is created by copying an existing object and then adding slots to get the desired behavior. Data slots can be marked as parent slots. A parent slot delegates a message to the object it refers to if the selector of that message matches no slot in the receiving object. Thus, a method has to be written only once. Methods written in a shared ''trait'' object can be used by any other object.


===Omega===
Omega is a prototype based language developed by Günther Blaschek. It is statically typed. It uses inheritance instead of delegation by introducing types, where every prototype corresponds to a type. A subtype hierarchy is thus defined through inheritance between prototypes. Omega qualifies as a prototype based language because prototypes can be replicated by cloning and modified before instances are formed from them. Other key features of Omega are:
<br/>
* Single inheritance
* Conditional assignments
* Genericity i.e. providing parametrized types. E.g. List (of:Float) or List (of:Student).
* Garbage collection
* Monomorphic types i.e. one cannot perform overloading without providing a specific type signature. The functions are thus restrictive as they work on only one type.


===Kevo===
Kevo is another prototype based programming language, created by Antero Taivalsaari for Macintosh computers. It is semantically similar to Self and Omega and resembles Forth syntactically. It differs from other prototype based languages in the way it handles inheritance and delegation. Kevo objects are self-contained and do not share properties with other objects, i.e. Kevo follows the concatenation model, where changes in the prototype do not propagate to the cloned object. Kevo allows objects to be manipulated flexibly through module operations. With some additional primitives, these operations can publish updates from the prototype across a set of objects, selected by their similarity in the family tree. Another distinct feature of Kevo is that it uses a threaded code interpreter.


'''Content-Based Algorithms'''

Content-based algorithms focus on the actual content of the nodes and their attributes. Common content-based algorithms are determining whether a given node contains another particular node, retrieving nodes based upon their type and finding a node with a particular attribute value.
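A content-based DOM algorithm such as finding a node with a particular attribute value can be sketched as a recursive search. The function name findByAttribute is an assumption; the sketch relies only on the standard nodeType, childNodes and getAttribute members.

```javascript
// Hypothetical helper: depth-first search for the first element whose
// attribute `attrName` equals `attrValue`. Returns null if none is found.
function findByAttribute(root, attrName, attrValue) {
  // Only element nodes (nodeType 1) carry attributes.
  if (root.nodeType === 1 && root.getAttribute(attrName) === attrValue) {
    return root;
  }
  const children = root.childNodes || [];
  for (let i = 0; i < children.length; i++) {
    const match = findByAttribute(children[i], attrName, attrValue);
    if (match) {
      return match;
    }
  }
  return null;
}

// In a browser: findByAttribute(document.documentElement, 'id', 'ListItem');
```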


===NewtonScript===
NewtonScript was created by Walter Smith for Apple's Newton MessagePad. The motivation was to design the GUI for the Newton platform with low memory consumption. In class-based languages, instantiating a class allocates memory for all object attributes. Walter Smith therefore designed a prototype based language in which creating an instance is equivalent to creating a frame that inherits from some prototype frame. Note that this scheme does not allow data hiding: instance variables are still accessible from functions that do not belong to the class. It does allow the common behavior of frames to be factored out and gives a clearly defined interface to a set of data. NewtonScript is now used mainly on Newton platform mobile and embedded devices. Despite running on a platform with very little RAM, NewtonScript supports many modern language features such as exceptions and garbage collection. [5]


==Criticism==
* Although class based programming can seem restrictive, it provides better type-safety and predictability. Prototype based languages give up some of this, most visibly in the delegation mechanism, which allows the parent to be modified at runtime. Any unintended change to the parent alters the behavior of the system, which might be undesirable.
* Prototype based languages were developed incrementally for specific platforms or implementation requirements. Thus, the larger community of developers is not well versed in their concepts. [11]


==Browser Support==
DOM plays an important role in displaying the same information across multiple browsers. As a result, the user is able to see the same information in Internet Explorer, Firefox, Opera, etc. In addition to the standard W3C implementation of DOM, each browser has some proprietary methods and properties. This is the primary reason why browsers differ in functionality and implementation. Internet Explorer supports ActiveXObject for handling XML HTTP requests, while other browsers support the XMLHttpRequest object. As another example, some versions of Internet Explorer prior to IE6 do not support the createAttribute(), getAttributeNode(), or setAttributeNode() methods; an alternative is to use the getAttribute() and setAttribute() methods. Netscape Navigator stores newline characters as individual text nodes. The developer has to take these differences into account when targeting a web page at multiple browser platforms.
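Differences such as XMLHttpRequest versus ActiveXObject are usually handled by feature detection rather than by checking the browser name. The sketch below is an illustration: the function name createRequest is an assumption, and the global object is passed in as the parameter env (window in a browser) so that the check does not depend on any particular environment.

```javascript
// Hypothetical helper: returns a request object using whichever mechanism the
// environment offers, or null if neither is available.
function createRequest(env) {
  if (typeof env.XMLHttpRequest !== 'undefined') {
    return new env.XMLHttpRequest(); // standard object
  }
  if (typeof env.ActiveXObject !== 'undefined') {
    // ProgID commonly used with older versions of Internet Explorer
    return new env.ActiveXObject('Microsoft.XMLHTTP');
  }
  return null;
}

// In a browser: var request = createRequest(window);
```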


==Applications==
Most prototype based languages were developed to meet the needs of specific applications. These applications include:
<br/>1. Low memory consumption based systems e.g. NewtonScript [6]
<br/>2. Vocabulary acquisition and teaching analysis
<br/>3. Mental lexicon analysis
<br/>4. Cognitive linguistics and linguistic data analysis
<br/>5. Traffic jam representation systems
<br/>6. Greenhouse effect representation systems [2]

==Applications==
 
When the initial version of the DOM specification was drafted, the focus was on supporting web browsers efficiently. However, the evolution of the Web and the widespread use of XML have pushed DOM into other applications as well. For example, every application that runs on the internet stores some amount of data on either the server or the client side. This information is stored in a structured format, and DOM techniques can be used to navigate through these objects. As a result, DOM finds wide application in client-server business setups.
 
Standalone DOM implementations are not tied to larger applications such as web browsers. Instead, they are provided as libraries that applications can use to extract data from structured information. Examples are the various parsing libraries that can be included in an application for customized parsing of documents. Embedded DOM implementations, on the other hand, are tied to larger applications. In these applications, the DOM is exposed to extensions, as in Macromedia's Dreamweaver HTML authoring application.
 
==DOM and SAX==
[http://en.wikipedia.org/wiki/Simple_API_for_XML SAX] provides a mechanism for reading data from an XML document.
SAX is a popular alternative to the DOM. However, SAX is not the right tool when a document has to be accessed randomly or modified. The following table differentiates between SAX and DOM.
 
 
'''SAX is sequential'''
 
SAX does not allow random access to an XML document. Only the current element is accessible. For example, while the second element is being processed, information from the fourth element cannot be accessed because the fourth element hasn't been parsed yet. Likewise, while the fourth element is being processed, the parser cannot "look back" at the second element [9].
 
 
'''Comparing DOM and SAX'''
{| class="wikitable" border="1" style=" margin-left:60px; border-style:solid;border:thin;border-spacing:0x;"
|- style="text-align: center;border-style:solid;border:thin;border-spacing:0x; "
! DOM
! SAX
|-
| Before processing, stores the entire XML document in memory
| Sequential and parses node by node
|-
| Occupies more memory
| Doesn't store the XML in memory
|-
| We can insert or delete nodes
| We can't insert or delete a node
|-
| Traverse in any direction
| Top to bottom traversing
|-
| Accessing siblings can be done using predefined functions
| Accessing siblings is difficult due to sequential nature
|}


==Conclusion==


Thus prototype based languages propose a vision of object-oriented programming based on the notion of a prototype. Numerous prototype based languages have been designed and implemented. The most popular ones include Self, Kevo, Agora, Garnet, GlyphicScript, Moostrap, Omega, Obliq and NewtonScript. Some languages like Object-Lisp and Yafool, which are not strictly prototype based but offer related mechanisms by using prototyping at the implementation level, have also been developed. [1] All of these represent another view of real world objects, relying not on advance categorization and classification but on making the concepts in the problem domain as tangible and intuitive as possible. [8]

==Conclusion==
The Document Object Model (DOM) is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents. It allows the programmer to create and modify HTML pages and XML documents as full-fledged program objects.
DOM is changing the way we browse web pages every day. As newer versions of DOM are drafted to support modern web standards, they provide better interactivity with web pages. DOM4 promises richer, standardized applications. With the advent of new DOM features, browsers will also change in implementation and operation, giving end users a better browsing experience. XSLT support has been provided in the latest drafted version of DOM. The initial draft of DOM Level 4 was finalized in 2009. DOM4 builds on DOM Core Level 3 and also incorporates features from DOM Level 2 HTML.


==Suggested Reading==
==References==
# [http://en.wikipedia.org/wiki/Prototype_pattern Prototype Design Pattern]
# [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.123.9056&rep=rep1&type=pdf Prototype Theory, Cognitive Linguistics and Pedagogical Grammar - R.K Johnson]
# [http://delivery.acm.org/10.1145/200000/191101/p102-smith.pdf?key1=191101&key2=2362067821&coll=GUIDE&dl=GUIDE&CFID=106656954&CFTOKEN=21383009 Prototype-Based Languages: Object Lessons from Class-Free Programming - Mark Lentczner, Walter R. Smith, Antero Taivalsaari, David Ungar]


==References / External Links==
# Jeffrey Sambells, Aaron Gustafson (2007), ''Advanced DOM scripting: Dynamic Web Design Techniques'', ISBN: 1-59059-856-3.
# Jeremy Keith (2005), ''DOM scripting: Web Design with JavaScript and the Document Object Model'', ISBN: 1-59059-533-5.
# Joe Marini (2002), ''The Document Object Model: Processing Structured Documents'', ISBN: 0-07-222436-3.
# [http://www.w3.org/DOM/ Document Object Model (DOM)]
# Philippe Le Hégaret, [http://www.w3.org/2002/07/26-dom-article.html The W3C Document Object Model]
# Stephen Chapman, [http://javascript.about.com/od/reference/a/history.htm A Brief History of Javascript]
# [http://en.wikipedia.org/wiki/ECMAScript ECMAScript]
# [https://developer.mozilla.org/En/DOM/Node.attributes Node Attributes]
# [https://developer.mozilla.org/en/the_dom_and_javascript The DOM and JavaScript]
# [http://docstore.mik.ua/orelly/xml/jxml/ch05_01.htm DOM and SAX]


# Marcus Arnstrom, Mikael Christiansen, Daniel Sehlberg (May 2003),  ''Prototype-based programming''.
==Further Reading==
# Borning A (1986), ''Classes versus Prototypes in Object-Oriented Languages'', IEEE Computer Society Press, [http://portal.acm.org/citation.cfm?id=324538 In Proceedings of the IEEE/ACM Fall Joint Conference].
# [http://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-84CF096 Document Object Model Core]
# [https://developer.mozilla.org/en/JavaScript/Guide/Details_of_the_Object_Model Prototype and Class based programming languages]
# [http://dev.w3.org/2006/webapi/DOM4Core/DOM4Core.html DOM Level 4]
# [http://docstore.mik.ua/orelly/web/jscript/ch07_04.html Concept of Object Prototypes]
# [http://www.ecma-international.org/publications/standards/Stnindex.htm ECMAScript Language Specification ]
# [http://www.quirksmode.org/js/introevents.html Event Delegation with JavaScript]
# [http://www.grin.com/e-book/96313/the-newton-script-programming-language The NewtonScript Programming Language]
# [http://www.directessays.com/viewpaper/78763.html Prototype Theory]
# James Noble, Antero Taivalsaari, Ivan Moore, ''Prototype Based Programming - Concepts, Languages and Applications''
# C. Dony, J. Malenfant, D. Bardon, [http://www.lirmm.fr/~dony/postscript/proto-book.pdf ''Classifying Prototype Based Languages'']
# [http://www.learn-javascript-tutorial.com/Prototype-Based-Inheritance.cfm Prototype-Based Inheritance]
# [http://en.wikipedia.org/wiki/Prototype-based_programming Criticism of prototype based programming languages]

Latest revision as of 23:18, 22 November 2010

Document Object Model


Introduction

Browser scripting languages are widely used today; JavaScript is one example. Such a language allows the user to interact with a web page by accessing its component parts. The technique used to access these components is called the Document Object Model (DOM). It creates objects corresponding to the components of the web page, which scripting languages interact with to manipulate the page.

What is DOM

Information in XML and HTML documents is presented in a structured format. The technique used to access these documents is called the Document Object Model (DOM). DOM is a platform- and language-independent interface that allows programs to create and modify the content, structure and style of documents. These changes can be incorporated into the document again [2][4][5]. DOM is based on object-oriented concepts:

Documents are represented as objects. An object encapsulates a set of methods and interfaces: methods are used to access or change the object's state, and an interface is the declaration of a set of methods.

What DOM isn't

1) It does not describe how to persist objects into XML or HTML, but it represents documents as a collection of objects [2].

2) DOM does not describe any internal representation of objects or data. It is not a specific set of data structures. For example, even though there are parent-child relationships between the nodes, these are logical relationships between objects.

3) In addition, DOM does not describe what information in a document is relevant or how that information is structured. The structuring of the document is specified by the XML Infoset. DOM is just an API for accessing this information [3].

History

Before DOM became an official W3C specification [5], the web community had started to develop ways of creating page-based script programs. This early work led to the development of Dynamic HTML, which was primarily used to build richer user interfaces than plain HTML allowed. At the time, Netscape Navigator was a very popular browser, and it was the first to bring out a programming language that allowed web pages to become interactive. The language, called LiveScript, was designed at Netscape Communications and came integrated into the Netscape Navigator browser. LiveScript, now known as JavaScript, is the most widely used browser scripting language today and played a significant role in the development of the present DOM. The timeline for DOM evolution is as follows:

  • JavaScript:[6] In December 1995, LiveScript was renamed JavaScript and released as part of Netscape Navigator 2.0.
  • JScript & VBScript: Internet Explorer was updated to support two integrated languages, VBScript and JScript. JScript was very similar to JavaScript. In July 1996, Microsoft released Internet Explorer 3.0 supporting these languages.
  • ECMAScript:[7] Netscape delivered JavaScript to Ecma International for standardization and the work on the specification, ECMA-262, began in November 1996.
  • In June 1997, ECMA adopted a hybrid version of the scripting languages called ECMAScript. However, ECMAScript arrived late for the 4.0 releases of Netscape Navigator and Internet Explorer. Each introduced their own document object model, DHTML and dHTML, that came to be called Dynamic HTML.
  • DOM: At the beginning of 1997, the companies involved in this consortium (including Netscape Communications and Microsoft) decided to find a consensus around their object models to access and manipulate documents. While trying to stay as backward compatible as possible with the original browser object models, the W3C's Document Object Model (DOM) provided a better object representation of HTML documents.
  • DOM on Client:[3] The Internet Explorer 6 and Netscape Navigator 6 browsers have achieved complete support for DOM Level 1. Other browser vendors followed them.
  • DOM on Server:[3] Companies soon realized that the benefits of DOM apply on the server side as well as the client side, and DOM implementations began appearing in server-side products. A well-known one is the Xerces parser from the Apache Software Foundation.
  • HTML and XML:[1] In 1996, a new markup language, the Extensible Markup Language (XML), was developed in the W3C as well. Meant to remove the HTML language's extensibility restrictions, the idea of developing an object model for XML quickly became another goal of the DOM effort.
  • In June 2004,[5] Ecma International published ECMA-357 standard, defining an extension to ECMAScript, known as E4X (ECMAScript for XML).

Levels of DOM

DOM has three levels, each with its own specifications. Generally, developers prefer to use DOM Level 2 because of its wide support by Internet Explorer (IE) and Firefox. Here is a brief description of each of these levels.

Level 1

This was the first version of the standard, published in 1998 and later updated in 2000.

It consists of two major parts [3]:

DOM Core : Deals with the internal structure of the nodes of the tree. Also provides, properties and methods to create, edit and manipulate the tree.

DOM HTML : Objects, properties and methods for specific elements associated with HTML documents and tags.

This level allowed basic document operations, such as creating and deleting nodes, working with their contents, and traversing the document tree. Other operations, such as loading and saving documents, were left to the developer community, so a lot of this code was implementation dependent.

Level 2

This level of DOM provides additional interfaces to the existing set of functionality of DOM1. It was first proposed in 2000 and the HTML specification was updated in 2003. It has six different specifications [1][3]

DOM2Core : Adds name-space specific methods to DOM Core and gives control of structure of DOM elements

DOM2 HTML : Similar to DOM HTML

DOM2 Events : Control of mouse related events such as targeting, capturing and canceling. This does not handle keyboard-related events.

DOM2 STYLE : Gives access to CSS related styling and rules. Also known as DOM2 CSS

DOM2 Traversal and Range : Iterative access to DOM so that navigating and manipulating the document is easy.

DOM2 VIEWS : Ability to access and update a representation of a document.

This level added support for more types of document operations and content, such as events and styles. As a result, developers had standard ways to handle user events and to work with Cascading Style Sheets (CSS) information in documents.

Level 3

This level was drafted in 2003 and 2004. It consists of five different specifications [3]

DOM3 Core : Adds more methods and changes to existing core

DOM3 Load and Save : Two major functions provided include 1) content of an XML document can be loaded into your DOM document 2) serialize your DOM document into an XML document

DOM3 Validation : Checks whether a dynamic document is valid and conforms to the DOCTYPE.

DOM3 Events : handles keyboard and mouse related events.

DOM3 XPATH : Using XPath features to traverse the document

This level focused on the functionality left out in Level 2 such as Load and Save, etc. Also, support for XPath provides easier navigation through the document. DOM Level 3 introduced two new data types 1) DOMObject : Represents an arbitrary memory object 2) DOMUserData : Represents an interface to an arbitrary block of data in a DOM application.

It includes abstract schema modules which support Document Type Definition (DTD) files, XML Schemas, and other nonspecific schema types. Another feature of Level3 DOM is error handling interfaces.

Navigation

DOM represents a document using a tree structure. This tree can be thought as a collection of individual sub trees. Following example shows a simple HTML page and corresponding DOM.


The <BODY> branch can be thought of as its own tree within the larger document tree. Each box in Figure 1 is called a node in DOM terminology. The DOM API provides several interfaces in the Core module for manipulating nodes and for inserting information into and extracting it from nodes. There are also methods for determining and changing the relationships between nodes and for discovering the type of a particular node. Let's first have a look at the different object types supported by DOM.

Core Node Object

A node is an object representation of a particular element in the document's context. Nodes have special names, depending on where they are located in the document tree and what their position is relative to other nodes.

All documents have a root node.

Each element in Figure 1 extends the Node object. A few properties of the Node object are therefore available to all DOM objects that extend it [3][8].

Types of Node Objects

{| class="wikitable" border="1"
|-
! Node Type !! Description !! nodeName !! nodeValue !! attributes
|-
| Element || Represents an element in an HTML or XML document, which is represented by tags. In Figure 1, the <BODY> node is of element type. || Tag name || null || NamedNodeMap
|-
| Attribute || Represents an element's attribute. A tag with the syntax <img src = "picture.jpg"> has an Attribute node src. || Attribute name || Attribute value || null
|-
| Text || Represents the textual content of an element. In Figure 1, the <B> node has a text child node which represents the text "Bold Text". || #text || Text content || null
|-
| CDATASection || Represents a character data (CDATA) section in an XML document. || #cdata-section || CDATASection content || null
|-
| Comment || Represents a comment. Comments are of the form <!-- commented text -->. || #comment || Comment text || null
|-
| Document || Represents the root node of a document. || #document || null || null
|-
| DocumentType || Each Document node has a DocumentType node that provides a list of entities defined in the document. || Name of document type || null || null
|-
| DocumentFragment || Lightweight or minimal Document nodes. They help when just a portion of a document has to be extracted for processing. || #document-fragment || null || null
|-
| ProcessingInstruction || Represents instructions to be used by the document processor. || Target name || Content excluding the target || null
|-
| Entity || Represents an entity in an XML document. || Entity name || null || null
|-
| EntityReference || Represents an entity reference in the document. || Name of referenced entity || null || null
|-
| Notation || Represents a notation in an XML document. Notations have no parent nodes. || Notation name || null || null
|}

Node name, values, types

The above table gives a brief idea of the different node types, their names and values, and a short description of each.

Node parents, children, siblings

To easily navigate the document tree, each Node object has a number of predefined properties that reference various parts of the tree. Each of these properties references an actual DOM object, with the exception of childNodes, which references a NodeList of DOM objects (like an array) [2][3].

  • parentNode: references the single direct parent of the node.
  • childNodes: references all child nodes of the node.
  • firstChild: references the first child in the child nodes.
  • lastChild: references the last child in the child nodes.
  • previousSibling: references the sibling node immediately before the node.
  • nextSibling: references the sibling node immediately after the node.
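These navigation properties are enough to walk an entire subtree. The sketch below visits a node and all of its descendants in document order using firstChild and nextSibling (the DOM spells these properties in the singular). The function name collectNodeNames is an assumption made for this illustration.

```javascript
// Hypothetical helper: returns the nodeName of `node` and of every descendant,
// in document (preorder) order.
function collectNodeNames(node, names = []) {
  names.push(node.nodeName);
  // Visit each child in turn: start at firstChild, then follow nextSibling.
  for (let child = node.firstChild; child; child = child.nextSibling) {
    collectNodeNames(child, names);
  }
  return names;
}

// In a browser: collectNodeNames(document.body);
```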


Node Attributes

Just like the rest of the DOM document, attributes are also based on the Node object, but they are not part of the general parent/child relationship tree. Attributes are instances of the Attr object and are contained in a NamedNodeMap in the node's attributes property.

Accessing Node Attributes [2][8]:

x = element.attributeName

If attributeName is a W3C-defined attribute and an attribute node for the element (e.g. id), x is assigned the value of that attribute node. If attributeName isn't a W3C-defined attribute or an attribute node for the element, x is assigned the value of the attributeName property of the element (a JavaScript object).

x = element.attributes.attributeName.value   or   x = element.attributes['attributeName'].value

If attributeName is an attribute node for the element, x is assigned that node's value. If attributeName isn't an attribute node, an exception is raised.

x = element.attributes[indexNumber].value

If indexNumber is a valid index into the element's attributes collection, x is assigned the value of the attribute node at that position. If there is no attribute node at that index, an exception is raised.

x = element.getAttribute('attributeName')

If attributeName is an attribute node for the element, x is assigned its value. If attributeName isn't an attribute node for the element, x is assigned null.
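
The difference between these access paths matters mostly when the attribute is missing. The sketch below is one way to read an attribute without risking an exception; it assumes that element is a DOM Element (or an object with the same interface), and the helper name readAttribute is illustrative. It uses getNamedItem(), a standard NamedNodeMap method that is not listed above.

```javascript
// Read an attribute in two of the ways described above and report both results.
function readAttribute(element, name) {
  // getAttribute() yields null when the attribute is absent.
  const viaMethod = element.getAttribute(name);

  // attributes.getNamedItem() yields the Attr node, or null when absent,
  // so check before reading .value to avoid an exception.
  const attrNode = element.attributes.getNamedItem(name);
  const viaNode = attrNode ? attrNode.value : null;

  return { viaMethod: viaMethod, viaNode: viaNode };
}

// Browser usage (assumes an element with id "ListItem" exists):
//   readAttribute(document.getElementById('ListItem'), 'id');
```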


Node OwnerDocument Property

This property can be used to reference the root document to which a node belongs.

Core Element Object

Elements inherit from the Node object. All Element objects have the properties and methods of the Node object, as well as a few others that make it easier to manipulate the node's attributes and to locate child Element objects.


Methods for manipulating the 'attributes' property of the base node [2][3]:

  • getAttribute(name) retrieves the value of an attribute, given the name of the attribute as a string.
  • setAttribute(name, value) sets the value of an attribute, given the name of the attribute as a string.
  • removeAttribute(name) removes an attribute, given the name of the attribute as a string.
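
The three string-based methods can be seen working together in a short round trip. This is a minimal sketch, assuming element is a DOM Element; the helper name roundTripAttribute is illustrative only.

```javascript
// Set an attribute, read it back, remove it, and read it again.
function roundTripAttribute(element, name, value) {
  element.setAttribute(name, value);
  const stored = element.getAttribute(name);       // value just written
  element.removeAttribute(name);
  const afterRemoval = element.getAttribute(name); // attribute no longer present
  return { stored: stored, afterRemoval: afterRemoval };
}

// Browser usage (assumes an element with id "ListItem" exists):
//   roundTripAttribute(document.getElementById('ListItem'), 'title', 'hello');
```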


Methods for manipulating attributes based on the actual DOM Attr node object [2][3]:

  • getAttributeNode(name): retrieves the Attr node of the specified attribute.
  • setAttributeNode(newAttr): sets an attribute from a new instance of the Attr object.
  • removeAttributeNode(oldAttr): removes the attribute node, in the same way that removeChild() removes a child node.


Locating Element objects within Element objects [1]:

  • The getElementsByTagName() method returns a NodeList object referencing all descendant elements with the given tag name. The resulting list contains the elements in the order in which they appear in the DOM document when read left to right and top to bottom.

Core Document Object

The Document interface represents the entire document. It serves as the primary gateway to the document's data and represents the root of the document tree. It also contains the methods needed to create new document objects such as elements, attributes, text nodes and comments.

The document.documentElement property is a shortcut to the root element of the document.


Creating nodes with document methods [2][3]

  • createAttribute(name): Creates Attr nodes of type Node.ATTRIBUTE_NODE
  • createComment(data): Creates Comment nodes of type Node.COMMENT_NODE
  • createElement(tagName): Creates Element nodes of type Node.ELEMENT_NODE
  • createTextNode(data): Creates Text nodes of type Node.TEXT_NODE
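
Nodes created this way are not part of the tree until they are attached to a parent. The sketch below shows one way the factory methods might be combined; it assumes doc is a Document (for example window.document in a browser), and the helper name buildListItem and the comment text are made up for illustration.

```javascript
// Build a list item containing a comment and a text node,
// using only the Document factory methods.
function buildListItem(doc, text) {
  const item = doc.createElement('li');
  item.appendChild(doc.createComment('generated item'));
  item.appendChild(doc.createTextNode(text));
  return item; // not yet attached to any document tree
}

// Browser usage (assumes an element with id "list" exists):
//   document.getElementById('list').appendChild(buildListItem(document, 'Hello'));
```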


Locating Elements with Document methods [2][3]:

  • The getElementById('ID') method: Returns the element whose ID matches 'ID'. This method is singular and returns only one element.
  • The getElementsByTagName('TAG') method: Returns a NodeList containing all the elements in the document with the same tag name as TAG. The NodeList is ordered by the order in which the elements are encountered in a preorder traversal of the document tree. If TAG is *, all tags are matched.
  • The importNode() method: Imports a node into this document from another document. The returned node has no parent node. A copy of the source node is created, so the source document is not affected in any way. Document and DocumentType nodes cannot be imported.
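
The preorder ordering used by getElementsByTagName() can be made concrete with a hand-written traversal. This is only a sketch of the idea, not how any browser implements the method: the helper name elementsByTagName is hypothetical, it returns a plain array rather than a NodeList, and it assumes nodes expose firstChild, nextSibling, nodeType and nodeName.

```javascript
const ELEMENT_NODE = 1; // numeric value of Node.ELEMENT_NODE

// Collect the descendant elements of `root` whose tag matches `tag`,
// visiting nodes in preorder: a node first, then its children left to right.
function elementsByTagName(root, tag) {
  const wanted = tag.toUpperCase();
  const found = [];
  (function visit(node) {
    for (let child = node.firstChild; child; child = child.nextSibling) {
      if (child.nodeType === ELEMENT_NODE &&
          (tag === '*' || child.nodeName.toUpperCase() === wanted)) {
        found.push(child);
      }
      visit(child); // descend before moving on to the next sibling
    }
  })(root);
  return found;
}

// Browser usage: elementsByTagName(document, 'li');
```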

Example DOM

The diagram [1] below shows the DOM representation of an HTML page. The examples that follow show some of the ways in which elements can be accessed or modified using the DOM [8].

To list all the nodes that are children of the body element [3]:

 ADS.addEvent(window, 'load', function() {
   ADS.log.header('List child nodes of the document body');
   // Walk the NodeList of the body's direct children and log each node's name
   for (var i = 0; i < document.body.childNodes.length; i++) {
       ADS.log.write(document.body.childNodes.item(i).nodeName);
   }
 });

The attributes of the element with the id 'ListItem' can be accessed through its attributes property [3]:

 ADS.addEvent(window, 'load', function() {
   ADS.log.header('Attributes');
   var anchor = document.getElementById('ListItem');
   // Log each attribute of the element as name=value
   for (var i = 0; i < anchor.attributes.length; i++) {
       ADS.log.write(anchor.attributes.item(i).nodeName + '=' + anchor.attributes.item(i).nodeValue);
   }
 });

Algorithms

There are certain types of document processing algorithms [2] that you can use to traverse DOM nodes. These algorithms help to:

  • Determine whether a node is contained within another node
  • Determine whether a node has a sibling of a certain type
  • Find a node based on the value of an attribute
  • Process the children of a node


There are two main types of DOM algorithms: position based and content based.

Position based

Position-based DOM algorithms are best when you are more interested in the position of nodes within a document than in the contents of those nodes. For example, determining whether a node occupies a particular place in the DOM tree is an application of a position-based algorithm.

These algorithms work by examining the locations of nodes within a given DOM document. The contents of the node are secondary (if considered at all). Two common position-based algorithms are:

  • Determining whether a node has an ancestor node of a particular type
  • Determining whether a node has a sibling of a particular type
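
Both position-based checks reduce to following links away from the starting node. The sketch below is one possible reading of them, assuming that "type" means nodeName for the ancestor check and the numeric nodeType for the sibling check; the helper names are illustrative.

```javascript
// True if some ancestor of `node` has the given nodeName.
function hasAncestorNamed(node, name) {
  for (let p = node.parentNode; p; p = p.parentNode) {
    if (p.nodeName === name) return true;
  }
  return false;
}

// True if some sibling of `node` (before or after it) has the given nodeType.
function hasSiblingOfType(node, nodeType) {
  for (let s = node.previousSibling; s; s = s.previousSibling) {
    if (s.nodeType === nodeType) return true;
  }
  for (let s = node.nextSibling; s; s = s.nextSibling) {
    if (s.nodeType === nodeType) return true;
  }
  return false;
}

// Browser usage (assumes an element with id "ListItem" exists):
//   hasAncestorNamed(document.getElementById('ListItem'), 'BODY');
```

Neither function reads nodeValue or attributes, which is what makes them position based rather than content based.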


Content-Based Algorithms

Content-based algorithms focus on the actual content of the nodes and their attributes. Common content-based algorithms are determining whether a given node contains another particular node, retrieving nodes based upon their type and finding a node with a particular attribute value.
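
Finding a node by attribute value is a typical content-based search. The following is a minimal sketch, assuming element nodes expose getAttribute() and the usual firstChild/nextSibling links; the helper name findByAttribute is hypothetical.

```javascript
// Return the first element under `root` (in document order) whose attribute
// `name` equals `value`, or null if there is none.
function findByAttribute(root, name, value) {
  for (let child = root.firstChild; child; child = child.nextSibling) {
    if (child.nodeType === 1) { // 1 is the value of Node.ELEMENT_NODE
      if (child.getAttribute(name) === value) return child;
      const nested = findByAttribute(child, name, value);
      if (nested) return nested;
    }
  }
  return null;
}

// Browser usage: findByAttribute(document, 'id', 'ListItem');
```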

Browser Support

DOM plays an important role in displaying the same information across multiple browsers. As a result, the user is able to see the same information in Internet Explorer, Firefox, Opera, etc. In addition to the standard W3C implementation of DOM, each browser has some proprietary methods and properties. This is the primary reason why browsers differ in their functionality and implementation.

For example, Internet Explorer supports ActiveXObject for handling XML HTTP requests, while other browsers support the XMLHttpRequest object. Another difference is that some versions prior to IE6 do not support the createAttribute(), getAttributeNode(), or setAttributeNode() methods; an alternative is to use the getAttribute() and setAttribute() methods. Netscape Navigator stores newline characters as individual text nodes. The developer has to take these differences into account when targeting a web page at multiple browser platforms.
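
A common way to cope with such differences is feature detection: test for an object before using it. The sketch below illustrates the idea for the request objects mentioned above. It assumes the global object is passed in as env (window in a browser) so that the function can also be exercised outside a browser; the helper name createRequest is illustrative, and 'Microsoft.XMLHTTP' is the ProgID conventionally used with ActiveXObject in older versions of Internet Explorer.

```javascript
// Create an HTTP request object using feature detection.
function createRequest(env) {
  if (env.XMLHttpRequest) {
    // Standard path, supported by most browsers.
    return new env.XMLHttpRequest();
  }
  if (env.ActiveXObject) {
    // Legacy Internet Explorer path.
    return new env.ActiveXObject('Microsoft.XMLHTTP');
  }
  return null; // neither mechanism is available
}

// Browser usage: var request = createRequest(window);
```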

Applications

When the initial versions of the DOM specifications were drafted, the focus was on supporting web browsers efficiently. However, the evolution of the Web and the widespread use of XML have pushed DOM into other applications as well. For example, every application that runs on the internet stores some amount of data on either the server or the client side. This information is stored in a structured format, and DOM techniques can be used to navigate through these objects. As a result, DOM finds wide application in client-server business setups.

Standalone DOM implementations are not tied to larger applications such as web browsers. Instead, they are provided as libraries that your applications can use to extract data from structured information. Examples are the various parsing libraries that can be included in your application for customized parsing of documents. Embedded DOM implementations, on the other hand, are tied to larger applications. In these types of applications, DOM is exposed to extensions, as in Macromedia’s Dreamweaver HTML authoring application.

DOM and SAX

SAX provides a mechanism for reading data from an XML document and is a popular alternative to the DOM. However, because SAX reads a document sequentially, it is not the right tool when a program needs random access to the document or needs to modify it. The table below compares SAX and DOM.


SAX is sequential

SAX does not allow random access to an XML document; only the current element is accessible. For example, while the second element is being processed, information from the fourth element is not available because the fourth element hasn't been parsed yet. Likewise, while the fourth element is being processed, the parser cannot "look back" at the second element [9].
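
The sequential model can be illustrated without a real parser. In the sketch below, the events array is a hand-written, hypothetical stand-in for what a SAX-style parser might report for <list><item>one</item><item>two</item></list>; it is not the API of any actual SAX library. The handler sees one event at a time, so anything it needs later must be saved explicitly.

```javascript
// Hypothetical event stream (an assumption of this sketch, not a real parser API).
const events = [
  { type: 'start', name: 'list' },
  { type: 'start', name: 'item' },
  { type: 'text', value: 'one' },
  { type: 'end', name: 'item' },
  { type: 'start', name: 'item' },
  { type: 'text', value: 'two' },
  { type: 'end', name: 'item' },
  { type: 'end', name: 'list' }
];

// Collect the text of every <item>. The only "memory" is the state kept
// in local variables; earlier events cannot be revisited.
function collectItemText(stream) {
  const texts = [];
  let insideItem = false;
  for (const event of stream) {
    if (event.type === 'start' && event.name === 'item') {
      insideItem = true;
    } else if (event.type === 'end' && event.name === 'item') {
      insideItem = false;
    } else if (event.type === 'text' && insideItem) {
      texts.push(event.value);
    }
  }
  return texts;
}
```

With a DOM tree the same information could be reached at any time through childNodes or nextSibling, at the cost of holding the whole document in memory.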


Comparing DOM and SAX

DOM | SAX
Stores the entire XML document in memory before processing | Sequential; parses node by node
Occupies more memory | Doesn't store the XML in memory
We can insert or delete nodes | We can't insert or delete a node
Traverse in any direction | Top to bottom traversing
Accessing siblings can be done using predefined functions | Accessing siblings is difficult due to sequential nature

Conclusion

The Document Object Model (DOM) is a cross-platform and language-independent convention for representing and interacting with objects in HTML, XHTML and XML documents. It allows the programmer to create and modify HTML pages and XML documents as full-fledged program objects.

DOM is changing the way we browse web pages every day. As newer versions of DOM are drafted to support modern web standards, they provide better interactivity with web pages. DOM4 promises richer, standardized applications. With the advent of new DOM features, browsers will also change in implementation and operation, giving end users a better browsing experience. XSLT support has been provided in the latest drafted version of DOM. The initial draft of DOM Level 4 was finalized in 2009. DOM4 builds on DOM Core Level 3 and also incorporates features from DOM Level 2 HTML.

References

  1. Jeffrey Sambells, Aaron Gustafson (2007), Advanced DOM scripting: Dynamic Web Design Techniques, ISBN: 1-59059-856-3.
  2. Jeremy Keith (2005), DOM scripting: Web Design with JavaScript and the Document Object Model, ISBN: 1-59059-533-5.
  3. Joe Marini (2002), The Document Object Model: Processing Structured Documents, ISBN: 0-07-222436-3.
  4. Document Object Model (DOM)
  5. Philippe Le Hégaret, The W3C Document Object Model
  6. Stephen Chapman, A Brief History of Javascript
  7. ECMAScript
  8. Node Attributes
  9. The DOM and JavaScript
  10. DOM and SAX

Further Reading

  1. Document Object Model Core
  2. DOM Level 4
  3. ECMAScript Language Specification