CSC/ECE 517 Fall 2010/ch6 6g ss: Difference between revisions
Line 232: | Line 232: | ||
1) W3C Document Object - http://www.w3.org/DOM/ | 1) W3C Document Object - http://www.w3.org/DOM/ | ||
Revision as of 19:52, 17 November 2010
Document Object Model
Introduction
What is DOM
Information is presented in XML and HTML documents is in a structured format. The technique used to access these document si called Document Object Model (DOM). Dom is platform language independent interface that allows to create and modify the content,structure and style of documents. These changes can be incorporated in the document again. DOM is based on the O-O concepts :
Objects are the main part while representing these objects. We can say that Objects are encapsulation of a set of methods and interfaces. Methods are used to access or change the object’s state and interfaces refer to declaration of a set of methods
What DOM isn't
1) It does not describe how to persist objects into XML or HTML, but it represents documents as a collection of objects.
2) DOM does not describt any internal representation of object or data. It is not a specific set of data structures. For example, even though we have parent-child relationship between the nodes they are logocal relationships of objects.
3) In addition, DOM does not describe what information in a document is relevant or how that information in structured. The structuring of the document is stored by the XML Infoset. DOM is just an API for to access this information.
History
- Before DOM became an official W3C specification, the web community the web community had started to develop ways of creating page-based script programs. This early work led to development of Dynamic HTML <refer this>, which was primarily used for richer user interfaces than plain HTML.
- LiveScript: At the time, Netscape Navigator was a very popular browser, and also was the first to bring out a programming language that would allow web pages to become interactive. The language, called LiveScript <refer> was designed at Netscape Communications and came integrated into the Netscape Navigator browser.
- JavaScript: In December 1995, LiveScript was renamed JavaScript and released as part of Netscape Navigator 2.0.
- Jscript & vbscript : Internet Explorer was updated to support two integrated languages, vbscript and Jscript. Jscript was very similar to JavaScript. In July 1996, Microsoft released Internet Explorer 3.0 supporting these languages.
- ECMAScript: Netscape delivered JavaScript to Ecma International for standardization and the work on the specification, ECMA-262, began in November 1996.
- In June 1997, ECMA adopted a hybrid version of the scripting languages called ECMAScript. However, ECMAScript arrived late for the 4.0 releases of Netscape Navigator and Internet Explorer. Each introduced their own document object model, DHTML and dHTML, that came to be called Dynamic HTML.
- DOM: At the beginning of 1997, the companies involved in this consortium (including Netscape Communications and Microsoft) decided to find a consensus around their object models to access and manipulate documents. While trying to stay as backward compatible as possible with the original browser object models, the W3C's Document Object Model (DOM) provided a better object representation of HTML documents.
- DOM on Client: <textbook> The Internet Explorer 6 and Netscape Navigator 6 browsers have achieved complete support for DOM Level 1. Other browser vendors followed them.
- DOM on Server: Soon did the companies realized that the benefits of DOM can be applied on server side as well as client side. DOM implementations began appearing for server-side products as well. One well-known is the Xerces <http://xerces.apache.org/xerces-c/>parser from Apache Foundation.
- HTML and XML: In 1996, a new markup language, the Extensible Markup Language (XML), was developed in the W3C as well. Meant to remove the HTML language's extensibility restrictions, the idea of developing an object model for XML quickly became another goal of the DOM effort.
- In June 2004, Ecma International published ECMA-357 standard, defining an extension to ECMAScript, known as E4X (ECMAScript for XML).
Levels of DOM
Dom has 3 levels each with its own specifications. Generally, developers prefer to use DOM Level2 because of its wide support by Internet Explorer (IE) and Firefox. Here is a brief description of each of these levels.
Level 1
This was the first version of standards published in 1998 and later updated in 2000
Consists of two major parts .
DOM Core : Deals with the internal structure of the nodes of the tree. Also provides, properties and methods to create, edit and manipulate the tree.
DOM HTML : Objects, properties and methods for specific elements associated with HTML documents and tags.
Level 2
This level of DOM provides additional interfaces to the existing set of functionality of DOM1. It was first proposed in 2000 and the HTML specification was updated in 2003. It has six different specifications
DOM2Core : Adds name-space specific methods to DOM Core and gives control of structure of DOM elements
DOM2 HTML : Similar to DOM HTML
DOM2 Events : Control of mouse related events such as targeting, capturing,canceling. This does not handle keyboard-related events
DOM2 STYLE : Gives access to CSS related styling and rules. Also known as DOM2 CSS
DOM2 Traversal and Range : Iterative access to DOM so that navigating and manipulating the document is easy.
DOM2 VIEWS : Ability to access and update a representation of a document.
Level 3
This level was drafted in 2003 and 2004. It consists of five different specifications
DOM3 Core : Adds more methods and changes to existing core
DOM3 Load and Save : Two major functionalities provided of this include 1) content of an XML document can be loaded into your DOM document 2) serialize your DOM document into an XML document
DOM3 Validation : check id dynamic document is valid and conforms to the DOCTYPE.
DOM3 Events : handles keyboard and mouse related events.
DOM3 XPATH : Using XPath features to traverse the document
(I will format this afterwards.. there is a table of around 20 lines to be added) DOM represents a document using a tree structure. This tree can be thought as a collection of individual sub trees. image here
Figure <x>.1 shows sample HTML document. Figure <x>.2 shows its Document tree diagram.
The <BODY> branch can be thought as its own tree within the larger document tree. Each box in <X>.2 is called a node in DOM terminology. The DOM API provides several interfaces in the Core module for manipulating nodes and inserting/extracting information to and from nodes. There are also methods present for determining and changing the relationships between nodes and discovering the type of a particular mode. Lets first have a look at different object types supported by DOM.
DOM Core Node Object A node is an object representation of a particular element in the document's context. Nodes have special names, depending on where they are located in the document tree and what their position is relative to other nodes.
All documents have a root node.
Each element in Figure <X>.2 extends from Node object. There are few properties of Node object which are available to all DOM objects are extended from Node object.
Types of Node Objects
Node Type | Description |
---|---|
Element | Represents an element in an HTML or XML document, which is represented by tags. In Figure <X>.2, <BODY> node is an element type |
Attribute | Represents an element's attribute. A tag with the syntax <img src = "picture.jpg"> has an Attribute node src. |
Text | Represents the textual content of an element. In Figure <X>.2, the node has text child node which represents the text "Bold Text" |
CDATA Selection | Represents a Character Data selection in an XML document |
Comment | Represents a comment. Comments are of the form |
Document | Represents the root node of a document |
DocumentType | Each Document node has a DocumentType node that provides a list of entities defined in the document |
DocumentFragment | They are lightweight or minimal Document nodes. They help when we want to extract just a portion of a document for processing |
ProcessingInstruction | Represents instructions to be used by the document processor |
Entity | Represents an entity in XML document |
EntityReference | Represents an entity reference in the document |
Notation | Represents a notation in an XML document. Notations have no parent nodes |
- Node name, values, types - Node parents, children, siblings To easily navigate the document tree, each Node object has number of predefined properties that reference various parts of the tree. Each of these properties references an actual DOM object, with exception of childNOdes which references a NodeList of DOM objects (like an array). parentNode: references a single direct parent of the specific node. childNodes: references all child elements of a node. firstChild: references first child in child nodes. lastChild: references last child in child nodes. previousSiblings: Represents sibling node immediately before the selected node. nextSiblings: Represents sibling node immediately after the selected node.
- Node Attributes <https://developer.mozilla.org/En/DOM/Node.attributes> Just like rest of DOM document, attributes are also based on the Node object, but they are not part of the general parent/child relationship tree. Attributes are instances of Attr pbject are contained in a NamedNodeMap in node's attributes property. Accessing Node Attributes:
x=element.attributeName if attributeName is a W3C defined attribute and an Attribute Node for the element, (eg id) x gets assigned the value of that Attribute Node if attributeName isn't a W3C defined attribute or an attribute node for the element, x gets assigned the value of the attributeName property for element (JavaScript object)
x=element.attributes.attributeName.value or x= element.attributes['attributeName'].value if attributeName is an Attribute Node for the element, x gets assigned that node's value. If attributeName isn't an Attribute Node, produce EXCEPTION
x = element.attributes[indexNumber].value if attributeName is an Attribute Node for the element, x gets assigned that node's value. If attributeName isn't an Attribute Node, produce EXCEPTION
x= element.getAttribute('attributeName') if attributeName is an Attribute Node for the element, x gets assigned its value. If AttributeName isn't an Attribute Node for the element, x gets assigned the null value
- Node OwnerDocument Property This property can be used to reference the root document to which a node belongs.
Core Element Object Elements inherit from Node object. All element objects have properties and methods of the Node object as well as a few others that facilitate manipulating the attributes of node and locating child element objects.
Methods for manipulating 'attributes' property of base node: getAttribute(name) allows to retrieve an attribute based on the name of the attribute as a string. setAttribute(name, value) allows you to set the value of an attribute based on the name of the attribute as a string. removeAttribute(name) allows to remove the value of an attribute based on the name of the attribute as a string.
Methods for manipulating attributes bsed on actual DOM Attr node object: getAttributeNode(name): Allows to retrieve the Attr node of the specified attribute. setAttributeNode(newAttr): Allows to set attributes based on new instances of the Attr object removeAttributeNode(oldAttr): Allows to remove the attribute node the same way one can remove a child node using removeChild() method.
Locating Element objects within Element objects: The getElementByTagName() method returns the NodeList object referencing all ancestors with the given tag name.The resulting list is the list of all the elements in the order they appear in DOM Document if read left to right and top to bottom.
Core Document Object
The Document interface represents the entire document. IT serves as the primary gateway to accessing the document's data. It represents root of the document tree. It also contains methods necessary for creating new document objects such as elements, sattributes, Text nodes, comments.
The document.documentElement property It is a shortcut to the root element fo the document.
Creating nodes with document methods createAttribute(name): Creates Attr nodes of type Node.ATTRIBUTE_NODE createComment(data): Crates Comment nodes of type Node.COMMENT_NODE crateElement(tagName): Creates element nodes of type Node.ELEMENT_NODE createTextNode(date): Creates Text nodes of type Node.TEXT_NODE
Locating Elements with Document methods: The getElementById('ID') method returns one element corresponding to the ID. This method is singular and only returns one element.
The getElementsByTagName('TAG') method returns a NodeList containing all the elements in the document with the same tag name as TAG. The NodeList is ordered by the order in which the Elements were encountered in a preorder traversal of the document tree. If TAG is *, all tags are matched.
The importNode() method imports a node into this document from another document. The returned node has no parent node. A copy of source node is created, thus the source document is not affected in any way. The Documents and DocumentTypes nodes cannot be imported.
Example DOM
Algorithms
Browser Support
Applications
Future DOM
Conclusion
Suggested Reading
1) W3C Document Object - http://www.w3.org/DOM/