Design
The Oracle XML Parsers
support the DOM (Document Object Model) and SAX (Simple API for XML) interfaces.
This tutorial uses the SAX API to normalize an XML document. Here's why.
XML APIs generally
fall into two categories: tree-based and event-based. A tree-based API (such
as DOM) builds an in-memory tree representation of the XML document. It provides
classes and methods for an application to navigate and process the tree. In
general, the DOM interface is most useful for structural manipulations of the
XML tree, such as reordering elements, adding or deleting elements and attributes,
renaming elements, and so on.
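To make the tree-based style concrete, here is a minimal sketch of structural edits on a DOM tree. It assumes the standard JAXP and W3C DOM interfaces that ship with Java (javax.xml.parsers, org.w3c.dom) rather than any Oracle-specific parser class, and the class and method names (DomEditSketch, addEmployee, removeFirstEmployee) are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of the Oracle XML Parser distribution.
public class DomEditSketch {

    // Parses the XML text into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Structural edit: appends <EMP><ENAME>name</ENAME></EMP> under the root element.
    static void addEmployee(Document doc, String name) {
        Element emp = doc.createElement("EMP");
        Element ename = doc.createElement("ENAME");
        ename.setTextContent(name);
        emp.appendChild(ename);
        doc.getDocumentElement().appendChild(emp);
    }

    // Structural edit: removes the first EMP element, if there is one.
    static void removeFirstEmployee(Document doc) {
        Node first = doc.getElementsByTagName("EMP").item(0);
        if (first != null) {
            first.getParentNode().removeChild(first);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<EMPLIST><EMP><ENAME>MARTIN</ENAME></EMP>"
                   + "<EMP><ENAME>SCOTT</ENAME></EMP></EMPLIST>";
        Document doc = parse(xml);
        addEmployee(doc, "KING");      // three EMP elements now
        removeFirstEmployee(doc);      // MARTIN's EMP is removed
        System.out.println(doc.getElementsByTagName("EMP").getLength());
    }
}
```

Both edits work on nodes that already exist in memory, which is why the whole document must be parsed into a tree before either one can run.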
An event-based
API (such as SAX) uses callbacks to report parsing events to the application.
The application handles these events through customized event handlers. Events
include the start and end of the document, the start and end of each element,
and character data. Event-based APIs usually do not build in-memory tree
representations of the XML documents. SAX is therefore well suited to
applications that read a document without restructuring it, such as search
operations.
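The callback style can be sketched as a handler that records each event it receives. This is a minimal sketch that assumes the standard JAXP SAX interfaces (javax.xml.parsers.SAXParserFactory, org.xml.sax.helpers.DefaultHandler) rather than Oracle-specific parser classes; SaxEventSketch and events are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of the Oracle XML Parser distribution.
public class SaxEventSketch {

    // Parses the XML text and returns one log line per SAX callback received.
    static List<String> events(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { log.add("start document"); }
            @Override public void endDocument() { log.add("end document"); }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start element: " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end element: " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Whitespace-only runs are skipped to keep the log short.
                // A parser may deliver one text node in several chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("characters: " + text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<EMPLIST><EMP><ENAME>MARTIN</ENAME></EMP></EMPLIST>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

The handler never sees the document as a whole. It reacts to one event at a time, so memory use does not grow with the size of the document.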
The following figure shows an XML document
and corresponding SAX and DOM representations.
XML Document:

<?xml version="1.0"?>
<EMPLIST>
  <EMP>
    <ENAME>MARTIN</ENAME>
  </EMP>
  <EMP>
    <ENAME>SCOTT</ENAME>
  </EMP>
</EMPLIST>

SAX Events:

start document
start element: EMPLIST
start element: EMP
start element: ENAME
characters: MARTIN
end element: ENAME
end element: EMP
start element: EMP
start element: ENAME
characters: SCOTT
end element: ENAME
end element: EMP
end element: EMPLIST
end document

DOM Tree (whitespace-only text nodes omitted):

Document
  Element: EMPLIST
    Element: EMP
      Element: ENAME
        Text: MARTIN
    Element: EMP
      Element: ENAME
        Text: SCOTT
Because normalization only processes each element's character data to remove
whitespace, and does not restructure the document, this tutorial uses SAX
instead of DOM.
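As a rough sketch of how whitespace removal fits the event model, the handler below re-emits each element as it is reported and trims the character data between tags. It rests on several assumptions: "normalization" is taken to mean dropping leading and trailing whitespace from text, which may be narrower or broader than this tutorial's definition; the standard JAXP SAX interfaces are used rather than Oracle-specific classes; and SaxNormalizeSketch, normalize, and escape are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; an assumed reading of "normalization".
public class SaxNormalizeSketch {

    // Re-serializes the document, trimming whitespace around character data.
    // Note: text in mixed content is also trimmed, which may not be wanted.
    static String normalize(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            // Writes the buffered text, minus surrounding whitespace.
            private void flushText() {
                out.append(escape(text.toString().trim()));
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                flushText();
                out.append('<').append(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    out.append(' ').append(atts.getQName(i)).append("=\"")
                       .append(escape(atts.getValue(i))).append('"');
                }
                out.append('>');
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                out.append("</").append(qName).append('>');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Buffer the text, since one text node may arrive in chunks.
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return out.toString();
    }

    // Escapes the characters that are special in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<EMPLIST>\n  <EMP>\n    <ENAME> MARTIN </ENAME>\n  </EMP>\n</EMPLIST>";
        System.out.println(normalize(xml));
    }
}
```

Each callback needs only the current event and a small text buffer, so no tree is required for this kind of processing.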