Developer: XML
by Rahul Srivastava
Moving to XML Schema? This introduction to namespaces will help you understand one of its more important components.
Other articles in this series:
XML Schema: Understanding Datatypes
XML Schema: Understanding Structures
Downloads for this article:
Oracle JDeveloper 10g (includes visual XML Schema editor)
As defined by the W3C Namespaces in XML Recommendation, an XML namespace is a collection of XML elements and attributes identified by an Internationalized Resource Identifier (IRI); this collection is often referred to as an XML "vocabulary."
One of the primary motivations for defining an XML namespace is to avoid naming conflicts when using and re-using multiple vocabularies. XML Schema is used to create a vocabulary for an XML instance, and uses namespaces heavily. Thus, having a sound grasp of the namespace concept is essential for understanding XML Schema and instance validation overall.
Namespaces are similar to packages in Java in several ways:
Thus, the concept of namespaces in XML is not very different from packages in Java. This correlation is intended to simplify the understanding of namespaces in XML and to help you visualize the concept.
In this article, you will learn:
A namespace is declared as an attribute of an element. Namespaces need not be declared only at the root element; a namespace can be declared on any element in the XML document. The scope of a declared namespace begins at the element where it is declared and applies to the entire content of that element, unless overridden by another namespace declaration with the same prefix name. Here, the content of an element is the content between the <opening-tag> and </closing-tag> of that element. A namespace is declared as follows:
<someElement xmlns:pfx="http://www.foo.com" />
In the attribute xmlns:pfx, xmlns is like a reserved word, which is used only to declare a namespace. In other words, xmlns is used for binding namespaces, and is not itself bound to any namespace. Therefore, the above example is read as binding the prefix "pfx" with the namespace "http://www.foo.com."
It is a convention to use XSD or XS as a prefix for the XML Schema namespace, but that decision is purely personal. One can choose to use a prefix ABC for the XML Schema namespace, which is legal, but doesn't make much sense. Using meaningful namespace prefixes adds clarity to the XML document. Note that prefixes are used only as placeholders and must be expanded by the namespace-aware XML parser to the actual namespace bound to the prefix. In a Java analogy, a namespace binding can be correlated to declaring a variable: wherever the variable is referenced, it is replaced by the value it was assigned.
In our previous namespace declaration example, wherever the prefix "pfx" is referenced within the namespace declaration scope, it is expanded to the actual namespace (http://www.foo.com) to which it was bound:
In Java: String pfx = "http://www.foo.com";
In XML: <someElement xmlns:pfx="http://www.foo.com" />
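The expansion of a prefix can be observed with any namespace-aware parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the choice of Python is an assumption made for illustration, and the element is prefixed here (unlike in the declaration above) so that the expansion is visible.

```python
import xml.etree.ElementTree as ET

# The prefix "pfx" is bound to http://www.foo.com and used on the same element.
doc = '<pfx:someElement xmlns:pfx="http://www.foo.com" />'

root = ET.fromstring(doc)

# ElementTree reports names as "{namespace}local-name": the prefix is gone
# and the namespace it was bound to appears in its place.
print(root.tag)
```

The prefix itself does not survive parsing; only the namespace it stood for does, much like a variable being replaced by its value.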
Although a namespace usually looks like a URL, that doesn't mean that one must be connected to the Internet to actually declare and use namespaces. Rather, the namespace is intended to serve as a virtual "container" for vocabulary and un-displayed content that can be shared in the Internet space. In the Internet space URLs are unique; hence you would usually choose to use URLs to uniquely identify namespaces. Typing the namespace URL in a browser doesn't mean it would show all the elements and attributes in that namespace; it's just a concept.
But here's a twist: although the W3C Namespaces in XML Recommendation declares that the namespace name should be an IRI, it enforces no such constraint. Therefore, I could also use something like:
<someElement xmlns:pfx="foo" />
which is perfectly legal.
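As a quick check of this point, the sketch below feeds such a declaration to Python's standard xml.etree.ElementTree parser (an assumption for illustration; other parsers may warn about it). It only shows that this one parser accepts a namespace name that is not a URL.

```python
import xml.etree.ElementTree as ET

# "foo" is not a URL, yet it is accepted as a namespace name.
root = ET.fromstring('<pfx:someElement xmlns:pfx="foo" />')

# The element name is reported in "{namespace}local-name" form.
print(root.tag)
```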
By now it should be clear that to use a namespace, we first bind it with a prefix and then use that prefix wherever required. But why can't we use the namespaces to qualify the elements or attributes from the start? First, because namespaces, being IRIs, are quite long and thus would hopelessly clutter the XML document. Second and most important, because it might have a severe impact on the syntax, or to be specific, on the production rules of XML, the reason being that an IRI might have characters that are not allowed in XML tags per the W3C XML 1.0 Recommendation.
Invalid) <http://www.library.com:Book />
Valid) <lib:Book xmlns:lib="http://www.library.com" />
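The difference between the two forms can be sketched with a small well-formedness check. This uses Python's standard xml.etree.ElementTree module as an illustrative assumption; is_well_formed is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# A namespace name used directly in the tag: "/" is not allowed in an XML name.
uri_in_tag = '<http://www.library.com:Book />'

# The same intent expressed through a prefix bound to the namespace.
prefixed = '<lib:Book xmlns:lib="http://www.library.com" />'

print(is_well_formed(uri_in_tag), is_well_formed(prefixed))
```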
Below, the elements Title and Author are associated with the namespace http://www.library.com:
<?xml version="1.0"?>
<Book xmlns:lib="http://www.library.com">
  <lib:Title>Sherlock Holmes</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
</Book>
In the example below, the elements Title and Author of Sherlock Holmes - III and Sherlock Holmes - I are associated with the namespace http://www.library.com, and the elements Title and Author of Sherlock Holmes - II are associated with the namespace http://www.otherlibrary.com.
<?xml version="1.0"?>
<Book xmlns:lib="http://www.library.com">
  <lib:Title>Sherlock Holmes - I</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
  <purchase xmlns:lib="http://www.otherlibrary.com">
    <lib:Title>Sherlock Holmes - II</lib:Title>
    <lib:Author>Arthur Conan Doyle</lib:Author>
  </purchase>
  <lib:Title>Sherlock Holmes - III</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
</Book>
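The scoping of a re-bound prefix can be confirmed by parsing the document and looking at the namespace each Title ends up in. This is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption for illustration); the XML declaration is omitted only to keep the string literal simple.

```python
import xml.etree.ElementTree as ET

doc = """<Book xmlns:lib="http://www.library.com">
  <lib:Title>Sherlock Holmes - I</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
  <purchase xmlns:lib="http://www.otherlibrary.com">
    <lib:Title>Sherlock Holmes - II</lib:Title>
    <lib:Author>Arthur Conan Doyle</lib:Author>
  </purchase>
  <lib:Title>Sherlock Holmes - III</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
</Book>"""

root = ET.fromstring(doc)

# Map each title's text to its expanded "{namespace}Title" name.
titles = {el.text: el.tag for el in root.iter() if el.tag.endswith("}Title")}

for text, tag in titles.items():
    print(text, "->", tag)
```

The inner binding of lib applies only inside purchase; once that element closes, the outer binding is in effect again.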
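The W3C Namespaces in XML Recommendation enforces some namespace constraints: a prefix cannot be used unless it has been declared and bound to a namespace, and prefixes beginning with the three-letter sequence x, m, l (in any case combination) are reserved, so binding them is inadvisable.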
The following violates both these constraints:
<?xml version="1.0"?>
<Book xmlns:XmlLibrary="http://www.library.com">
  <lib:Title>Sherlock Holmes - I</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
</Book>
[Error]: prefix lib not bound to a namespace.
[Inadvisable]: prefix XmlLibrary begins with 'Xml.'
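The error case can be reproduced with a namespace-aware parser. The sketch below uses Python's standard xml.etree.ElementTree module (an assumption for illustration). It demonstrates only the unbound-prefix error; the reserved-prefix issue is advisory, and this parser is not expected to flag it.

```python
import xml.etree.ElementTree as ET

# "lib" is used but never declared; only "XmlLibrary" is bound.
doc = """<Book xmlns:XmlLibrary="http://www.library.com">
  <lib:Title>Sherlock Holmes - I</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
</Book>"""

try:
    ET.fromstring(doc)
    error = None
except ET.ParseError as exc:
    # The parser refuses the document because "lib" has no binding in scope.
    error = str(exc)

print(error)
```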
Default Namespace (Not Default Namespaces)
It would be painful to repeatedly qualify an element or attribute you wish to use from a namespace. In such cases, you can declare a {default namespace} instead. Remember, at any point in time, there can be only one {default namespace} in existence. Therefore, the term "Default Namespaces" is inherently incorrect.
Declaring a {default namespace} means that any element within the scope of the {default namespace} declaration will be qualified implicitly, if it is not already qualified explicitly using a prefix. As with prefixed namespaces, a {default namespace} can be overridden too. A {default namespace} is declared as follows:
<someElement xmlns="http://www.foo.com"/>

<?xml version="1.0"?>
<Book xmlns="http://www.library.com">
  <Title>Sherlock Holmes</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>
In this case the elements Book, Title, and Author are associated with the namespace http://www.library.com.
Remember, the scope of a namespace begins at the element where it is declared. Therefore, the element Book is also associated with the {default namespace}, as it has no prefix.
<?xml version="1.0"?>
<Book xmlns="http://www.library.com">
  <Title>Sherlock Holmes - I</Title>
  <Author>Arthur Conan Doyle</Author>
  <purchase xmlns="http://www.otherlibrary.com">
    <Title>Sherlock Holmes - II</Title>
    <Author>Arthur Conan Doyle</Author>
  </purchase>
  <Title>Sherlock Holmes - III</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>
In the above, the elements Book, Title, and Author of Sherlock Holmes - III and Sherlock Holmes - I are associated with the namespace http://www.library.com, and the elements purchase, Title, and Author of Sherlock Holmes - II are associated with the namespace http://www.otherlibrary.com.
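The effect of overriding a {default namespace} can be checked by listing every element with the namespace it resolves to. This is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption for illustration); namespace_of is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

doc = """<Book xmlns="http://www.library.com">
  <Title>Sherlock Holmes - I</Title>
  <Author>Arthur Conan Doyle</Author>
  <purchase xmlns="http://www.otherlibrary.com">
    <Title>Sherlock Holmes - II</Title>
    <Author>Arthur Conan Doyle</Author>
  </purchase>
  <Title>Sherlock Holmes - III</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>"""

def namespace_of(element):
    """Return the namespace part of an ElementTree tag, or None if there is none."""
    tag = element.tag
    return tag[1:tag.index("}")] if tag.startswith("{") else None

root = ET.fromstring(doc)

# (local name, namespace) for every element, in document order.
namespaces = [(el.tag.split("}")[-1], namespace_of(el)) for el in root.iter()]

for local_name, namespace in namespaces:
    print(local_name, "->", namespace)
```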
Default Namespace and Attributes
Default namespaces do not apply to attributes; therefore, to apply a namespace to an attribute the attribute must be explicitly qualified. Here the attribute isbn has {no namespace} whereas the attribute cover is associated with the namespace http://www.library.com.
<?xml version="1.0"?>
<Book isbn="1234" pfx:cover="hard"
      xmlns="http://www.library.com"
      xmlns:pfx="http://www.library.com">
  <Title>Sherlock Holmes</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>
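The different treatment of elements and attributes shows up directly in the parsed names. The following is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption for illustration).

```python
import xml.etree.ElementTree as ET

doc = """<Book isbn="1234" pfx:cover="hard"
      xmlns="http://www.library.com"
      xmlns:pfx="http://www.library.com">
  <Title>Sherlock Holmes</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>"""

root = ET.fromstring(doc)

# The element picks up the default namespace...
print(root.tag)

# ...but the unprefixed attribute does not; only the prefixed one is qualified.
for name, value in root.attrib.items():
    print(name, "=", value)
```

Even though the default namespace and the pfx prefix name the same namespace, isbn stays unqualified because it carries no prefix.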
Unbinding an already-bound prefix is not allowed per the W3C Namespaces in XML 1.0 Recommendation, but is allowed per W3C Namespaces in XML 1.1 Recommendation. There was no reason why this should not have been allowed in 1.0, but the mistake has been rectified in 1.1. It is necessary to know this difference because not many XML parsers yet support Namespaces in XML 1.1.
Although there were some differences in unbinding prefixed namespaces, both versions allow you to unbind or remove the already declared {default namespace} by overriding it with another {default namespace} declaration, where the namespace in the overriding declaration is empty. Unbinding a namespace is as good as the namespace not being declared at all. Here the elements Book, Title, and Author of Sherlock Holmes - III and Sherlock Holmes - I are associated with the namespace http://www.library.com, and the elements purchase, Title, and Author of Sherlock Holmes - II have {no namespace}:
<someElement xmlns="" />

<?xml version="1.0"?>
<Book xmlns="http://www.library.com">
  <Title>Sherlock Holmes - I</Title>
  <Author>Arthur Conan Doyle</Author>
  <purchase xmlns="">
    <Title>Sherlock Holmes - II</Title>
    <Author>Arthur Conan Doyle</Author>
  </purchase>
  <Title>Sherlock Holmes - III</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>
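Undeclaring the {default namespace} can be verified by parsing the document and checking which names come back without a namespace. This is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption for illustration).

```python
import xml.etree.ElementTree as ET

doc = """<Book xmlns="http://www.library.com">
  <Title>Sherlock Holmes - I</Title>
  <Author>Arthur Conan Doyle</Author>
  <purchase xmlns="">
    <Title>Sherlock Holmes - II</Title>
    <Author>Arthur Conan Doyle</Author>
  </purchase>
  <Title>Sherlock Holmes - III</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>"""

root = ET.fromstring(doc)

# Tags in document order; names without "{...}" are in {no namespace}.
tags = [el.tag for el in root.iter()]

for tag in tags:
    print(tag)
```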
Here's an example of unbinding a prefix that is invalid per the Namespaces in XML 1.0 specification, but valid per Namespaces in XML 1.1:
<purchase xmlns:lib="">
From this point on, the prefix lib cannot be used in the XML document because it is now undeclared as long as you are in the scope of element purchase. Of course, you can definitely re-declare it.
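A parser that implements Namespaces in XML 1.0 is expected to reject this kind of declaration. The sketch below tries it with Python's standard xml.etree.ElementTree module, which relies on the expat parser; the assumption here is that the expat build in use follows the 1.0 rules, which is the behavior the author is aware of but not something the article itself states.

```python
import xml.etree.ElementTree as ET

# The inner element tries to undeclare the prefix "lib" with an empty value.
doc = """<Book xmlns:lib="http://www.library.com">
  <purchase xmlns:lib="" />
</Book>"""

try:
    ET.fromstring(doc)
    rejected = False
except ET.ParseError as exc:
    # Under Namespaces in XML 1.0 rules, a prefix cannot be bound to "".
    rejected = True
    print("rejected:", exc)

print(rejected)
```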
No namespace exists when there is no default namespace in scope. A {default namespace} is one that is declared explicitly using xmlns. When a {default namespace} has not been declared at all using xmlns, it is incorrect to say that the elements are in {default namespace}. In such cases, we say that the elements are in {no namespace}. {no namespace} also applies when an already declared {default namespace} is undeclared.
In summary:
Thus far we have seen how to declare and use an existing namespace. Now let's examine how to create a new namespace and add elements and attributes to it using XML Schema.
XML Schema is XML before it's anything else. In other words, like any other XML document, an XML Schema is built with elements and attributes. This "building material" must come from the namespace http://www.w3.org/2001/XMLSchema, which is a declared and reserved namespace that contains elements and attributes as defined in the W3C XML Schema Structures Specification and the W3C XML Schema Datatypes Specification. You should not add elements or attributes to this namespace.
Using these building blocks we can create new elements and attributes as required, enforce the required constraints on them, and keep them in some namespace. (See Figure 1.) XML Schema calls this particular namespace the {target namespace}, or the namespace where the newly created elements and attributes will reside.
Figure 1: Elements and attributes in XML Schema namespace are used to write an XML Schema document, which generates elements and attributes as defined by user and puts them in {target namespace}. This {target namespace} is then used to validate the XML instance.
This {target namespace} is referenced from the XML instance to ensure the validity of the instance document. (See Figure 2.) During validation, the Validator verifies that the elements/attributes used in the instance exist in the declared namespace, and also checks for any other constraint on their structure and datatype.
Figure 2: From XML Schema to XML Schema instance
In XML Schema we can choose to specify whether the instance document must qualify all the elements and attributes, or must qualify only the globally declared elements and attributes. Regardless of what we choose, the entire instance would be validated. So why do we have two choices?
The answer is "manageability." When we choose qualified, we are specifying that all the elements and attributes in the instance must have a namespace, which in turn adds namespace complexity to the instance. If, say, the schema is modified by making some local declarations global and/or making some global declarations local, then the instance documents are not affected at all. In contrast, when we choose unqualified, we are specifying that only the globally declared elements and attributes in the instance must have a namespace, which in turn hides the namespace complexity from the instance. But in this case, if, say, the schema is modified by making some local declarations global and/or making some global declarations local, then all instance documents are affected and the instance is no longer valid. The XML Schema Validator would report validation errors if we try to validate this instance against the modified XML Schema. Therefore, the namespaces must be fixed in the instance per the modification done in XML Schema to make the instance valid again.
<?xml version="1.0" encoding="US-ASCII"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://www.library.com"
        targetNamespace="http://www.library.com"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
  <element name="Book" type="tns:BookType" />
  <complexType name="BookType">
    <sequence>
      <element name="Title" type="string" />
      <element name="Author" type="string" />
    </sequence>
  </complexType>
</schema>
The declarations that are the immediate children of the element <schema> are the global declarations, and the rest are local declarations. In the above example, Book and BookType are declared globally whereas Title and Author are local declarations.
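Because a schema is itself XML, the global/local distinction can be read straight off its structure. The following is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption for illustration); it only inspects the schema document and does not perform any validation.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
ELEMENT = "{%s}element" % XSD

schema_text = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://www.library.com"
        targetNamespace="http://www.library.com"
        elementFormDefault="qualified"
        attributeFormDefault="unqualified">
  <element name="Book" type="tns:BookType" />
  <complexType name="BookType">
    <sequence>
      <element name="Title" type="string" />
      <element name="Author" type="string" />
    </sequence>
  </complexType>
</schema>"""

schema = ET.fromstring(schema_text)

# Immediate children of <schema> are the global declarations.
global_names = [child.get("name") for child in schema]

# Element declarations nested below a global declaration are local.
local_names = [
    el.get("name")
    for child in schema
    for el in child.iter(ELEMENT)
    if el is not child
]

print("global:", global_names)
print("local:", local_names)
```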
We can express the choice between qualified and unqualified by setting the schema element attributes elementFormDefault and attributeFormDefault to either qualified or unqualified.
elementFormDefault = (qualified | unqualified) : unqualified
attributeFormDefault = (qualified | unqualified) : unqualified
When elementFormDefault is set to qualified, it implies that in the instance of this grammar all the elements must be explicitly qualified, either by using a prefix or setting a {default namespace}. An unqualified setting means that only the globally declared elements must be explicitly qualified, and the locally declared elements must not be qualified. Qualifying a local declaration in this case is an error. Similarly, when attributeFormDefault is set to qualified, all attributes in the instance document must be explicitly qualified using a prefix.
Remember, {default namespace} doesn't apply to attributes; hence, we can't use a {default namespace} declaration to qualify attributes. Unqualified seems to imply being in the namespace by virtue of the containing element. This is interesting, isn't it?
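To make the two settings concrete, the sketch below shows what an instance of the Book schema above would look like under each value of elementFormDefault, and prints the namespace each element resolves to. It uses Python's standard xml.etree.ElementTree module (an assumption for illustration). Note that it does not validate anything, since the standard library has no XML Schema validator; both instance documents and the helper name names_with_namespaces are illustrative.

```python
import xml.etree.ElementTree as ET

def names_with_namespaces(text):
    """Return (local name, namespace or None) for every element, in document order."""
    result = []
    for element in ET.fromstring(text).iter():
        tag = element.tag
        if tag.startswith("{"):
            namespace, local = tag[1:].split("}", 1)
        else:
            namespace, local = None, tag
        result.append((local, namespace))
    return result

# Shape expected with elementFormDefault="qualified": every element is qualified.
qualified_instance = """<lib:Book xmlns:lib="http://www.library.com">
  <lib:Title>Sherlock Holmes</lib:Title>
  <lib:Author>Arthur Conan Doyle</lib:Author>
</lib:Book>"""

# Shape expected with elementFormDefault="unqualified": only the global
# element Book is qualified; the local elements Title and Author are not.
unqualified_instance = """<lib:Book xmlns:lib="http://www.library.com">
  <Title>Sherlock Holmes</Title>
  <Author>Arthur Conan Doyle</Author>
</lib:Book>"""

print(names_with_namespaces(qualified_instance))
print(names_with_namespaces(unqualified_instance))
```

In the unqualified form a prefix is used rather than a {default namespace}, because a default declaration on Book would also qualify Title and Author.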
In the following diagrams, the concept symbol space is similar to the non-normative concept of namespace partition. For example, if a namespace is like a refrigerator, then the symbol spaces are the shelves in the refrigerator. Just as shelves partition the entire space in a refrigerator, the symbol spaces partition the namespace.
There are three primary partitions in a namespace: one for global element declarations, one for global attribute declarations, and one for global type declarations (complexType/simpleType). This arrangement implies that a global element, a global attribute, and a global type can all have the same name and still coexist in a {target namespace} without any name collisions. Further, every global element and every global complexType has its own symbol space to contain its local declarations.
Let's examine the four possible combinations of values for the pair of attributes elementFormDefault and attributeFormDefault.
Case 1: elementFormDefault=qualified, attributeFormDefault=qualified
Here the {target namespace} directly contains all the elements and attributes; therefore, in the instance, all the elements and attributes must be qualified.
Case 2: elementFormDefault=qualified, attributeFormDefault=unqualified
Here the {target namespace} directly contains all the elements and the corresponding attributes for these elements are contained in the symbol space of the respective elements. Therefore, in the instance, only the elements must be qualified and the attributes must not be qualified, unless the attribute is declared globally.
Case 3: elementFormDefault=unqualified, attributeFormDefault=qualified
Here the {target namespace} directly contains all the attributes and only the globally declared elements, which in turn contain their child elements in their own symbol spaces. Therefore, in the instance, only the globally declared elements and all the attributes must be qualified.
Case 4: elementFormDefault=unqualified, attributeFormDefault=unqualified
Here the {target namespace} directly contains only the globally declared elements, which in turn contain their child elements in their own symbol spaces. Every element contains the corresponding attributes in its symbol space; therefore, in the instance, only the globally declared elements and attributes must be qualified.
The above diagrams are intended as a visual representation of what is directly contained in a namespace and what is transitively contained in a namespace, depending on the value of elementFormDefault/attributeFormDefault. The implication of this setting is that the elements/attributes directly in the {target namespace} must have a namespace associated with them in the corresponding XML instance, and the elements/attributes that are not directly (transitively) in the {target namespace} must not have a namespace associated with them in the corresponding XML instance.
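The four cases reduce to one rule: a declaration must be qualified in the instance if it is global, or if the relevant form default is qualified. The sketch below encodes that rule in Python (chosen for illustration); must_be_qualified is a hypothetical helper, and it deliberately ignores the per-declaration form attribute that XML Schema also allows on local declarations.

```python
def must_be_qualified(form_default, declared_globally):
    """Return True if the instance must namespace-qualify the element or attribute."""
    if form_default not in ("qualified", "unqualified"):
        raise ValueError("form_default must be 'qualified' or 'unqualified'")
    # Global declarations always sit directly in the {target namespace}.
    return declared_globally or form_default == "qualified"

# (elementFormDefault, attributeFormDefault) for Cases 1 to 4.
cases = [
    ("qualified", "qualified"),
    ("qualified", "unqualified"),
    ("unqualified", "qualified"),
    ("unqualified", "unqualified"),
]

# For each case: must local elements / local attributes be qualified?
rows = [
    (
        element_form,
        attribute_form,
        must_be_qualified(element_form, declared_globally=False),
        must_be_qualified(attribute_form, declared_globally=False),
    )
    for element_form, attribute_form in cases
]

for number, row in enumerate(rows, start=1):
    print("Case", number, row)
```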
Now we know that XML Schema creates the new elements and attributes and puts them in a namespace called the {target namespace}. But what if we don't specify a {target namespace} in the schema? When we don't specify the attribute targetNamespace at all, no {target namespace} exists, which is legal, but specifying an empty URI in the targetNamespace attribute is "illegal."
For example, the following is invalid. We can't specify an empty URI for the {target namespace}:
<schema targetNamespace="" . . .>
In this case, when no {target namespace} exists, we say, as described earlier, that the newly created elements and attributes are kept in {no namespace}. (It would have been incorrect to use the term {default namespace}.) To validate the corresponding XML instance, the instance must use the noNamespaceSchemaLocation attribute from the http://www.w3.org/2001/XMLSchema-instance namespace to refer to the XML Schema with no target namespace.
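An instance that points at a schema with no target namespace might look like the document in the sketch below. The file name book.xsd and the xsi prefix are assumptions for illustration. The code uses Python's standard xml.etree.ElementTree module and only reads the hint; it does not load the schema or validate against it.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# "book.xsd" is a hypothetical schema file with no targetNamespace attribute.
doc = """<Book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="book.xsd">
  <Title>Sherlock Holmes</Title>
  <Author>Arthur Conan Doyle</Author>
</Book>"""

root = ET.fromstring(doc)

# The attribute lives in the XMLSchema-instance namespace, so it must be
# looked up by its expanded name.
schema_hint = root.get("{%s}noNamespaceSchemaLocation" % XSI)

# The elements themselves are in {no namespace}.
print(root.tag, schema_hint)
```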
Hopefully, this overview of namespaces will help you move to XML Schema more easily. The Oracle XML Developer Kit (XDK) supports the W3C Namespaces in XML 1.0 Recommendation; you can turn the namespace check on or off using the JAXP APIs in the Oracle XDK by calling the setNamespaceAware(boolean) method in the SAXParserFactory and DocumentBuilderFactory classes.
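The effect of such a switch can be sketched outside of JAXP as well. The example below is not the Oracle XDK or JAXP API; it uses the comparable namespace feature of Python's standard xml.sax module, chosen only to illustrate what a parser reports with namespace processing off and on. NameRecorder and element_names are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

class NameRecorder(ContentHandler):
    """Collect element names as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: name is the raw "prefix:local".
        self.names.append(name)

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: name is (namespace, local name).
        self.names.append(name)

def element_names(text, namespace_aware):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespace_aware)
    handler = NameRecorder()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(text.encode("utf-8")))
    return handler.names

doc = ('<lib:Book xmlns:lib="http://www.library.com">'
       '<lib:Title>Sherlock Holmes</lib:Title></lib:Book>')

print(element_names(doc, namespace_aware=False))
print(element_names(doc, namespace_aware=True))
```

With namespace processing off, the prefix is just part of the name; with it on, the parser resolves the prefix and reports the namespace alongside the local name.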
Use the following resources to test the examples and to learn more about namespaces and XML Schema.
Download Oracle 10g XDK Production. Oracle XDK is a set of components, tools, and utilities that eases the task of building and deploying XML-enabled applications. Unlike many shareware and trial XML components, the production Oracle XDK is fully supported and comes with a commercial redistribution license.
Read the W3C XML Schema Primer. This document provides an easily readable description of the XML Schema facilities, and is oriented toward quickly understanding how to create schemas using the XML Schema language. XML Schema Part 1: Structures and XML Schema Part 2: Datatypes provide the complete normative description of the XML Schema language.
Bookmark the XML Technology Center. Whether you are a beginner, intermediate, or advanced XML user, the XML Center provides you up-to-date content and guidance to develop all types of XML and Web Service applications.