XMLBeans 2.0 - A Java Developer's Perspective

New Features in XMLBeans 2.0

Often, it is easier to get a feel for a product's new features by seeing them in action. We'd like to introduce some of the new features of XMLBeans by describing one of our own projects that uses them. As you may already know, XMLBeans is an Apache project, so it tracks bugs, features, and other issues using Atlassian's Jira issue-tracking and project-management application. BEA has an investment in XMLBeans as well as a standard of shipping high-quality software, which means BEA has an interest in the quality of projects like XMLBeans. Because XMLBeans is open source and uses Apache's common tools such as Jira, the question becomes how BEA can track XMLBeans quality metrics.

The project to uncover some of the new features in XMLBeans 2.0 was a response to the question: How can we easily gather quality metrics from Jira?

The following screen shot shows the main project page for XMLBeans. If you look on the right-hand side, under the Project Summary section, you have options for seeing issues related to some of the quality metrics we are concerned about.

Figure 1: XMLBeans Jira Project Page

One nice thing about Jira is that it provides different views of the issue data. In the following picture, look at the heading titled Current View. In the screenshot, the Browser view is currently selected, but other options include a print view, an XML view, and even an Excel spreadsheet view:

Figure 2: XMLBeans Jira Issue Navigator

Once we had become familiar with Jira and how XMLBeans tracks issues in it, we saw several ways to gather quality metrics. Our options included screen scraping HTML, parsing a spreadsheet, and getting XML from a URL. We decided it made the most sense to use the XML view, whose URL is provided by the XML link on the Issue Navigator page. The contents of the URL look something like the XML document below:

<?xml version="1.0" encoding="utf-8" ?>
<!--  RSS generated by JIRA 98 at Sun Dec 04 18:08:34 CET 2005 -->
<rss version="0.92">
    <title>ASF JIRA</title>
    <description>This file is an XML representation of some
      <title>[XMLBEANS-232] Fast Xml Infoset</title>
      <!-- left out for brevity -->
      <key id="12326193">XMLBEANS-232</key>
      <summary>Fast Xml Infoset</summary>
      <type id="4">Improvement</type>
      <priority id="3">Major</priority>
      <status id="1">Open</status>
      <reporter username="rrusin">Rafal
      <created>Wed, 30 Nov 2005 13:29:44 +0100
      <updated>Sat, 3 Dec 2005 18:15:10 +0100
        <comment author="dandiep" created="Sat, 3 Dec 2005
      18:15:10 +0100 (CET)" level="">
          <!-- ... -->
      <!-- left out for brevity -->

If we look at the snippet from the XML feed above, we see it's defined as an RSS feed. The first step we took was to find an XML Schema for RSS version 0.92 so we could compile the schema and use XMLBeans to parse the URL using XMLBeans' simple JavaBeans-like API. We never found an official schema, but we did find a specification and began creating a schema from there. Along the way, we found that the schema we created for the specification did not match the RSS feed we got from Jira. What were we to do? Our only real option was to create a schema just for this RSS feed, but that was time consuming and error prone. After doing a little more investigation, we stumbled upon the new inst2xsd feature.

Schema to Instance to Schema

The inst2xsd tool is available as a command-line utility, but you can also call its APIs programmatically. Its purpose is to take an XML instance and create a valid set of schemas from it. The tool is very configurable and provides options for specifying which design pattern to use (Russian Doll, Salami Slice, or Venetian Blind; see this set of Schema Design Guidelines for more information). The tool can also turn repeated values into enumerations and create types based on the lowest common denominator of data types.
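To make the enumeration idea more concrete, here is a minimal sketch in plain Java. It is not the inst2xsd implementation. The class name, the method name, and the rule itself (treat a value set as an enumeration when values repeat and the number of distinct values stays within a limit) are assumptions made purely for illustration. The limit of 10 echoes the default reported by the tool's usage text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; the real inst2xsd heuristic may differ.
class EnumerationSketch {
    // Mirrors the default shown by the tool's usage text.
    static final int DEFAULT_LIMIT = 10;

    // Returns the distinct values that could become xs:enumeration facets,
    // or an empty list when the values look free-form.
    // A limit of 0 stands in for "-enumerations never" (assumption).
    static List<String> enumerationCandidates(List<String> observed, int limit) {
        Set<String> distinct = new LinkedHashSet<>(observed);
        boolean valuesRepeat = distinct.size() < observed.size();
        if (limit > 0 && valuesRepeat && distinct.size() <= limit) {
            return new ArrayList<>(distinct);
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        // Priority values such as those in the Jira feed repeat across items.
        List<String> priorities = Arrays.asList("Major", "Minor", "Major");
        System.out.println(enumerationCandidates(priorities, DEFAULT_LIMIT));
        System.out.println(enumerationCandidates(priorities, 0));
    }
}
```

Under these assumptions, a field like priority would become an enumeration, while a field whose values never repeat would stay a plain simple type.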

As an example of creating lowest common denominator types, consider the value lcd:val. This text could be represented by several built-in XML Schema datatypes: the string-derived types (xsd:string, xsd:normalizedString, xsd:token, and so on) as well as the QName type. In this case, the inst2xsd feature determines the type by looking for a namespace declaration with a prefix of lcd. If the prefix is found, the type will be QName rather than one of the possible string-based types.
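The decision described above can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the tool's code: the class and method names are hypothetical, the name check is a simplified ASCII-only approximation of an XML NCName, and falling back to xs:string when the prefix is not declared is an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of choosing between QName and a string type.
class LcdTypeSketch {
    // Simplified prefix:localName shape (ASCII only); real XML names allow more characters.
    private static final Pattern QNAME_SHAPE =
            Pattern.compile("([A-Za-z_][A-Za-z0-9_.-]*):([A-Za-z_][A-Za-z0-9_.-]*)");

    // prefixesInScope stands for the namespace prefixes declared where the value appears.
    static String guessType(String text, Set<String> prefixesInScope) {
        Matcher m = QNAME_SHAPE.matcher(text);
        if (m.matches() && prefixesInScope.contains(m.group(1))) {
            return "xs:QName";
        }
        // Assumption: fall back to the most general string type.
        return "xs:string";
    }

    public static void main(String[] args) {
        Set<String> declared = new HashSet<>(Arrays.asList("lcd"));
        System.out.println(guessType("lcd:val", declared));          // prints xs:QName
        System.out.println(guessType("lcd:val", new HashSet<>()));   // prints xs:string
    }
}
```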

Let's take a look at what the results were for the RSS feed we received from Jira. If we had saved the feed to an instance titled jiraRssFeed.xml and placed the XMLBEANS_HOME\bin on our path, our workflow may have looked like the following:

Generates XMLSchema from instance xml documents.
Usage: inst2xsd [opts] [instance.xml]*
Options include:
    -design [rd|ss|vb] - XMLSchema design type
        rd  - Russian Doll Design - local elements and local types
        ss  - Salami Slice Design - global elements and local
        vb  - Venetian Blind Design (default) - local elements and
              global complex types
    -simple-content-types [smart|string] - Simple content types
                       detection (leaf text). Smart is the default
    -enumerations [never|NUMBER] - Use enumerations. Default 
                                   value is 10.
    -outDir [dir] - Directory for output files. Default is '.'
    -outPrefix [file_name_prefix] - Prefix for output file names.
                                    Default is 'schema'
    -validate - Validates input instances against generated
    -verbose - print more informational messages
    -license - print license information
    -help - help information

/home/user>inst2xsd jiraRssFeed.xml -enumerations never -design rd -verbose -validate
# this generates a schema named schema0.xsd

This will produce a file named schema0.xsd (the name is configurable), and the schema will look similar to the snippet below:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema attributeFormDefault="unqualified"
  <xs:element name="rss">
    <xs:annotation>
      <xs:documentation>RSS generated by JIRA 98...
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="channel">
          <xs:complexType>
            <xs:sequence>
              <xs:element type="xs:string" name="title"/>
              <xs:element type="xs:anyURI" name="link"/>
              <xs:element type="xs:string" name="description"/>
              <xs:element type="xs:string" name="language"/>
              <xs:element name="item" maxOccurs="unbounded"

From the snippet, we see that all of the elements we need for the Jira RSS feed have been defined.

If you want to work the other way, starting from an XML Schema, the latest release of XMLBeans provides this ability. The xsd2inst tool is a way for you to create a sample document from a schema and a global element; the instance will contain values for simple types. Using both of these tools makes working with XML instances and schemas much simpler.

At this point in our project, we have a schema we can use along with the scomp utility to create an XMLBeans type jar and get started working on our business logic and the quality metrics we have been trying to gather.

We know from looking at the Jira RSS feed instance that the bug details we care about are in an element named "item," and the resulting schema makes the item element an array. This means if we want to get information that may be occurring in all items, we will need to iterate through all items. Let's take a look at how we could do this with some code. In the following code, we are going to get all the issues opened by a user with the name specified as a method parameter:

import java.io.IOException;
import java.net.URL;
import java.util.Vector;
import org.apache.xmlbeans.XmlException;

// RssDocument and its nested types are generated by scomp from schema0.xsd
public Vector getItemsFromReporter(String reporter)
    throws IOException, XmlException {
  // Get the Jira RSS feed instance from a URL
  URL jiraFeedUrl = new URL("<JiraFeedURL>");

  // Get instance objects
  RssDocument rssDoc = RssDocument.Factory.parse(jiraFeedUrl);
  RssDocument.Rss rss = rssDoc.getRss();
  RssDocument.Rss.Channel channel = rss.getChannel();

  // We will use this object to get most of our data
  RssDocument.Rss.Channel.Item[] items = channel.getItemArray();

  // We will store all of the matching items in a vector
  Vector results = new Vector();

  for (int i = 0; i < items.length; i++) {
    RssDocument.Rss.Channel.Item item = items[i];

    // Add item to results vector when its reporter's username matches
    if (item.getReporter().getUsername().compareTo(reporter) == 0) {
      results.add(item);
    }
  }

  return results;
}

As you can see, this is very clean Java. However, there are performance implications in using this code when the number of items grows large. In the latest release of XMLBeans, two new features were created to help with just these kinds of issues. The first is support for JDK 5.0 generics, and the second is better support for XPath and XQuery. Let's look at how we can use generics with XMLBeans.
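To give a rough feel for why a query language helps with this kind of filtering, the sketch below expresses the same "items reported by a given user" selection as a single XPath expression. It deliberately uses the JDK's own javax.xml.xpath API rather than XMLBeans, so it says nothing about how XMLBeans exposes XPath. The class name, method name, and sample document are hypothetical; the second item in the sample is invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration using JDK XPath, not the XMLBeans API.
class ReporterFilterSketch {
    // Tiny stand-in for the Jira feed; the second item is made up.
    static final String SAMPLE =
        "<rss version=\"0.92\"><channel><title>ASF JIRA</title>"
      + "<item><key>XMLBEANS-232</key><reporter username=\"rrusin\">R</reporter></item>"
      + "<item><key>EXAMPLE-1</key><reporter username=\"someone\">S</reporter></item>"
      + "</channel></rss>";

    // Returns the issue keys of all items whose reporter has the given username.
    static List<String> keysReportedBy(String rssXml, String reporter) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the username as an XPath variable to avoid quoting problems.
            xpath.setXPathVariableResolver(
                    name -> "reporter".equals(name.getLocalPart()) ? reporter : null);
            NodeList keys = (NodeList) xpath.evaluate(
                    "/rss/channel/item[reporter/@username = $reporter]/key",
                    doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < keys.getLength(); i++) {
                result.add(keys.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not query the feed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(keysReportedBy(SAMPLE, "rrusin"));
    }
}
```

The selection logic lives in the expression instead of a hand-written loop, which is the general appeal of the XPath and XQuery support mentioned above.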
