Incremental Compilation and Error Handling in XMLBeans

by Hetal Shah
04/18/2007

Abstract

XMLBeans version 2.1.0 provides programmatic access to create and update XMLBeans from schema artifacts, and to validate and capture errors that may occur during schema compilation and parsing of an XML document. In enterprise applications ranging from Web service clients and servers to CRM and EAI products, the need to update existing schema artifacts arises as the products mature. Moreover, everyone wants to simplify the maintenance of applications that process XML documents based on these schema artifacts. The XMLBeans API provides a simple and efficient way to create and maintain XMLBeans, and it offers rich custom error handling and reporting capabilities. Both help solve these problems.

This article illustrates these features with a series of examples. It assumes you are familiar with XMLBeans. For an XMLBeans primer, see the References section. The example code and other files mentioned in this article are available for download.

Introduction

Java-based enterprise applications often use Java-XML binding libraries as an underlying layer to access and process XML data in a familiar, Java-friendly way. XMLBeans has become popular as a Java-XML binding solution because of its unique features such as lazy unmarshalling, cursor-based access to XML data, and support for XQuery. You create XMLBeans by running scomp on an XML schema; however, rerunning scomp by hand to create a new set of XMLBeans each time the schema documents change is not a sophisticated approach. Because a substantial part of the cost of any enterprise application is its maintenance, programmatic access to schema compilation and XMLBeans generation can represent significant savings in cost and time in the long run.

XMLBeans are composed of a set of Java binding classes and a set of XSB files containing binary schema metadata compiled from the schema documents. Each of the XSB files represents a compiled schema type, an attribute, or an element definition. The org.apache.xmlbeans package declares the following two interfaces to work with compiled schema definitions:

  • SchemaTypeLoader - this interface represents a searchable set of compiled schema definitions, and it is frequently consulted to resolve wildcards and xsi:type attributes
  • SchemaTypeSystem - this interface extends SchemaTypeLoader and provides features to enumerate all the available compiled schema definitions

The XmlBeans class declares a static method getBuiltinTypeSystem() that returns the precompiled, built-in schema types. Normally, the SchemaTypeLoader instance being used is the context type loader returned from XmlBeans.getContextTypeLoader(). The context type loader reads the compiled schema definitions available on the classpath and loads them into its searchable set of schema definitions. If you wish to use a different SchemaTypeLoader, you can call XmlBeans.loadXsd(XmlObject[]), which returns a SchemaTypeLoader object loaded with the compiled schema definitions declared inside the schema documents passed in as XmlObject[]. Another static method of the XmlBeans class, typeLoaderUnion(SchemaTypeLoader[]), returns a union of type loaders.

With this background established, let's now look at the org.apache.xmlbeans.XmlBeans class, which provides the following additional methods to compile XML schema documents:

  • SchemaTypeSystem compileXsd(XmlObject[], SchemaTypeLoader, XmlOptions) - compiles the given XML schema documents and returns a SchemaTypeSystem object loaded with the schema definitions declared in those schema documents
  • SchemaTypeSystem compileXsd(SchemaTypeSystem, XmlObject[], SchemaTypeLoader, XmlOptions) - returns a SchemaTypeSystem object updated with the given XML schema documents
  • SchemaTypeSystem compileXmlBeans(String, SchemaTypeSystem, XmlObject[], BindingConfig, SchemaTypeLoader, Filer, XmlOptions) - updates the given SchemaTypeSystem object with the XML schema documents, and optionally allows you to create XMLBeans (that is, Java and XSB files) controlled by optional binding configuration parameters

The compileXmlBeans() method provides more functionality than the overloaded compileXsd() methods; therefore, compileXsd() is not discussed further in this article. Let's look at the parameters of the compileXmlBeans() method:

  • String name - an optional argument to name the compiled schema type system; if null, a random value is used
  • SchemaTypeSystem system - an optional argument that represents a set of already compiled schema definitions
  • XmlObject[] schemas - an array representing schema documents
  • BindingConfig config - an optional argument to provide configuration information during code generation; for a primer on XMLBeans configuration, see Configuring XMLBeans (Dev2Dev, November 2004)
  • SchemaTypeLoader typepath - an optional argument that, if provided, will be consulted for already compiled schema definitions; if it is not specified, the context type loader is used
  • Filer filer - an optional argument used to create Java and XSB files of XMLBeans; if null, XMLBeans won't be created and the config parameter is not used
  • XmlOptions options - an optional argument to specify validation behavior and error listener

Schema Compilation

Now that I have described how useful programmatic access to XMLBeans can be and what the main methods for compiling beans are, let's put it all into practice by compiling the sample schema location.xsd. I will compile location.xsd, and print global elements and attributes declared in it. Listing 1 shows how to do this.

Listing 1: This excerpt from CompileSchemaDefinitions.java compiles a schema and dumps the schema's global elements and attributes to System.out.

// Parse the schema file into an XmlObject
XmlObject[] schemaObj = new XmlObject[]
    {XmlObject.Factory.parse(new File(args[0]))};

// Compile in memory only: no name, no Filer, default options
SchemaTypeSystem schemaTypeObj =
    XmlBeans.compileXmlBeans(null, null, schemaObj,
                             null, null, null, null);

// get list of global elements
SchemaGlobalElement[] globalElementsArray =
    schemaTypeObj.globalElements();
...
// get list of global attributes
SchemaGlobalAttribute[] globalAttributesArray =
    schemaTypeObj.globalAttributes();
...

Here, the schema is parsed by the XmlObject.Factory.parse() method, which returns an XmlObject instance. This instance is placed in an array of size one, which is passed to compileXmlBeans() for in-memory compilation of the schema. In subsequent statements, the SchemaTypeSystem object returned from compileXmlBeans() is used to get the lists of global elements and attributes declared in the sample schema.
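For readers who want to try the call without the sample files, the same steps can be sketched against an inline schema string. This is a minimal sketch, not the article's sample code: the class name InlineSchemaCompile, the SCHEMA constant, and the example.com namespace are assumptions made for illustration, and the schema is only a stand-in for location.xsd.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.xmlbeans.SchemaGlobalElement;
import org.apache.xmlbeans.SchemaTypeSystem;
import org.apache.xmlbeans.XmlBeans;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

public class InlineSchemaCompile {

    // Hypothetical stand-in for location.xsd (the real sample schema is in the download).
    static final String SCHEMA =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='http://example.com/location'"
      + " elementFormDefault='qualified'>"
      + "<xsd:element name='Weather' type='xsd:string'/>"
      + "<xsd:element name='Latlong' type='xsd:string'/>"
      + "</xsd:schema>";

    /** Compiles the schema text in memory and returns the local names of its global elements. */
    static List<String> globalElementNames(String schemaText) throws XmlException {
        XmlObject[] schemas = { XmlObject.Factory.parse(schemaText) };

        // Null name, system, config, typepath, filer, and options:
        // compile in memory only, resolving against the context type loader.
        SchemaTypeSystem sts =
            XmlBeans.compileXmlBeans(null, null, schemas, null, null, null, null);

        List<String> names = new ArrayList<String>();
        for (SchemaGlobalElement element : sts.globalElements()) {
            names.add(element.getName().getLocalPart());
        }
        return names;
    }

    public static void main(String[] args) throws XmlException {
        System.out.println(globalElementNames(SCHEMA));
    }
}
```

The order in which globalElements() returns the declarations should not be relied on, so the sketch only collects the names.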

Generating XMLBeans

Now that you have seen how to compile a schema, the next logical step is to see how XMLBeans files are created from the schema with the help of the org.apache.xmlbeans.Filer interface. Java applications invoking compileXmlBeans() to generate XMLBeans must pass an instance of a concrete class implementing the following two methods of the Filer interface in order to define where the XSB and Java files of XMLBeans should be created and written:

  • OutputStream createBinaryFile(String typename) - returns a java.io.OutputStream object reference for writing binary content of an XSB file
  • Writer createSourceFile(String typename) - returns a java.io.Writer object reference for writing source code to a Java file

Of course, the generated Java files must later be compiled and packaged in a JAR file along with the XSB files. The sample code for this article contains a concrete class, FilerImpl.java, that implements the Filer interface. It also contains several helper methods: compileJavaFiles() to compile the generated Java files, makeJarFile() to package all the generated files into a JAR file, extractJarFile() to extract the contents of a JAR file to a folder, and prependToClassPath() to prepend a path to the java.class.path system property at runtime.

Listing 2: This is an excerpt of the sample concrete class FilerImpl.java implementing the Filer interface.

public class FilerImpl implements Filer {
...
  FilerImpl(String folderPath) {
    this.folderPath = folderPath;
  }

  // Called by XMLBeans for each XSB file it writes
  public OutputStream createBinaryFile(String name)
      throws IOException {
    File fileObj = new File(folderPath, name);
...
    return new FileOutputStream(fileObj);
  }

  // Called by XMLBeans for each Java source file it writes
  public Writer createSourceFile(String name)
      throws IOException {
...
    File fileObj =
        new File(folderPath, name + ".java");
...
    return new FileWriter(fileObj);
  }

  public boolean compileJavaFiles() {
...
    // Compile each generated source file into folderPath
    for (int i = 0; i < javaFilePaths.size(); i++) {
      String[] args = new String[] {
          "-classpath",
          System.getProperty("java.class.path", "."),
          "-d", folderPath,
          (String) javaFilePaths.elementAt(i)};
      status = javac.compile(args);
    }
...
  }

  public boolean makeJarFile(String jarFile)
      throws IOException {
...
    jarHelperObj.jarDir(new File(folderPath),
                        fileObj);
...
  }

  public boolean extractJarFile(String jarFile,
                                String destDir)
      throws IOException {
...
    jarHelperObj.unjarDir(fileObj, dirObj);
...
  }

  public String prependToClassPath(String filePath)
      throws IOException {
    String classPath =
        System.getProperty("java.class.path", ".");
    // File.pathSeparator is ";" on Windows and ":" on Unix-like systems
    System.setProperty("java.class.path",
        filePath + File.pathSeparator + classPath);
...
  }
}
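The helper behind makeJarFile() is elided in Listing 2. As a rough idea of what such a helper might do, here is a minimal sketch that packages a folder with the standard java.util.jar classes. The class name JarDirSketch and its method signature are assumptions for illustration; this is not the jarHelperObj implementation from the sample download.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class JarDirSketch {

    /** Packages every file under sourceDir into jarFile and returns the number of entries written. */
    static int jarDir(File sourceDir, File jarFile) throws IOException {
        // The JAR may be created inside sourceDir (as in outputDIR\locationXB.jar),
        // so remember it and avoid adding the archive to itself.
        File skip = jarFile.getCanonicalFile();
        JarOutputStream out = new JarOutputStream(new FileOutputStream(jarFile));
        try {
            return addFiles(sourceDir, "", out, skip);
        } finally {
            out.close();
        }
    }

    private static int addFiles(File dir, String prefix, JarOutputStream out, File skip)
            throws IOException {
        File[] children = dir.listFiles();
        if (children == null) {
            return 0; // not a directory, or not readable
        }
        int count = 0;
        for (File child : children) {
            if (child.getCanonicalFile().equals(skip)) {
                continue;
            }
            if (child.isDirectory()) {
                // JAR entry names always use "/" regardless of platform
                count += addFiles(child, prefix + child.getName() + "/", out, skip);
            } else {
                out.putNextEntry(new JarEntry(prefix + child.getName()));
                InputStream in = new FileInputStream(child);
                try {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                } finally {
                    in.close();
                }
                out.closeEntry();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        int entries = jarDir(new File(args[0]), new File(args[1]));
        System.out.println(entries + " entries written to " + args[1]);
    }
}
```

Keeping the folder structure in the entry names matters here, because XMLBeans looks up XSB files by their path under schemaorg_apache_xmlbeans.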

Now to compile the sample schema location.xsd and create XMLBeans from it, I will pass an instance of FilerImpl to compileXmlBeans(), as shown in Listing 3.

Listing 3: This excerpt from CreateXMLBeans.java passes an instance of FilerImpl to the compileXmlBeans() method for generating XMLBeans from a schema document.



// Generated Java and XSB files are written under outputDIR
FilerImpl flrObj =
    new FilerImpl("outputDIR");
XmlObject[] schemaObj = new XmlObject[]
    {XmlObject.Factory.parse(new File(args[0]))};
SchemaTypeSystem schemaTypeObj =
    XmlBeans.compileXmlBeans("location", null,
        schemaObj, null, null, flrObj, null);

// Compile the generated sources and package everything in a JAR
flrObj.compileJavaFiles();
flrObj.makeJarFile("outputDIR\\locationXB.jar");

CreateXMLBeans.java creates the outputDIR folder in the directory where it is run. This folder will contain locationXB.jar, along with two subfolders: com, containing Java and class files, and schemaorg_apache_xmlbeans, containing XSB files. The locationXB.jar file can be verified by successfully parsing a valid XML instance document, as shown in the sample WeatherUnmarshal.java from sample.zip.

Updating XMLBeans

XML schemas change over time, either because the data in an enterprise application evolves or because a Web service interface is upgraded. With each new version of an XML schema, the corresponding XML instance documents and the applications processing those documents need to be updated. If changes are made in the original schema document, then updating the corresponding XMLBeans programmatically is simple and similar to Listing 3 discussed in the previous section: recompile the entire schema and recreate the XMLBeans files from it.

XMLBeans also supports incremental compilation, which recompiles only what is necessary to bring the beans up to date with the schema in the shortest time possible. Existing beans can be updated from schema artifacts that contain new schema definitions, modifications to existing definitions, or deletions of already compiled definitions.

Incremental Compilation

Now I will add a new global element, Latlong, declared in the sample location_add.xsd, and apply a modified definition of the Weather element, declared in the sample artifact location_modify.xsd, to the beans in locationXB.jar, which were generated from the location.xsd schema in the Generating XMLBeans section.

Listing 4: This excerpt from location_modify.xsd shows a modified Weather element definition adding two local elements, FeelsLike and Winds.

...
 <xsd:element name="Weather">
  <xsd:complexType>
   <xsd:sequence>
...
    <xsd:element name="FeelsLike"
         type="xsd:float"/>
...
    <xsd:element name="Visibility"
         type="xsd:float"/>
...
  </xsd:complexType>
 </xsd:element>

The excerpt shown in Listing 5 extracts the JAR file containing the existing XMLBeans and prepends the extraction folder to the java.class.path system property so that the context type loader can load the compiled schema definitions from it. It then combines these compiled schema definitions with the precompiled, built-in schema types returned from XmlBeans.getBuiltinTypeSystem() by calling the XmlBeans.typeLoaderUnion() method. This union of schema definitions is assigned to the variable stl, which is passed to compileXmlBeans() to update the XMLBeans with the schema artifacts.

Listing 5: An excerpt from IncrementalCompilation.java



class IncrementalCompilation {
...
  public static void main(String args[]) {
...
    // Unpack the existing beans and make them visible on the classpath
    flrObj.extractJarFile(
        args[oldJARFileArgPosition],
        args[outputFolderPathPosition]);
    flrObj.prependToClassPath(
        args[outputFolderPathPosition]);

    // Existing compiled definitions plus the built-in schema types
    SchemaTypeLoader stl =
        XmlBeans.typeLoaderUnion(
            new SchemaTypeLoader[] {
                XmlBeans.getContextTypeLoader(),
                XmlBeans.getBuiltinTypeSystem() });
...
    // Parse the schema artifacts in the order given on the command line
    for (int i = updatedXSDFileArgPosition, j = 0;
         j < updatedXSDDefinitionsSize;
         i++, j++) {
      updatedXSDObj[j] =
          XmlObject.Factory.parse(
              new File(args[i]), options);
    }
    augSTSObj = XmlBeans.compileXmlBeans(null,
        null, updatedXSDObj, null,
        stl, flrObj, options);
    }

    flrObj.compileJavaFiles();
    flrObj.makeJarFile(args[newJARFileArgPosition]);
...
}

IncrementalCompilation.java takes the following command-line arguments: the output folder where FilerImpl creates the XSB and Java files, the JAR file in which to package the updated beans, the JAR file of the existing beans, and the schema artifacts, listed in the order they should be applied to the existing beans during incremental compilation. Here is an example of calling IncrementalCompilation.java from the command line, where the schema definitions declared in location_add.xsd are compiled first, followed by the schema definitions declared in location_modify.xsd, to update the beans in outputDIR\locationXB.jar. The updated beans are then packaged in outputDIR\locationXB_IncrementalCompilation.jar:

java -classpath %CLASSPATH%;.\outputDIR\locationXB.jar;
  IncrementalCompilation outputDIR
  outputDIR\locationXB_IncrementalCompilation.jar
  outputDIR\locationXB.jar location_add.xsd location_modify.xsd

The JAR file created by IncrementalCompilation.java can be verified by successfully parsing the weather or latlong XML instance documents, as shown in WeatherUnmarshal.java and LatlongUnmarshal.java included in the sample download. The sample also includes UpdateUsingXSD.java, which demonstrates incremental compilation applied to the original XML schema. In UpdateUsingXSD.java, compileXmlBeans() is called first to compile the original schema file and then again for incremental compilation of the artifacts containing modified schema definitions.
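Changing the java.class.path property at run time does not by itself change what an already created class loader can see, which is presumably why the command line above also lists the existing JAR on the classpath. One alternative, sketched here under that assumption, is to build a type loader directly over the JAR with a URLClassLoader and XmlBeans.typeLoaderForClassLoader(). The class name JarTypeLoaderSketch and the method loaderFor() are made up for this illustration and are not part of the sample code.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

import javax.xml.namespace.QName;

import org.apache.xmlbeans.SchemaTypeLoader;
import org.apache.xmlbeans.XmlBeans;

public class JarTypeLoaderSketch {

    /** Builds a type loader over one beans JAR plus the built-in schema types. */
    static SchemaTypeLoader loaderFor(File beansJar) throws MalformedURLException {
        URL[] urls = { beansJar.toURI().toURL() };
        ClassLoader jarLoader =
            new URLClassLoader(urls, JarTypeLoaderSketch.class.getClassLoader());

        return XmlBeans.typeLoaderUnion(new SchemaTypeLoader[] {
            XmlBeans.typeLoaderForClassLoader(jarLoader),
            XmlBeans.getBuiltinTypeSystem()
        });
    }

    public static void main(String[] args) throws Exception {
        SchemaTypeLoader stl = loaderFor(new File(args[0]));

        // The union resolves built-in types as well as whatever the JAR provides.
        QName xsString = new QName("http://www.w3.org/2001/XMLSchema", "string");
        System.out.println("xsd:string resolved: " + (stl.findType(xsString) != null));
    }
}
```

The resulting loader could then be passed as the typepath argument of compileXmlBeans() in place of stl.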

XML Schema Validation

XMLBeans validates the schema document when it is compiled using scomp, the Ant task, or the compileXsd() and compileXmlBeans() method calls. However, when the schema is compiled programmatically, the XMLBeans API provides more control over validation. Schema validation can be turned off by calling setCompileNoValidation() on the XmlOptions instance passed to compileXmlBeans() or to the overloaded compileXsd() methods.

If a schema is not valid, then XMLBeans throws an XmlException. Because applications in a real production system should provide friendly error messages that give error details and instruct the user how to proceed, the XmlOptions class allows an application to specify a Collection instance to capture and store validation errors returned as XmlError objects. An XmlError instance represents an error at a specific location in an XML document, with error details such as the line number and column number of the offending location, a brief description, and the severity of the error. The XmlOptions class provides the following helpful methods for validation errors:

  • setErrorListener(Collection) - sets a collection instance to capture and store errors that occur during validation
  • setLoadLineNumbers() - enables marking a line number for start tags during the creation of an XmlObject instance from schema or from an XML document; later, it is used to set the line number of an offending location in XmlError instances
  • setLoadLineNumbers(String tag) - enables marking a line number for start and end tags during the creation of XmlObject instance from schema or an XML document, based on the value of tag parameter

Now that you are familiar with the validation and error handling APIs, let's put them to use by parsing an invalid sample schema, location_invalid.xsd. I'll show you how to handle validation errors and generate custom error messages en route. Listing 7 shows how to catch the XmlException thrown during schema compilation and then pass the collection of validation errors to a CustomErrorHandlingUtil class for custom error handling and reporting purposes.

Listing 7: An excerpt from CustomErrorSample.java

class CustomErrorSample {

  public static void main(String args[]) {

    Collection errorList = new ArrayList();

    // Options used while parsing the schema document
    XmlOptions parseOptionsObj = new XmlOptions();
    parseOptionsObj.setLoadLineNumbers();
    parseOptionsObj.setLoadLineNumbers(
        XmlOptions.LOAD_LINE_NUMBERS_END_ELEMENT);

    // Options used while compiling the schema
    XmlOptions loadOptionsObj = new XmlOptions();
    loadOptionsObj.setErrorListener(errorList);

    try {
      XmlObject[] schemaObj = new XmlObject[]
          {XmlObject.Factory.parse(
              new File(args[0]), parseOptionsObj)};
      SchemaTypeLoader schemaTypeObj =
          XmlBeans.loadXsd(schemaObj,
              loadOptionsObj);

    } catch (XmlException xme) {
      CustomErrorHandlingUtil eUtilObj =
          new CustomErrorHandlingUtil();
      eUtilObj.handleXMLException(xme);
      Collection userErrors =
          (Collection) loadOptionsObj.get(
              XmlOptions.ERROR_LISTENER);
      eUtilObj.handleErrorCollection(
          userErrors);
    }
...
}

In the above code, two different instances of XmlOptions are used only for demonstration purposes; you can use a single instance of XmlOptions to perform the combined work. parseOptionsObj is used during the creation of XmlObjects from a schema document. loadOptionsObj is used to set an error listener for storing validation errors during schema compilation. If parseOptionsObj were not passed to the XmlObject.Factory.parse() call, the validation errors would have their line and column numbers set to -1.
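The single-instance variant mentioned above might look like the following sketch. It compiles a deliberately invalid inline schema rather than location_invalid.xsd, and the class name SchemaErrorCapture, the INVALID_SCHEMA constant, and the report format are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.xmlbeans.XmlBeans;
import org.apache.xmlbeans.XmlError;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlOptions;

public class SchemaErrorCapture {

    // Hypothetical invalid schema: the element refers to a type that is never defined.
    static final String INVALID_SCHEMA =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>\n"
      + "  <xsd:element name='Weather' type='NoSuchType'/>\n"
      + "</xsd:schema>";

    /** Compiles the schema text and returns one formatted line per captured error. */
    static List<String> compileAndCollect(String schemaText) throws XmlException {
        List<XmlError> errors = new ArrayList<XmlError>();

        // One XmlOptions instance carries both the line-number and the listener settings.
        XmlOptions options = new XmlOptions();
        options.setLoadLineNumbers();
        options.setErrorListener(errors);

        // A document that is not well-formed XML fails here and is propagated to the caller.
        XmlObject[] schemas = { XmlObject.Factory.parse(schemaText, options) };

        try {
            XmlBeans.compileXmlBeans(null, null, schemas, null, null, null, options);
        } catch (XmlException e) {
            // The details have already been added to the errors collection.
        }

        List<String> report = new ArrayList<String>();
        for (XmlError error : errors) {
            report.add("line " + error.getLine() + ", column " + error.getColumn()
                + ": " + error.getMessage());
        }
        return report;
    }

    public static void main(String[] args) throws XmlException {
        for (String line : compileAndCollect(INVALID_SCHEMA)) {
            System.out.println(line);
        }
    }
}
```

Because the same options object is passed to both parse() and compileXmlBeans(), the reported errors can carry the line numbers recorded during parsing.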

XML Document Validation

By default, XMLBeans does not perform complete validation of an XML instance document during parsing or while updating XMLBeans instances. However, an application can invoke one of the overloaded validate() methods of XmlObject to force validation:

  • boolean validate() - returns true if the XmlObject instance on which this method is called conforms to its corresponding schema definition
  • boolean validate(XmlOptions) - validates using an instance of XmlOptions and returns true if the XmlObject instance on which this method is called conforms to its corresponding schema definition

The XmlOptions class provides the following two methods to control the validation during validate(XmlOptions) method execution:

  • setValidateTreatLaxAsSkip() - skips validation of elements matched by a wildcard whose processContents attribute is set to lax
  • setValidateOnSet() - values will be verified on each of the getter and setter methods of XMLBeans representing simple types only. In the event of an invalid value, an exception is thrown. Note that an error listener cannot be used with this method, and changes made using XmlCursor are not validated by this method.

As an example, say the validate() method is called on a bean instance of global element Weather. The scope of validation spans the entire Weather element, including its child elements. However, if the validate() method is invoked on a bean instance of one of the local elements, say Temperature, then only an instance bean representing the Temperature element is validated. Listing 8 shows how to validate an entire XML instance document and capture the validation errors for reporting.

Listing 8: An excerpt of WeatherUnmarshal.java

public class WeatherUnmarshal {
 public static void main(String args[]) {

  ArrayList errorList = new ArrayList();
  ...
  optionsObj.setErrorListener(errorList);
  try {
   ...
   WeatherDocument weatherDoc =
       WeatherDocument.Factory.parse(
           inputXMLFile, optionsObj);

   if (weatherDoc.validate(optionsObj)) {
    ....
   } else {
    // Invalid XML document: report the captured errors
    ...
    CustomErrorHandlingUtil eUtilObj =
        new CustomErrorHandlingUtil();
    eUtilObj.handleErrorCollection(errorList);
   }
 }
}

If the XML instance document is invalid, then the custom error handling is invoked using the helper class CustomErrorHandlingUtil.
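The same validate-and-collect pattern can be tried without generated bean classes by compiling a schema in memory and parsing the instance through the resulting type loader. The following is a minimal sketch under that assumption; the class name InstanceValidation and the one-element Temperature schema are hypothetical and are not taken from the sample download.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.xmlbeans.SchemaTypeLoader;
import org.apache.xmlbeans.SchemaTypeSystem;
import org.apache.xmlbeans.XmlBeans;
import org.apache.xmlbeans.XmlError;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlOptions;

public class InstanceValidation {

    // Hypothetical one-element schema used only for this illustration.
    static final String SCHEMA =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='Temperature' type='xsd:float'/>"
      + "</xsd:schema>";

    /** Parses the instance text against SCHEMA and returns the validation errors found. */
    static List<XmlError> validate(String xmlText) throws XmlException {
        // Compile the schema in memory; no Filer, so no files are generated.
        SchemaTypeSystem sts = XmlBeans.compileXmlBeans(
            null, null, new XmlObject[] { XmlObject.Factory.parse(SCHEMA) },
            null, XmlBeans.getBuiltinTypeSystem(), null, null);

        // Look up element definitions in the compiled schema first, then the built-ins.
        SchemaTypeLoader loader = XmlBeans.typeLoaderUnion(
            new SchemaTypeLoader[] { sts, XmlBeans.getBuiltinTypeSystem() });

        List<XmlError> errors = new ArrayList<XmlError>();
        XmlOptions options = new XmlOptions();
        options.setLoadLineNumbers();
        options.setErrorListener(errors);

        // A null type lets the loader pick the document type from the root element.
        XmlObject doc = loader.parse(xmlText, null, options);
        doc.validate(options);
        return errors;
    }

    public static void main(String[] args) throws XmlException {
        for (XmlError error : validate("<Temperature>warm</Temperature>")) {
            System.out.println("line " + error.getLine() + ": " + error.getMessage());
        }
    }
}
```

Parsing succeeds even for the invalid value because the document is well-formed; the problem only surfaces when validate() is called.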

Download

The examples have been tested with Apache XMLBeans version 2.1.0 and JDK version 1.5.0_06, and contain sample schema, Java code, and XML files used in this article:

  • Download the code in this article.

Summary

This article began by discussing the challenges of automating and simplifying the maintenance of enterprise applications processing XML documents as the underlying schema evolves. We learned about the programmatic access provided by the XMLBeans API to create and update XMLBeans when the XML schema is changed. Along the way I showed that XMLBeans' support for incremental compilation, which facilitates team-based design and reduces compile time, results in a significant gain when an enterprise application becomes larger and more complex.

I also looked at the validation and error handling features of XMLBeans for schemas and XML instance documents. Validation features of XMLBeans provide fine control over the compile time validation of schema and validation of XML documents and bean instances. XMLBeans error handling features are available for customizing the error message reporting.

References

Hetal Shah is an IT consultant who is highly passionate about Internet-related technologies.