Long-Term Persistence for JavaBeans

By Philip Milne & Kathy Walrath    

(Note: After you've read this article, please see the update.)

At JavaOne '99 we gave a preliminary talk on a new persistence model for Swing that would allow Swing user interfaces to be "serialized" as XML documents. As we found out from the BOF sessions after the talks, there was a great deal of interest in this topic, much of it on the more general problem of archiving graphs of JavaBeans as XML documents.

Since then we've been working with the IDE vendors to generalize and refine these techniques to deal with the practical issues that arise in saving real designs as constructed by commercial tools. The recent surge of interest in Swing IDEs has brought the question of interoperability to the fore. At the heart of this issue is the question of persistence and how a design can be saved in a format that is not tied to the tool that created it.

Now it's time to solicit feedback from developers working outside of the IDE space. We invite you to:


Design and Implementation of the Persistence Model


This section describes the following aspects of the design and implementation of the persistence model:


  • Resilience to changes in the versions of both VMs and class libraries.
  • Fault tolerance, allowing an archive to load even when part of it is damaged.
  • Archives exclusively written in terms of public APIs, not private implementations.
  • Textual (ASCII) output that can be edited with standard tools (.xml and .java output options).
  • Comprehensive use of defaults (redundancy elimination) to minimize file size.
  • Performance that varies linearly with the number of nodes in the graph.

A Note on Marshaling vs. Archiving

There are two common approaches to the problem of persisting graphs of objects:
  • Recording all state in an object graph, including non-public state.
  • Recording all state that can be reconstituted using the public APIs of the objects in the graph.
The simplest scheme taking the first approach requires the inclusion of all the classes that define the objects, which is too expensive. The practical alternative, which is to refrain from serializing the bytecodes that define the classes themselves, is workable between identical implementations of the same libraries. The serialization framework in JDK 1.1 implements this alternative and is therefore the method of choice for sending faithful copies of an object graph between two similarly configured VMs.

The second approach cannot produce as faithful a copy of the original objects as the first but can store the state of the graph in such a way that any API-compatible implementation of the classes involved will be sufficient to reconstitute it. Since APIs are so much more stable than their private implementations, this single step virtually solves the versioning issues for most practical purposes. As importantly, from an applet perspective, the behavior of the constructors residing on the client machine can be leveraged, often dramatically reducing the size of the files that need to be transferred.
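The second approach can be sketched with plain reflection. The example below is a minimal illustration, not the archiver's implementation: TextBean, PublicApiArchive, snapshot, and restore are hypothetical names, and the only rule assumed is the get/set naming convention. It records every property that has both a public getter and a public setter, then replays those values onto a fresh instance. The private editCount field never enters the archive, which is exactly the loss of fidelity described above.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical bean: two public properties plus private state with no accessors.
class TextBean {
    private String text = "";
    private int width = 0;
    private int editCount = 0; // implementation detail, invisible to the archive

    public String getText() { return text; }
    public void setText(String text) { this.text = text; editCount++; }
    public int getWidth() { return width; }
    public void setWidth(int width) { this.width = width; editCount++; }
}

class PublicApiArchive {
    // Record each property that has a public getter and a matching public setter.
    static Map<String, Object> snapshot(Object bean) {
        Map<String, Object> state = new TreeMap<>();
        try {
            for (Method getter : bean.getClass().getMethods()) {
                String name = getter.getName();
                if (!name.startsWith("get") || getter.getParameterCount() != 0) continue;
                String property = name.substring(3);
                try {
                    bean.getClass().getMethod("set" + property, getter.getReturnType());
                } catch (NoSuchMethodException e) {
                    continue; // read-only, for example getClass()
                }
                state.put(property, getter.invoke(bean));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return state;
    }

    // Replay the recorded values onto another instance through its public setters.
    static void restore(Object bean, Map<String, Object> state) {
        try {
            for (Map.Entry<String, Object> entry : state.entrySet()) {
                for (Method m : bean.getClass().getMethods()) {
                    if (m.getName().equals("set" + entry.getKey()) && m.getParameterCount() == 1) {
                        m.invoke(bean, entry.getValue());
                    }
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TextBean original = new TextBean();
        original.setText("Press Me");
        original.setWidth(80);
        Map<String, Object> state = snapshot(original);
        System.out.println(state); // prints {Text=Press Me, Width=80}

        TextBean copy = new TextBean();
        restore(copy, state);
        System.out.println(copy.getText() + " / " + copy.getWidth());
    }
}
```

Any class that offers the same public getters and setters could stand in for TextBean on the reading side, which is the sense in which such an archive depends on the API rather than on the implementation.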

Luckily, although the serialization APIs in JDK 1.1 provided direct support for only the first task, they used a series of interfaces to ensure that support for the second operation could easily be accommodated by the same framework. We use this framework to implement the ObjectOutput and ObjectInput interfaces and house a complementary scheme which is designed to solve the long-term persistence problem for user interfaces of JavaBeans.

The Archiver Package

The archiver package generates archives that depend only on the public APIs of the Beans in the archive, and not on the state of any private implementation. Like modern programming languages, the implementation of the archiver deals with the syntactic and semantic elements of this task separately. This architecture allows us to include support for some popular formats now while allowing users to plug in other "syntax-modules" to support other file types in the future.

Because of this formal separation, most of the internal architecture, along with any special-case code that changes how the state of a particular class is archived, can be written independently of the syntax of the output format. Given both the proliferation of new XML standards and their rapid evolution, this accommodating rather than defining role seems to be the best way to give Beans a long-term persistence strategy that can coexist with this evolutionary process.
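One way to picture this separation is a format-neutral statement object that several renderers can print. The sketch below is illustrative only: Stmt, SyntaxModule, JavaLikeSyntax, and XmlLikeSyntax are hypothetical names, and the XML element names are invented for the example rather than taken from the archiver's DTD.

```java
import java.util.Arrays;
import java.util.List;

// Format-independent description of one step: target.method(args...).
final class Stmt {
    final String target;
    final String method;
    final List<String> args;

    Stmt(String target, String method, String... args) {
        this.target = target;
        this.method = method;
        this.args = Arrays.asList(args);
    }
}

// A pluggable "syntax module": it only decides how a statement is spelled.
interface SyntaxModule {
    String render(Stmt s);
}

class JavaLikeSyntax implements SyntaxModule {
    public String render(Stmt s) {
        return s.target + "." + s.method + "(" + String.join(", ", s.args) + ");";
    }
}

class XmlLikeSyntax implements SyntaxModule {
    // No escaping is done here, so arguments must not contain XML markup characters.
    public String render(Stmt s) {
        StringBuilder sb = new StringBuilder();
        sb.append("<call target=\"").append(s.target)
          .append("\" method=\"").append(s.method).append("\">");
        for (String arg : s.args) {
            sb.append("<arg>").append(arg).append("</arg>");
        }
        return sb.append("</call>").toString();
    }
}

class SyntaxDemo {
    public static void main(String[] args) {
        Stmt s = new Stmt("button0", "setWidth", "80");
        // prints button0.setWidth(80);
        System.out.println(new JavaLikeSyntax().render(s));
        // prints <call target="button0" method="setWidth"><arg>80</arg></call>
        System.out.println(new XmlLikeSyntax().render(s));
    }
}
```

The code that decides which statements to emit never touches a SyntaxModule implementation, so a new file type would only require a new renderer.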

To test this approach we have implemented two very different formats as examples: a declarative XML format and a procedural Java-like scripting language. We've also implemented a third, output-only format that produces compilable Java files. The formal separation of evaluation semantics has also proven useful in the internal implementation of our redundancy elimination mechanism, which requires the write-time evaluation of the statements being written to the output.
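The idea behind redundancy elimination can be sketched as a comparison against a freshly constructed instance: only properties whose values differ from the defaults need to be written. The code below is a simplified illustration under that assumption, not the archiver's mechanism; Slider, DefaultsFilter, and changedSetters are hypothetical names, and the output is a plain list of setter calls rather than a real archive.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical bean with three defaulted properties.
class Slider {
    private int minimum = 0;
    private int maximum = 100;
    private int value = 50;

    public Slider() {}
    public int getMinimum() { return minimum; }
    public void setMinimum(int minimum) { this.minimum = minimum; }
    public int getMaximum() { return maximum; }
    public void setMaximum(int maximum) { this.maximum = maximum; }
    public int getValue() { return value; }
    public void setValue(int value) { this.value = value; }
}

class DefaultsFilter {
    // Return setter calls only for properties that differ from a default instance.
    static List<String> changedSetters(Object bean) {
        List<String> statements = new ArrayList<>();
        try {
            Object fresh = bean.getClass().getDeclaredConstructor().newInstance();
            for (Method getter : bean.getClass().getMethods()) {
                String name = getter.getName();
                if (!name.startsWith("get") || getter.getParameterCount() != 0) continue;
                String setter = "set" + name.substring(3);
                try {
                    bean.getClass().getMethod(setter, getter.getReturnType());
                } catch (NoSuchMethodException e) {
                    continue; // read-only property, nothing to write
                }
                Object current = getter.invoke(bean);
                Object byDefault = getter.invoke(fresh);
                if (!Objects.equals(current, byDefault)) {
                    statements.add(setter + "(" + current + ")");
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        Collections.sort(statements); // getMethods() order is unspecified
        return statements;
    }

    public static void main(String[] args) {
        Slider slider = new Slider();
        slider.setMaximum(200);
        slider.setValue(75);
        System.out.println(changedSetters(slider)); // prints [setMaximum(200), setValue(75)]
    }
}
```

The minimum property is left out of the result because a new Slider already has that value, which is what keeps the archive small.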

The archiver package currently supports three file formats:

  • XML files that use a declarative DTD and conform to the W3C specification. The corresponding reader and writer classes are XMLInputStream and XMLOutputStream.
  • Java files. The corresponding class is JavaOutputStream. Due to the complexity of this format, we don't provide a corresponding input stream. The object graph can be retrieved by compiling the Java files and loading the resulting classes.
  • "BeanScript" files, which have a format similar to Java files, but simplified so that the files can be parsed easily. The corresponding reader and writer classes are BeanScriptInputStream and BeanScriptOutputStream.

How to Use the New Output Streams

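The new output streams are used in the same way as ObjectOutputStream. Suppose you already have code that writes an object with ObjectOutputStream:

try {
    // Needs java.io.* and javax.swing.JButton; "Test.ser" is a placeholder file name.
    ObjectOutput os = new ObjectOutputStream(new FileOutputStream("Test.ser"));
    os.writeObject(new JButton("Press Me"));
    os.close();
} catch (Exception e) {
    e.printStackTrace();
}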

Then the code for writing an XML document requires just one small change:

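try {
    // Assumes XMLOutputStream wraps an OutputStream as ObjectOutputStream does; "Test.xml" is a placeholder.
    ObjectOutput os = new XMLOutputStream(new FileOutputStream("Test.xml"));
    os.writeObject(new JButton("Press Me"));
    os.close();
} catch (Exception e) {
    e.printStackTrace();
}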


The Persistence Model


To handle all these extra classes, the first requirement is that we are able to create instances of them. In all the special cases, this requires extra information that describes how a new instance should be created. For most classes this extra information simply associates the arguments of a chosen constructor with the names of the properties they represent. So, for example, the java.awt.Color class is augmented with metadata recording the fact that the three integers that appear as arguments in one of the constructors are the red, green, and blue properties of the new instance. Given this extra information, the recording of a Color object is reduced to the simpler problem of recording the Integer values of those properties.
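The Color example can be sketched as a small metadata record that pairs a constructor's argument types with property names. ConstructorInfo and ConstructorArchiver are hypothetical names used only for this illustration, and the lookup of each property through a "get" method is an assumption of the sketch; how the archiver itself stores this metadata is not shown here.

```java
import java.awt.Color;
import java.util.Arrays;

// Hypothetical metadata: which constructor to call and which property feeds each argument.
final class ConstructorInfo {
    final Class<?> type;
    final Class<?>[] argTypes;
    final String[] propertyNames;

    ConstructorInfo(Class<?> type, Class<?>[] argTypes, String[] propertyNames) {
        this.type = type;
        this.argTypes = argTypes;
        this.propertyNames = propertyNames;
    }
}

class ConstructorArchiver {
    // Color(int, int, int) takes the red, green, and blue properties, in that order.
    static final ConstructorInfo COLOR = new ConstructorInfo(
            Color.class,
            new Class<?>[] {int.class, int.class, int.class},
            new String[] {"red", "green", "blue"});

    // Reduce an instance to the property values named in the metadata.
    static Object[] record(Object instance, ConstructorInfo info) {
        Object[] values = new Object[info.propertyNames.length];
        try {
            for (int i = 0; i < values.length; i++) {
                String p = info.propertyNames[i];
                String getter = "get" + Character.toUpperCase(p.charAt(0)) + p.substring(1);
                values[i] = instance.getClass().getMethod(getter).invoke(instance);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return values;
    }

    // Recreate an instance by passing the recorded values to the chosen constructor.
    static Object rebuild(ConstructorInfo info, Object[] values) {
        try {
            return info.type.getConstructor(info.argTypes).newInstance(values);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Color orange = new Color(255, 128, 0);
        Object[] values = record(orange, COLOR);
        System.out.println(Arrays.toString(values)); // prints [255, 128, 0]
        System.out.println(rebuild(COLOR, values).equals(orange)); // prints true
    }
}
```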

In other cases, such as java.lang.reflect.Method, the extra information is used to indicate to the output streams the fact that, instead of using a constructor, the getMethod method in java.lang.Class should be used to retrieve instances of the Method class. Near the bottom of these recursive definitions are the wrappers for the primitive types of the Java virtual machine: the Boolean class and the Number derivatives. All of these classes have a useful invariant in that they may be reconstructed by calling their single-argument constructor using the value returned by the toString method. Closing this recursive definition, then, is the java.lang.String class, for which each file format must provide built-in support, and in terms of which all other objects will be represented.
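Both rules can be tried out with standard reflection calls. In the sketch below, LeafValues, fromText, and lookup are hypothetical helper names; the String constructors of the wrapper classes are real but deprecated in recent JDKs (valueOf is the usual replacement), so they are reached reflectively here purely to mirror the invariant described above.

```java
import java.lang.reflect.Method;

class LeafValues {
    // Rebuild a wrapper (Integer, Double, Boolean, ...) from the text its toString() produced.
    static Object fromText(Class<?> wrapperType, String text) {
        try {
            return wrapperType.getConstructor(String.class).newInstance(text);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // A Method has no public constructor; it is retrieved from its class by name and parameter types.
    static Method lookup(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            return owner.getMethod(name, parameterTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // Wrapper round trip: value -> String -> value.
        Integer answer = 42;
        Object restored = fromText(Integer.class, answer.toString());
        System.out.println(restored.equals(answer)); // prints true

        // Method round trip: record (class, name, parameter types), then look it up again.
        Method original = String.class.getMethod("length");
        Method again = lookup(original.getDeclaringClass(), original.getName(),
                original.getParameterTypes());
        System.out.println(again.equals(original)); // prints true
        System.out.println(again.invoke("hello"));  // prints 5
    }
}
```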




Redundancy Elimination



Even with the much improved footprint of inner classes in 1.2, generating classes is still a potentially costly solution if an inner class is generated for each action in a user interface. Our example builder, Bean Builder, demonstrates how the java.lang.reflect.Proxy API can be used to create "trampoline" objects that can be installed as listeners to arbitrary events and used to call a given method on a target object when the event takes place. The proxy APIs synthesize listeners of arbitrary types at runtime, with no need to compile code. The key point is that this technique generates one "trampoline" class per event type, a considerable saving over techniques that generate one class per event. As a result, the incremental cost of wiring, for example, a button to a method in a target object is the footprint of an instance of the "trampoline" class rather than the footprint of a class. This will typically save between one and two orders of magnitude in the overall footprint of the event-handling code in an application.

Most importantly, the "trampoline" class that we have implemented exposes all of its state using the Beans conventions. It can therefore be archived in the same way that any other bean is archived – as a textual representation of its public properties.
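A trampoline of this kind can be approximated with java.lang.reflect.Proxy and a single InvocationHandler. The sketch below is not the Bean Builder's class: Trampoline, Counter, and the listener factory method are hypothetical, and the handler simply calls one named no-argument method on its target whenever any listener method fires. Its state is exposed through ordinary getters and setters so that, in principle, it could be archived like any other bean.

```java
import java.awt.event.ActionListener;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical trampoline: forwards every listener call to target.methodName().
class Trampoline implements InvocationHandler {
    private Object target;
    private String methodName;

    public Trampoline() {}

    public Trampoline(Object target, String methodName) {
        this.target = target;
        this.methodName = methodName;
    }

    // Bean-style accessors expose the complete state of the trampoline.
    public Object getTarget() { return target; }
    public void setTarget(Object target) { this.target = target; }
    public String getMethodName() { return methodName; }
    public void setMethodName(String methodName) { this.methodName = methodName; }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (method.getDeclaringClass() == Object.class) {
            return method.invoke(this, args); // toString, hashCode, equals
        }
        // The event argument is ignored; this sketch only supports void listener methods.
        return target.getClass().getMethod(methodName).invoke(target);
    }

    // Create a listener of any interface type that bounces calls to the target.
    static <T> T listener(Class<T> type, Object target, String methodName) {
        Object proxy = Proxy.newProxyInstance(
                Trampoline.class.getClassLoader(),
                new Class<?>[] {type},
                new Trampoline(target, methodName));
        return type.cast(proxy);
    }
}

// Hypothetical target object.
class Counter {
    private int count;
    public void increment() { count++; }
    public int getCount() { return count; }
}

class TrampolineDemo {
    public static void main(String[] args) {
        Counter counter = new Counter();
        ActionListener first = Trampoline.listener(ActionListener.class, counter, "increment");
        ActionListener second = Trampoline.listener(ActionListener.class, counter, "increment");

        // A button would pass an ActionEvent; null suffices because the sketch ignores it.
        first.actionPerformed(null);
        second.actionPerformed(null);
        System.out.println(counter.getCount()); // prints 2

        // Both listeners share one generated proxy class for the ActionListener type.
        System.out.println(first.getClass() == second.getClass()); // prints true
    }
}
```

In a Swing application the same listener could be passed to a button's addActionListener method, and each further wiring would add only another handler instance.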


Downloading the Archiver Package

Note: This version of the archiver works only with the 1.3 version of the Java 2 SDK.
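To try out the archiver, download archiver.zip, unzip it, and follow the instructions in README.txt. For background, see The Archiver Package and How to Use the New Output Streams.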

Downloading the Bean Builder

Note: This version of the Bean Builder works only with the 1.3 version of the Java 2 SDK.

To show how the new streams can be used in an IDE environment to save the designs that a user creates, we have built a simple BeanBox-style PropertyEditor/GUI builder and included support for persisting designs, including event handling, as XML documents.


Like the original BeanBox, the builder is not a commercial product and is intended to serve only as an example of how these techniques would be used in a real IDE. This builder takes the original BeanBox concept forward a little by showing not just how the properties of a single Bean can be manipulated but how a group of Beans can be "wired up" to make the user interface part of an application.

To try out the builder, download beanbuilder.zip (~580 KB), unzip it, and follow the instructions in README.txt.
