Choosing the Right Product
Each software development project has its own set of data management requirements. Requirements such as speed, scale, reliability, integrity, concurrency, availability, and ease of administration help to narrow the available choices. Choosing the right solution can be a daunting task, and sometimes it seems that only a custom solution will work. The Berkeley DB product family is designed to be flexible, to allow for customization, and to fit into these complex and seemingly unique situations.
Berkeley DB and Berkeley DB Java Edition
Berkeley DB and Berkeley DB Java Edition have many similarities. The first major difference between them is implementation language. Berkeley DB is written in ANSI C, whereas Berkeley DB Java Edition is written in pure Java. If your goal is to build a pure Java solution for cross-platform deployment or simpler development, then choose Berkeley DB Java Edition. It is packaged as a single small Java Archive (JAR) file, so you can simply add it to your application and go.
If you don't require a pure Java solution, you may consider deploying Berkeley DB. Berkeley DB supports a Java API through the Java Native Interface (JNI), so it requires both a native library and a JAR to function. Berkeley DB supports concurrent use by any number of language APIs. If your application requires diverse language support (such as C, C++, Perl, PHP, Python, or any of the other community-supported language APIs), choose Berkeley DB. There are other, more technical, differences between Berkeley DB and Berkeley DB Java Edition that you should evaluate.
Another major difference between Berkeley DB and Berkeley DB Java Edition stems from their deep integration into their respective native environments, ANSI C and Java. Java with its different Editions (Micro, Standard, and Enterprise) and ANSI C with its common system libraries (libc, etc.) present very different base-level services. Berkeley DB and Berkeley DB Java Edition are designed to fully integrate with and exploit their respective environments. These differences between basic system services led to fundamentally different database storage designs.
The database formats are also different; each is optimized for its environment. Berkeley DB uses a paged database file, shared memory files for locks, and a set of log files for transaction management. In contrast, Berkeley DB Java Edition stores all data and transaction information within a single set of log files. This allows Berkeley DB Java Edition to use record-level locking, whereas Berkeley DB uses page-based locking.
The differences in storage format lead to differences in performance characteristics. Berkeley DB Java Edition's write-once, append-only, log-based database design is optimized for sequential writes. Berkeley DB's page-backed database is likely to perform better for random I/O and for datasets that exceed the available cache. The APIs for the Java interfaces to these products are very similar, so you should explore the performance curve for your application using both databases.
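To make the append-only idea concrete, here is a minimal in-memory sketch. It is not Berkeley DB Java Edition code, and every name in it (AppendOnlyLogSketch, put, get, obsoleteEntries) is hypothetical. It assumes only that every write, including an update to an existing key, is added to the tail of a log, and that an index points at the newest entry for each key.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a log-structured store. Illustrative only; not Berkeley DB Java Edition code. */
final class AppendOnlyLogSketch {
    /** One immutable log entry. */
    private static final class Entry {
        final String key;
        final byte[] value;
        Entry(String key, byte[] value) { this.key = key; this.value = value; }
    }

    private final List<Entry> log = new ArrayList<>();            // only ever appended to
    private final Map<String, Integer> latest = new HashMap<>();  // key -> position of newest entry

    /** Every write, including an update, goes to the tail of the log. */
    void put(String key, String value) {
        log.add(new Entry(key, value.getBytes(StandardCharsets.UTF_8)));
        latest.put(key, log.size() - 1);
    }

    /** Reads follow the index to the newest entry for the key; null if the key is absent. */
    String get(String key) {
        Integer pos = latest.get(key);
        return pos == null ? null : new String(log.get(pos).value, StandardCharsets.UTF_8);
    }

    /** Total number of entries ever written. */
    int logLength() { return log.size(); }

    /** Entries superseded by a later write to the same key. */
    int obsoleteEntries() { return log.size() - latest.size(); }

    public static void main(String[] args) {
        AppendOnlyLogSketch store = new AppendOnlyLogSketch();
        store.put("a", "1");
        store.put("b", "2");
        store.put("a", "3"); // an update appends; the old entry for "a" stays in the log
        System.out.println(store.get("a") + " " + store.logLength() + " " + store.obsoleteEntries());
    }
}
```

In this sketch writes never touch earlier entries, which is why such a design favors sequential writes. The cost is that superseded entries accumulate, and the sketch makes no attempt to reclaim them.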
There are a number of smaller differences between Berkeley DB and Berkeley DB Java Edition. Berkeley DB supports many access methods (btree, hash table, record, queue). Berkeley DB Java Edition supports only one (btree); however, the others are easily implemented using the Java Collections API. Berkeley DB is free threaded, meaning that it uses no threads itself but is thread and multi-process safe. Berkeley DB Java Edition is also thread safe, and it allows for limited multi-process support (one writer, many snapshot readers). It uses a few threads for housekeeping functions. There are other small differences as well, but fundamentally the two products perform similar functions.
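As a rough illustration of layering other access methods on an ordered store, the sketch below builds record-number and queue-style access on a sorted map. It uses java.util.TreeMap as a stand-in for a btree database, which is an assumption made for the example; the class and method names (SortedMapAccessMethods, append, consume) are hypothetical and are not part of any Berkeley DB API.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: record-number and queue access layered on a sorted map. */
final class SortedMapAccessMethods {
    // TreeMap stands in for a btree database keyed by record number.
    private final TreeMap<Long, String> records = new TreeMap<>();
    private long nextRecordNumber = 1;

    /** Record-style append: assigns the next record number and returns it. */
    long append(String value) {
        long recno = nextRecordNumber++;
        records.put(recno, value);
        return recno;
    }

    /** Record-style lookup by record number; null if no such record. */
    String get(long recno) {
        return records.get(recno);
    }

    /** Queue-style consume: removes and returns the oldest record, or null if empty. */
    String consume() {
        Map.Entry<Long, String> head = records.pollFirstEntry();
        return head == null ? null : head.getValue();
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        SortedMapAccessMethods queue = new SortedMapAccessMethods();
        queue.append("first");
        queue.append("second");
        System.out.println(queue.consume()); // oldest record leaves the queue first
        System.out.println(queue.size());
    }
}
```

Because the keys are kept in sorted order, the smallest record number is always the oldest record, so first-in, first-out behavior falls out of the ordering.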
Berkeley DB XML
The choice between Berkeley DB and Berkeley DB XML is somewhat easier. It is certainly possible to store XML content within Berkeley DB. Any format of data can be stored in Berkeley DB, as it has no knowledge of the structure or schema of the data it stores. If your use of XML requires query-based access, update, reshaping, or any other service that requires knowledge of the structure of the data, then Berkeley DB XML is the right choice. Berkeley DB XML layers an XML storage management system on top of Berkeley DB. With it, XML is examined and indexed when inserted into the database, and data can be accessed and modified using XQuery. Berkeley DB XML is implemented in C++ as a layer on top of the ANSI C version of Berkeley DB. There is a Java API using the JNI, but it is not available as a pure Java product.
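The sketch below illustrates what storing XML in a store with no knowledge of its structure means for the application. It is an illustration under stated assumptions, not Berkeley DB or Berkeley DB XML code: a HashMap stands in for the key/value database, the class name OpaqueXmlSketch is hypothetical, and the JDK's built-in XPath support stands in for XQuery, which the JDK does not provide.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

/** Hypothetical sketch: XML kept as opaque bytes, queried by the application itself. */
final class OpaqueXmlSketch {
    // HashMap stands in for a key/value database that sees values only as bytes.
    private final Map<String, byte[]> store = new HashMap<>();

    /** The store does not inspect or index the document; it just keeps the bytes. */
    void put(String key, String xml) {
        store.put(key, xml.getBytes(StandardCharsets.UTF_8));
    }

    /** The application must fetch the whole value, parse it, and evaluate the query. */
    String query(String key, String xpath) {
        byte[] bytes = store.get(key);
        if (bytes == null) {
            return null;
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse or query stored XML", e);
        }
    }

    public static void main(String[] args) {
        OpaqueXmlSketch db = new OpaqueXmlSketch();
        db.put("order-1", "<order><customer>Ada</customer><total>42</total></order>");
        System.out.println(db.query("order-1", "/order/customer"));
    }
}
```

Every query here reparses the full document and can only be answered by key lookup first. A structure-aware system that examines and indexes XML on insert is meant to remove that burden from the application.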