2.4 Release Overview
The major focus areas for release 2.4 include:
- Implementation of declarative update expressions,
conforming to the Last Call Working Draft of W3C's XQuery Update 1.0.
- Scalability improvements that reduce the memory
footprint required during query processing and result handling.
- Performance improvements by using more intelligent,
cost-based query processing.
- Performance improvements by using iterator-based
processing instead of tree-processing.
- General bug fixes and improvements.
Berkeley DB XML 2.4.13 Change Log
Upgrade Requirements
Upgrade is only required for containers created using
release 2.2.13 or earlier. Containers created using 2.3.X do not require
upgrade. Most queries will benefit from reindexing 2.3.X-based containers
to add new structural statistics information used in query cost analysis.
Reindexing is required in order to enable the substring index to
be used on 1- and 2-character strings (a new feature in 2.4).
If an upgrade is performed (e.g. from 2.2.13) it is
recommended that the resulting container be run through dbxml_dump/dbxml_load
to reduce its file size.
New Features:
- Conformance to Last-Call Working Draft of XQuery
Update 1.0
- Added the ability to use "document projection" when
querying whole-document containers. This performance and memory optimization
materializes only the portion of the document that is relevant
to the query.
- Partial document modifications will now only reindex
those portions of the document(s) affected by the modification itself.
This is a significant performance enhancement for partial update of
large documents.
API Changes:
Unless otherwise noted, the API additions apply to
all language bindings, and all bindings use the same method name.
- Added the DBXML_DOCUMENT_PROJECTION flag to the various
query interfaces to enable use of this feature. In Java, this behavior
is controlled by XmlDocumentConfig.setDocumentProjection()
- Added a new XQuery extension function, dbxml:contains(),
that allows case- and diacritic-insensitive string searches and can
be optimized by a substring index
- Removed all C++ interfaces that used Xerces-C
DOM, including:
- XmlDocument::setContentAsDOM()
- XmlDocument::getContentAsDOM()
- XmlValue::asNode()
- XmlModify is now a deprecated (but still-supported)
class. XQuery Update should be used instead. One method has been removed
as it is no longer supported by the internal infrastructure: XmlModify::setNewEncoding()
- Added a new constructor to XmlEventReaderToWriter
that allows an XmlEventWriter instance to be multiple-use. This makes
it possible to write to it from multiple reader sources to concatenate
content, for example.
- Removed XmlUpdateContext::{get,set}ApplyChangesToContainers()
methods. This behavior is no longer controllable. Changes to documents
that are in containers will always be written. If transient changes
are required, content must be copied (XQuery Update has syntax to do
this directly)
- Removed unused variant of XmlValue::asString(std::string
&encoding). This variant never actually changed the encoding [#15822]
- Added DBXML_STATISTICS and DBXML_NO_STATISTICS flags
to enable/disable creation of an additional statistics database that
is used for query optimization. The default is to create the database.
Upgraded containers will NOT have a statistics database added unless
they are explicitly reindexed. The cost of this optimization is a bit
of extra work during document insertion. In Java this behavior is controlled
by XmlContainerConfig.setStatisticsEnabled()
- Added the DBXML_NO_AUTO_COMMIT flag, which can be
specified to the XmlQueryExpression::execute() methods to turn off auto-commit
of update queries when it is not appropriate.
- Some XmlException error codes have changed - DOM_PARSER_ERROR
and NO_VARIABLE_BINDING have been removed, XPATH_PARSER_ERROR is now
called QUERY_PARSER_ERROR, and XPATH_EVALUATION_ERROR is now called
QUERY_EVALUATION_ERROR. [#15792]
- The enumeration, XmlQueryContext::DeadValues, has
been removed. The related method, XmlQueryContext::setReturnType() remains
but is a no-op. All results are LiveValues. This will not affect the
vast majority of applications.
- Added XmlQueryExpression::isUpdateExpression() to
allow users to know whether an expression is updating or not
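The case- and diacritic-insensitive matching described above for dbxml:contains() can be approximated outside the database for illustration. The Python sketch below is not the function's implementation. The helper names fold() and insensitive_contains() are hypothetical, and the sketch assumes that "diacritic-insensitive" means ignoring Unicode combining marks after canonical decomposition, which may differ from the exact rules the library applies.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a case-folded copy of text with combining marks removed.

    Assumption: diacritics are the combining marks produced by NFD
    decomposition. The library's own folding rules may differ.
    """
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def insensitive_contains(haystack: str, needle: str) -> bool:
    """Substring test that ignores case and diacritics (hypothetical helper)."""
    return fold(needle) in fold(haystack)
```

Under these assumptions, a search for "creme brulee" matches the text "Crème Brûlée", whereas a plain case-sensitive substring test would not.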
Changes That May Require Application Modification:
- Some of the API changes above may result in the need
to make minor code changes
- Not all modification patterns that use XmlModify
will continue to work given the new infrastructure. Specifically, operations
that would both copy and delete the same content (emulating a "move"
operation) may not work. In all cases, such code can be rewritten to
use XQuery Update directly, resulting in simpler code. Special attention
should be paid to multi-step operations that include such side effects.
- Changed default indexing type on containers to be
node indexes for node storage containers and document indexes for document
storage containers [#15863].
- Java only -- the XmlContainer.getNode() function
has changed its signature and will require code changes if used.
See details below under "Java-specific Functionality Changes."
General Functionality Changes:
- Partial document modification will result in only
reindexing those portions of document(s) affected by the modification
- The system now keeps better cost information and
statistics and the query optimizer uses this information to perform
more effective cost-based optimization
- The content processing internals have been reworked
to make heavy use of iterators and temporary Berkeley DB databases to
significantly reduce the memory footprint of query handling as well
as reduce the number of objects created and destroyed by a query. This
leads to a more scalable, high-performance system
- Substring indexes will now work on any length search
string (e.g. 2-char) rather than be restricted to a 3-character minimum.
Reindexing the container is required to get this functionality.
- Various fixes and memory leak elimination in XmlEventWriter [#15405]
- Fixed a problem where removing a default index could
remove index entries for an overlapping non-default index [#15412]
- Changed semantics of XmlQueryContext::setNamespace()
to treat an empty namespace prefix as the default element namespace [#15630]
- Fixed problem in XmlModify that could result in malformed
XML if a prefixed element name were used without a mapping for that
prefix [#15586]
- Fixed URI resolution code to not add the
base URI when the URI being resolved is absolute [#15583]
- Fixed code to force explicit transactions (vs auto-commit)
when using XmlContainer::putDocumentAsEventWriter. This is necessary
because of the 2-part nature of this interface [#15578]
- Fixed a crash that could occur if XmlResults.next()
were called at the end of a result set [#15621]
- Fixed a bug where XQuery expressions involving unused
global variables were not being optimized correctly [#15661]
- Fixed problem in XmlEventReader::nextTag() where
it would mistakenly throw an exception on character data. Also changed
semantics of XmlEventReader to always return start and end document
events so that callers can know when content starts and ends [#15686]
- Fixed a case where the '>' character was not being
escaped properly (according to the XML specification). The case arises
when it occurs in the sequence "]]>" [#15739]
- Fixed a problem in XmlModify where removing a node
that was the last child and had leading text could cause a SEGV [#15615]
- Enhanced XmlEventReaderToWriter API to not unconditionally
close the XmlEventWriter object, allowing a single XmlEventWriter to
be used more than once via that API. This allows XmlEventReaderToWriter
to be used, for example, to coalesce a number of results into one document
[#15446]
- Fixed a problem with XmlIndexLookup where a GT lookup
that happens to start with the last entry in the index might return
results when it should return none [#15408]
- Fixed a bug which incorrectly reported an error for
fractional seconds when the seconds field was "59" [#15389]
- Fixed a problem in statistics calculation for substring
indexes that could cause a crash in fn:contains() [#15823]
- Fixed a bad exception that might be thrown when inserting
a schema-invalid document, due to the length of the error message [#15824]
- Fixed a problem where a query that uses an XmlDocument
that has just been "put" into a container as context for the query might
hang if done in the same transaction as the putDocument call [#15905]
- Fixed a problem where querying empty CDATA sections
could cause an assertion failure or bad memory reference [#15906]
- Fixed a latent bug that could result in missing index
entries after an updateDocument or modifyDocument call. This is very obscure
and has never been seen by a user. It requires an odd combination of
indexes and updates [#15943]
- Fixed an open/close race condition on XmlContainer. An
application that concurrently opened/closed XmlContainer objects (not
recommended...) could possibly reference bad memory [#15890]
- Fixed several memory leaks that could occur if deadlock
exceptions are thrown during document processing (most likely put and
delete)
- All update operations now work inside internal child
transactions to ensure that they are properly aborted if necessary.
This is not user visible
- Internal buffer size for DB get operations on nodes
is tuned to the calling operation (bulk vs single get) [#15607]
- Fixed a bug in XmlEventWriter where the behavior
was dependent on an uninitialized variable [#15968]
- Changed dbxml_load_container to take a '-e' flag
that causes the program to stop document loading in the event of a parse
error. The default is to continue with the next document [#15777].
- Fixed problem in XmlModify where newly-inserted element
content could cause a bad memory reference and/or crash while calculating
a new node id [#15974].
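Two of the fixes in the list above concern behavior that is defined by external specifications: escaping '>' when it appears in the sequence "]]>" (XML 1.0), and leaving an absolute URI untouched during resolution against a base URI (RFC 3986). The Python sketch below illustrates the expected results only. It is not the library's code, and the function names escape_character_data() and resolve() are hypothetical.

```python
from urllib.parse import urljoin


def escape_character_data(text: str) -> str:
    """Escape text for use as XML character data (hypothetical helper).

    '&' and '<' must always be escaped. '>' only needs escaping when it
    follows ']]', because ']]>' would otherwise read as a CDATA terminator.
    """
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    return text.replace("]]>", "]]&gt;")


def resolve(base_uri: str, uri: str) -> str:
    """Resolve uri against base_uri (hypothetical helper).

    urljoin returns an absolute uri unchanged, so the base is applied
    only to relative references.
    """
    return urljoin(base_uri, uri)
```

With these helpers, "a ]]> b" becomes "a ]]&gt; b" while a lone '>' is left as-is, and resolving an absolute URI such as http://other.example.org/doc.xml against any base returns that URI unchanged.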
Utility Changes:
- New dbxml shell commands:
- setProjection allows control of the document
projection feature
- The dbxml shell can be invoked using the #! syntax
in a *nix shell command, e.g. with the first line:
#!/dbxml -s
- Handling of '#' comment lines in the dbxml shell
has been improved so that they can occur anywhere in a line [#15689]
Java-specific Functionality Changes:
- Fixed a problem where committing or aborting a transaction
that was already committed or aborted could crash, especially after
a failed XmlManager.openContainer() call [#15729]
- It is no longer necessary to explicitly delete
objects of type XmlValue, XmlDocument, XmlQueryContext, XmlMetaData,
XmlMetaDataIterator and XmlUpdateContext. They are implemented entirely
as pure Java objects with no native memory to release. It will still
be necessary to explicitly delete other Java objects to release native
memory. In general the validity of XmlValue and XmlDocument objects
returned via XmlResults (queries, index lookups, etc) is under control
of the XmlResults object. When the XmlResults object is deleted, node
values that may have been associated with that object may no longer
be accessible, and an exception will be thrown if they are accessed [#15194]
- Added -source 1.5 -target 1.5 to Java builds to be
explicit, especially for the Windows binary build. The current code *will*
work with 1.4 or 1.6 if the arguments are changed (manually) [#15986]
- The XmlContainer.getNode() function has
changed its signature. Instead of XmlValue it now returns XmlResults.
The XmlValue that was previously returned can be retrieved using XmlResults.next().
There will never be more than one value in this result. It is necessary
to explicitly delete the returned XmlResults object (XmlResults.delete())
when the application no longer needs access to the returned value. Once
it is deleted, the information in the XmlValue may no longer be accessible.
Python-specific Functionality Changes:
- Fixed XmlEventReader in Python so that methods returning
unsigned char * would be mapped properly into Python strings [#15608]
- Changed implementation of XmlException and related
classes to make them part of the dbxml (vs _dbxml) module [#15617]
- Changed names of XmlException attributes to start
with lower-case letters. See src/python/README.exceptions.
- Moved examples to dbxml/examples/python directory
and added some additional basic examples
PHP-specific Functionality Changes:
- Fixed code that resulted in build and runtime errors
on 64-bit platforms. One symptom was "std::bad_alloc" exceptions. The
issue was a mix of 64- and 32-bit types resulting in attempts to allocate
huge amounts of memory [#15587]
- Fixed compilation problems in a threaded (ZTS) environment
related to the use of incorrect macros in a few places [#15746]
- Fixed problem (SEGV) constructing XmlIndexLookup
objects as well as several other problems with this class implementation
[#16168]
- Moved examples to dbxml/examples/php
- Fixed XmlValue constructor to accept explicitly typed
strings [#15996]
Perl-specific Functionality Changes:
- Moved examples to dbxml/examples/perl
Example Code Changes
- Added examples/cxx/xerces directory with example
code that provides the same functionality that the Xerces-C DOM interfaces
previously provided. They are written as example code to illustrate
an integration with Xerces-C DOM and to also illustrate use of the XmlEvent*
classes for such an adapter
- Moved examples for all languages to dbxml/examples/*
to consolidate them and make packaging simpler
Configuration, Documentation, Portability and Build Changes:
- XQilla 2.0 is bundled. XQilla 2.0 is released under
a permissive (Apache) license
- Windows static build projects are included
- Project and solution files for Visual Studio version
8.00 have been added for use by Visual Studio 2005 and later releases.
The new solution file is BDBXML_all_vs8.sln.
- Added Berkeley DB project files to the BDB XML build_windows
directory for Visual Studio 7.1 and 8 builds. This means that the included
DB projects will be built directly in the BDB XML tree and not in the
Berkeley DB tree. This does not apply to the VC6 projects and workspace
and does not affect where the default build installs executables and
libraries. VS7.1 project files for Berkeley DB examples are no longer
included.