The JE 6.0 file format change is forward compatible in that JE files created with release 5.0 and earlier can be read when opened with JE 6.0 or later. The change is not backward compatible in that files created with JE 6.0 or later cannot be read by earlier releases. Note that if an existing environment is opened read/write, a new log file is written by JE 6.0 or later, and the environment can no longer be read by earlier releases.
One of two utility programs must be used, which are available in the release package for JE 4.1.20, or a later release of JE 4.1. If you are currently running a release earlier than JE 4.1.20, then you must download the latest JE 4.1 release package in order to run these utilities.
The steps for upgrading are as follows.
java -jar je-4.1.20.jar DbPreUpgrade_4_1 -h <dir>If you are using a JE
java -jar je-4.1.20.jar DbRepPreUpgrade_4_1 -h <dir> -groupName <group name> -nodeName <node name> -nodeHostPort <host:port>
The second step -- running the utility program -- does not perform data conversion. This step simply performs a special checkpoint to prepare the environment for upgrade. It should take no longer than an ordinary startup and shutdown.
During the last step -- when the application opens the JE environment using the
current release (JE 5 or later) -- all databases configured for duplicates will
automatically be converted before the
ReplicatedEnvironment constructor returns. Note that a database
might be explicitly configured for duplicates using
DatabaseConfig.setSortedDuplicates(true), or implicitly configured
for duplicates by using a DPL MANY_TO_XXX relationship
The duplicate database conversion only rewrites internal nodes in the Btree, not leaf nodes. In a test with a 500 MB cache, conversion of a 10 million record data set (8 byte key and data) took between 1.5 and 6.5 minutes, depending on number of duplicates per key. The high end of this range is when 10 duplicates per key were used; the low end is with 1 million duplicates per key.
To make the duplicate database conversion predictable during deployment, users
should measure the conversion time on a non-production system before upgrading
a deployed system. When duplicates are converted, the Btree internal nodes are
preloaded into the JE cache. A new configuration option,
EnvironmentConfig.ENV_DUP_CONVERT_PRELOAD_ALL, can be set to false
to optimize this process if the cache is not large enough to hold the internal
nodes for all databases. For more information, see the javadoc for this
If an application has no databases configured for duplicates, then the last step simply opens the JE environment normally, and no data conversion is performed.
If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility
program before opening an environment with JE 5 or later for the first time, an
exception such as the following will normally be thrown by the
com.sleepycat.je.EnvironmentFailureException: (JE 6.0.1) JE 4.1 duplicate DB entries were found in the recovery interval. Before upgrading to JE 5.0, the following utility must be run using JE 4.1 (4.1.20 or later): DbPreUpgrade_4_1. See the change log. UNEXPECTED_STATE: Unexpected internal state, may have side effects. at com.sleepycat.je.EnvironmentFailureException.unexpectedState(EnvironmentFailureException.java:376) at com.sleepycat.je.recovery.RecoveryManager.checkLogVersion8UpgradeViolations(RecoveryManager.java:2694) at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:549) at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:198) at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:610) ...
If the user fails to run the DbPreUpgrade_4_1 or DbRepPreUpgrade_4_1 utility
program, but no exception is thrown when the environment is opened with JE 5
or later, this is probably because the application performed an
Environment.sync before last closing the environment with JE 4.1
or earlier, and nothing else happened to be written (by the application or JE
background threads) after the sync operation. In this case, running the
upgrade utility is not necessary.
For Oracle NoSQL DB users only, record versions are now discarded using a separate eviction step. This means that the record versions can be discarded to free cache memory without discarding the entire BIN (bottom internal node). In general, this makes better use of memory and reduces IO for some workloads.
The improvements to DbCacheSize are as follows.
-je.rep.preserveRecordVersion trueis passed on the command line, more information is output by the utility. See the new Record Versions and Oracle NoSQL Database section of the DbCache javadoc for more information.
-je.log.fileMax LENGTHon the command line as described in the javadoc.
ReplicaWriteException. Previously an attempt to serialize this exception could fail with the following characteristic stack trace when the
StateChangeEventobject was encountered during serialization:
Caused by: java.io.NotSerializableException: com.sleepycat.je.rep.StateChangeEvent at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1181) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1506) at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1429) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1175) at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1541) at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:439) at java.util.logging.LogRecord.writeObject(LogRecord.java:470) ...[#23578] (6.1.1)
Before JE 6.0, BIN-deltas were used as a disk optimization only: to reduce the amount of bytes written to disk every time a new BIN version had to to be logged. BIN-deltas would never appear in the in-memory BTrees, and if the most recently logged version of a BIN was a delta, fetching that BIN into the in-memory tree required 2 disk reads: one for the delta and one for the most recent full-BIN version.
Starting with JE 6.0, BIN-deltas can appear in the in-memory BTree. Specifically, if a full dirty BIN is selected for eviction, rather than evicting the whole BIN (and incurring a disk write), the BIN is converted to a delta that stays in the cache. If a subsequent operation needs the full BIN and the delta is still in the cache, only one disk read will be done.
Further disk-read savings can be realized, because many operations can (under certain conditions) be performed directly on the BIN-delta, without the need for the full BIN. However, in 6.0, only a small subset of background operations were modified to exploit BIN-deltas. In JE 6.1, the set of operations that can be performed on BIN-deltas has been extended. Examples of such operations include key searches in BTrees, if the search key is found on a BIN delta and deletion or update of the record a cursor is located on, if the cursor is located on a BIN-delta. These changes affect both internal operations as well as the search, delete, and putCurrent methods of the Database and Cursor API classes.
Typically, thread synchronization during BTree searches is done via latch coupling: at most 2 tree nodes (a parent and a child) are latched at a time. Furthermore, a node is latched in shared (SH) mode, unless it is expected that it will be updated, in which case it is latched in exclusive (EX) mode. Finally, SH latches are not upgradeable to EX latches (to avoid deadlocks and reduce latching overhead).
JE follows this general latch-coupling technique. However, it also has to deal with the JE-specific fact that fetching a missing child node into the cache requires that its memory-resident parent be updated (because the parent points to its children via direct Java object references). As a result, during a JE BTree search every node is potentially updated, which precludes the use of SH latches. To cope with this complication, JE has been using one of the following approaches during its various kinds of BTree searches: (a) use SH latches, but if a missing child needs to be fetched, release the SH latch on the parent and restart the search from the beginning, using EX latches on all nodes this time, (b) do grandparent latching: use SH latches but keep a latch on the grandparent node so that if we need to fetch a missing child of the parent node, the SH latch on the parent can be released, and then the parent can be relatched in EX mode, (c) do latch-coupling with EX latches only. Obviously, (c) is the worst choice, but all of the 3 approaches result in more and longer-held EX latches than necessary. As a result, some JE applications have experienced performance problems due to excessive latch contention during BTree searches.
In JE 6.1, a new latching algorithm has been implemented to replace all of (a), (b), and (c) above. The new algorithm uses SH latches, but if a missing child needs to be fetched, it first "pins" the parent (to prevent its eviction), then releases the SH latch on the parent, and finally reads the child node from the log (without any latches held). After the child is fetched, it latches the remembered parent in EX mode, unpins it, and checks whether it is still the correct parent for the search and for the child node that was fetched. If so, the search continues down the tree. If not, it restarts the search from the beginning. Compared to approach (a) above, this new algorithm may restart a search multiple times, however the probability of even a single restart is less than (a), and each restart uses SH latches. Furthermore, no latches are held during the long random disk read done to fetch a missing child.
com.sleepycat.je.EnvironmentFailureException: Node5(5):... VLSN 3,182,883 should be held within this tracker.or
com.sleepycat.je.EnvironmentFailureException: Node5(5):...end of last bucket should match end of range ...[#23491]
Counting the number of records in a database is now implemented using a disk-ordered-scan (DOS), similar to the one used by DiskOrderedCursor. DOS may consume a large amount of memory, and to avoid OutOfMemoryErrors, it requires that a limit on its memory consumption is provided. As a result, a new method, Database.count(long memoryLimit), has been implemented that takes this memory limit as a parameter. The existing Database.count() method is still available and uses an internally established limit.
This change fixes two problems of the previous implementation (based on the SortedLSNTreeWalker class): 1. There was no upper bound on the memory consumption of the previous implementation and 2. It was buggy in the case where concurrent thread activity could cause full BINs to be mutated to deltas or vice versa.
Iterating over the records of a database via a DiskOrderedCursor would cause a crash if a BIN delta was encountered in the in-memory BTree (because in this case a copy of the BIN delta was created and cached for later use, but the copy did not contain all the needed information from the original). This bug was introduced in JE 6.0.11.
com.sleepycat.je.EnvironmentFailureException: (JE 5.0.97) Environment must be closed, caused by: com.sleepycat.je.EnvironmentFailureException: Environment invalid because of previous exception: (JE 5.0.97) ... java.io.FileNotFoundException: ...\ffffffff.jdb (The system cannot find the file specified) LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is invalid and must be closed. at com.sleepycat.je.EnvironmentFailureException.wrapSelf(EnvironmentFailureException.java:210) at com.sleepycat.je.dbi.EnvironmentImpl.checkIfInvalid(EnvironmentImpl.java:1594) at com.sleepycat.je.dbi.DiskOrderedCursorImpl.checkEnv(DiskOrderedCursorImpl.java:234) at com.sleepycat.je.DiskOrderedCursor.checkState(DiskOrderedCursor.java:367) at com.sleepycat.je.DiskOrderedCursor.getNext(DiskOrderedCursor.java:324) ...[#23676] (6.1.3)
In order to perform write operations in such cases, the application must now call TransactionConfig.setLocalWrite(true).
In addition, it is no longer possible to use a single transaction to write to both a replicated and a non-replicated databases. IllegalOperationException will be thrown if this is attempted.
These changes were necessary to prevent corruption when a transaction contains write operations for both replicated and non-replicated databases, and a failover occurs that causes a rollback of this transaction. The probability of corruption is low, but it can occur under the right conditions.
For more information see the javadoc for TransactionConfig.setLocalWrite(true), and the "Non-replicated Databases in a Replicated Environment" section of the ReplicatedEnvironment class javadoc.
Durability.READ_ONLY_TXN has been deprecated and TransactionConfig.setReadOnly should be used instead.
The conditions that cause the bug are:
If this bug is encountered, it can be corrected by upgrading to the JE release containing this fix, and no data loss will occur.
This bug is similar to another bug that was fixed in JE 5.0.70 [#22052]. This bug differs in that the transaction must write records in multiple databases, and at least one but not all of the databases must be removed or truncated between the two abnormal shutdowns.
com.sleepycat.je.EnvironmentFailureException: (JE 6.1.3) Node1(-1):... last LSN=0x3/0x4427 LOG_INTEGRITY: Log information is incorrect, problem is likely persistent. Environment is invalid and must be closed. at com.sleepycat.je.recovery.RecoveryManager.traceAndThrowException(RecoveryManager.java:3012) at com.sleepycat.je.recovery.RecoveryManager.undoLNs(RecoveryManager.java:1253) at com.sleepycat.je.recovery.RecoveryManager.buildTree(RecoveryManager.java:741) at com.sleepycat.je.recovery.RecoveryManager.recover(RecoveryManager.java:352) at com.sleepycat.je.dbi.EnvironmentImpl.finishInit(EnvironmentImpl.java:654) at com.sleepycat.je.dbi.DbEnvPool.getEnvironment(DbEnvPool.java:208) at com.sleepycat.je.Environment.makeEnvironmentImpl(Environment.java:252) at com.sleepycat.je.Environment.[#22071] (6.1.3)
(Environment.java:232) at com.sleepycat.je.Environment. (Environment.java:188) at com.sleepycat.je.rep.ReplicatedEnvironment. (ReplicatedEnvironment.java:573) at com.sleepycat.je.rep.ReplicatedEnvironment. (ReplicatedEnvironment.java:443) ... [app creates a new ReplicatedEnvironment here] ... Caused by: java.lang.NullPointerException at com.sleepycat.je.log.entry.LNLogEntry.postFetchInit(LNLogEntry.java:412) at com.sleepycat.je.txn.TxnChain. (TxnChain.java:133) at com.sleepycat.je.txn.TxnChain. (TxnChain.java:84) at com.sleepycat.je.recovery.RollbackTracker$RollbackPeriod.getChain(RollbackTracker.java:1009) at com.sleepycat.je.recovery.RollbackTracker$Scanner.rollback(RollbackTracker.java:483) at com.sleepycat.je.recovery.RecoveryManager.undoLNs(RecoveryManager.java:1182) ... 11 more