FAQ - Berkeley DB Java Edition
updated: November 4, 2009
General
· Can Berkeley DB Java Edition use a NFS, SAN, or other remote/shared/network filesystem for
an environment?
· Can a Berkeley DB database
be used by Berkeley DB Java Edition?
· Does JE support high
performance LOBs (large objects)?
· Does JE support key
prefixing or key compression?
· How can I set JE
configuration properties?
· How can insertion-ordered
records or sequences be used?
· How do I add fields to an
existing tuple format when using the Java bindings?
· How do I build a simple
Servlet using Berkeley DB Java Edition?
· How do I verify that the
configuration settings that I made in my je.properties file have taken
effect?
· How does JE Concurrent
Data Store (CDS) differ from JE Transactional Data Store (TDS)?
· Is a Berkeley DB database
the same as a SQL "table"?
· Is it considered best
practice to keep databases closed when not in use?
· What is the smallest
cache size I can set with JE?
· Why don’t
Berkeley DB and Berkeley DB Java Edition both implement a shared set of
Java interfaces for the API? Why are these two similar APIs in
different Java packages?
Installation and Build
· Does Berkeley DB Java Edition run within J2ME?
· Where does the je.jar file belong when loading within an application server?
· How should I set directory permissions on the JE environment directory?
Troubleshooting
· How do I debug a lock
timeout?
· NIO issues with JE
· What is a safe way to
stop threads in a JE application?
· Why does my application sometimes get a DbChecksumException when running on Windows 7?
Transactions
· JE 2.0 has support for XA
transactions in a J2EE app server environment. Can I use XA
transactions (2 phase commit) in a non-J2EE environment?
Querying
· How can I perform wildcard queries or non-key queries?
· Can I perform an efficient "keys only" query?
· How can I join two or more primary databases or indexes?
· How can a join be
combined with a range query?
· How do I perform a custom
sort of secondary duplicates?
· What is the best way to
access duplicate records when not using collections?
· What's the best way to
get a count of either all objects in a database, or all objects that
match a join criteria?
· How do I prevent "phantoms" when not using transactions?
Performance
· Which are better: Private
vs Shared Database instances?
· Are there any locking
configuration options?
· How can I estimate my
application's optimal cache size?
· How can I tune JE's cache management policy for my application's access pattern?
· How do I begin tuning
performance?
· In JE, what are the
performance tradeoffs when storing to more than one database?
· Is a larger cache always
better for JE?
· What are JE read buffers
and when should I change their size?
· What are JE write buffers
and when should I change their size?
· Why is my application
performance slower with transactions?
· How can I improve performance of a cursor walk over a database?
· What JVM parameters should I consider when tuning an application with a large cache size?
Disk Space Utilization
· What is so different
about JE log files?
· How can I check the disk
space utilization of my log files?
· How can I find the location of
checkpoints within the log files?
Java Collections
· Earlier versions of the
Java collections API required that iterators be explicitly closed. How
can they be used with other components that do not close iterators?
· How do I access
duplicates using the Java collections API?
· In earlier versions of
the Java collections API, why did iterators need to be explicitly
closed by the caller?
Direct Persistence Layer (DPL)
· What is the complete definition of object persistence?
· How do I define primary keys, secondary keys and composite keys?
· How do I store and query data in an EntityStore?
· How are relationships defined and accessed?
· What is the difference between embedding a persistent object and a relationship with another entity object?
· Why must all persistent classes (superclasses, subclasses and embedded classes) be annotated?
· Why doesn't the DPL use the standard annotations defined by the Java Persistence API?
· How do I dump and load the data in an EntityStore?
· What is Carbonado and when should I use it instead of the Direct Persistence
Layer (DPL)?
JE High Availability (JE HA)
· Configuration checklist
Can Berkeley DB Java Edition use a
NFS, SAN, or other remote/shared/network filesystem for an environment?
There are two caveats with NFS based storage, although the motivation for
them in Java Edition (JE) is different from that of Berkeley DB.
First, JE requires that the underlying storage system reliably persist
data to the operating system level when write() is called
and durably when fsync() is called. However, some remote
file system server implementations will cache writes on the server
side (as a performance optimization) and return to the client (in this
case JE) before making the data durable. While this is not a problem
when the environment directory's disk is local, this can present
issues in a remote file system configuration because the protocols are
generally stateless. The problem scenario can occur if (1) JE
performs a write() call, (2) the server accepts the data
but does not make it durable by writing it to the server's disk, (3)
the server returns from the write() call to the client,
and then (4) the server crashes. If the client (JE) does not know
that the server has crashed (the protocol is stateless), and JE
subsequently makes a successful write() call for data located further
along in the log file, the JE log file can end up with holes in
it, causing data corruption.
In JE 3.2.65 and later releases, a new parameter has been added,
called je.log.useODSYNC, which causes the JE environment
log files to be opened with the O_DSYNC flag. This flag
causes all writes to be written durably to the disk. In the case of a
remote file system it tells the server not to return from the
write() call until the data has been made durable on the
server's local disk. The flag should never be used in a local
environment configuration since it incurs a performance penalty.
Conversely, this flag should always be used in a remote file system
configuration or data corruption may result.
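As a minimal sketch of turning this setting on, the property can be written to a je.properties file in the environment home directory before the environment is opened. The helper class and its method names are hypothetical (they are not part of JE), and the value true assumes that je.log.useODSYNC is a boolean parameter; check the EnvironmentConfig javadoc for your release.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical helper, not part of JE: maintains <envHome>/je.properties. */
final class JePropertiesFile {

    /** Loads je.properties from the environment home, or returns an empty set. */
    static Properties load(Path envHome) {
        Properties props = new Properties();
        Path file = envHome.resolve("je.properties");
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    /** Adds je.log.useODSYNC=true, keeping any settings already in the file. */
    static void enableODsync(Path envHome) {
        Properties props = load(envHome);
        // Parameter name from the text above; "true" assumes a boolean value.
        props.setProperty("je.log.useODSYNC", "true");
        try {
            Files.createDirectories(envHome);
            Path file = envHome.resolve("je.properties");
            try (OutputStream out = Files.newOutputStream(file)) {
                props.store(out, "Environment on a remote file system");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling JePropertiesFile.enableODsync(envHome) once during deployment is enough; the same file can carry any other environment properties.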
When using JE in a remote file system configuration, the system should
never be configured with multiple file system clients (i.e. multiple
hosts accessing the file system server). In this configuration it is
possible for client side caching to occur which will allow the log
files to become out of sync on the clients (JE) and therefore corrupt.
The only solution we know of for this is to open the environment log
files with the O_DIRECT flag, but this is not available
using the Java VM.
Second, Java Edition (JE) uses the file locking functionality provided
through java.nio.channels.FileChannel.lock(). Java does
not specify the underlying implementation, but presumably in many
cases it is based on the flock() system call. Whether
flock() works across NFS is platform dependent. A web
search shows several bug reports about its behavior on Linux where
flock() is purported to incorrectly return a positive
status.
JE uses file locking for two reasons:
1. To guarantee that only one writer process is attached to an
environment, and that all other processes are read-only. (Note that
JE supports multiple writer threads in a single process.)
2. To guarantee that any file deletions that result from log
cleaning are disabled as long as there is a reader process
attached. This FAQ provides more
information about log cleaning in the JE storage system.
Of course the simplest way of dealing with flock() vs NFS
is to only use a single process to access a JE environment. If that
is not possible, and if you cannot rely on flock() across
NFS on your systems, you could handle (1) by taking responsibility in
your application to ensure that there is a single writer process
attached. Having two writer processes in a single environment could
result in database corruption. (Note that the issue is with processes,
and not threads.)
Handling the issue of log cleaning (2) in your application is also
possible, but more cumbersome. To do so, you must disable the log
cleaner (by setting the je.env.runCleaner property to
false) whenever there are multiple processes accessing an
Environment. If file deletion is not locked out
properly, the reader processes might periodically see a
com.sleepycat.je.log.LogFileNotFoundException, and would
have to close and reopen to get a fresh snapshot. Such an exception
might happen very sporadically, or might happen frequently enough to
make the setup unworkable. To perform log cleaning, the application
should first ensure that all reader processes have closed the
Environment (i.e. all read-only processes have closed all
Environment handles). Once closed, the writer process
should perform log cleaning by calling
Environment.cleanLog() and
Environment.checkpoint(). Following the completion of
the checkpoint, the reader processes can re-open the environment.
Can a Berkeley DB database be used by
Berkeley DB Java Edition?
We've had a few questions about whether data
files can be shared between Berkeley DB and Berkeley DB Java Edition.
The answer is that the on disk format is different for the two
products, and data files cannot be shared between the two. Both
products do share the same format for the data dump and load utilities (com.sleepycat.je.util.DbDump,
com.sleepycat.je.util.DbLoad), so you can import and
export data between the two products.
Also, JE data files are platform independent, and
can be moved from one machine to another.
Lastly, both products support the Direct Persistence
Layer API, the persistent Java Collections API and a similar byte
array based API.
Does JE support high performance LOBs
(large objects)?
JE supports get() and put() operations with
partial data. However, this feature is not fully optimized, since the
entire record is always read or written to the database, and the entire
record is cached.
So the only advantage (currently) to using
partial get() and put() operations is that only a portion of the record
is copied to or from the buffer supplied by the application. In the
future we may provide optimizations of this feature, but until then we
cannot claim that JE has high performance LOB support.
For more information on partial get() and put()
operations please see our documentation.
Does JE support key prefixing or key
compression?
Key prefixing is a database storage technique which reduces the
space used to store B-Tree keys. It is useful for applications with
large keys that have similar prefixes. JE supports key prefixing
as of version 3.3.62. See
DatabaseConfig.setKeyPrefixing.
JE does not currently support key
compression. While we have thought about it for both the DB and JE
products, there are issues with respect to the algorithm that is used,
the size of the key, and the actual values of the key. For example,
LZW-style compression works well, but needs a lot of bytes to compress
to be effective. If you're compressing individual keys, and they're
relatively small, LZW-style compression is likely to make the key
bigger, not smaller.
How can I set JE configuration
properties?
JE configuration properties can be programmatically specified
through Base API (com.sleepycat.je) classes such as
EnvironmentConfig, DatabaseConfig, StatsConfig,
TransactionConfig, and CheckpointConfig. When
using the replication (com.sleepycat.je.rep) package,
ReplicatedEnvironment properties may be set using
ReplicationConfig. When using the DPL
(com.sleepycat.persist) package,
EntityStore configuration properties may be set using
StoreConfig. The application instantiates one of
these configuration classes and sets the desired values.
For Environment and
ReplicatedEnvironment configuration properties,
there's a second configuration option, which is the
je.properties file. Any property set through the
get/set methods in the EnvironmentConfig and
ReplicationConfig classes can also be specified by
creating a je.properties file in the environment home
directory. Properties set through je.properties take
precedence, and give the application the option of changing
configurations without recompiling the application. All properties
that can be specified in je.properties can also be set
through EnvironmentConfig.setConfigParam or
ReplicationConfig.setConfigParam.
The complete set of Environment and
ReplicatedEnvironment properties is documented in the
EnvironmentConfig and ReplicationConfig
classes. The javadoc for each property describes the allowed
values, default value and whether the property is mutable.
Mutable properties can be changed after the environment is opened.
Properties not documented in these classes are experimental and
some may be phased out over time, while others may be promoted and
documented.
How can insertion-ordered records or
sequences be used?
The general capability for assigning IDs is a
"sequence", and has the same functionality as a SQL SEQUENCE. The idea
of a sequence is that it allocates values efficiently (without causing
a performance bottleneck), and guarantees that the same value won't be
used twice.
When using the DPL, the
@PrimaryKey(sequence="...") annotation may be used to
define a sequence. When using the Base API, the
Sequence class provides a lower level form of sequence
functionality, and an example case is in
<jeHome>/examples/je/SequenceExample.java.
How do I add fields to an existing
tuple format when using the Java bindings?
If you are currently storing objects using a
TupleBinding, it is possible to add fields to the tuple without
converting your existing databases and without creating a data
incompatibility. Please note also that class evolution is supported
without any application level coding through the Direct Persistence
Layer API.
This excerpt from the Javadoc (Collections Overview) describes the
changes that may be made to tuple bindings:
The tuple binding uses less space and executes
faster than the serial binding. But once a tuple is written to a
database, the order of fields in the tuple may not be changed and
fields may not be deleted. The only type evolution allowed is the
addition of fields at the end of the tuple, and this must be explicitly
supported by the custom binding implementation.
Specifically, if your type changes are limited to
adding new fields then you can use the TupleInput.available()
method to check whether more fields are available for reading. The available()
method is the implementation of java.io.InputStream.available().
It returns the number of bytes remaining to be read. If the return
value is greater than zero, then there is at least one more field to be
read.
When you add a field to your database record
definition, in your TupleBinding.objectToEntry
method you should unconditionally write all fields including the
additional field.
In your TupleBinding.entryToObject
method you should call available() after
reading all the original fixed fields. If it returns a value greater
than zero, you know that the record contains the new field and you can
read it. If it returns zero, the record does not contain the new field.
For example:
public Object entryToObject(TupleInput input) {
    // Read all original fields first, unconditionally.
    if (input.available() > 0) {
        // Read additional field #1
    }
    if (input.available() > 0) {
        // Read additional field #2
    }
    // etc
}
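The same pattern can be tried out without a JE environment. The following is a minimal sketch that uses java.io.DataInputStream over a byte array as a stand-in for TupleInput; it is not the JE binding API. The Person record, its fields, and the empty-string default for the added field are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TrailingFieldDemo {

    static final class Person {
        final String name;
        final int age;
        final String email;   // field added in the newer record format

        Person(String name, int age, String email) {
            this.name = name;
            this.age = age;
            this.email = email;
        }
    }

    /** Old format: name and age only. */
    static byte[] writeOld(String name, int age) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(name);
            out.writeInt(age);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** New format: unconditionally writes every field, including the new one. */
    static byte[] writeNew(Person p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(p.name);
            out.writeInt(p.age);
            out.writeUTF(p.email);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads either format; a record without the new field gets "". */
    static Person read(byte[] record) {
        try {
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(record));
            String name = in.readUTF();   // original fields, unconditionally
            int age = in.readInt();
            // Bytes remaining means the record was written in the new format.
            String email = in.available() > 0 ? in.readUTF() : "";
            return new Person(name, age, email);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Records written by writeOld and writeNew can sit side by side in the same database, because read decides per record whether the trailing field is present.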
How do I build a simple Servlet using
Berkeley DB Java Edition?
Below is a simple Servlet example that uses JE.
It opens a JE Environment in the init method and then reads all the
data out of it in the doGet() method.
import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;
import com.sleepycat.je.Cursor;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;

/**
 * A simple servlet that opens a JE environment in init() and dumps
 * the contents of one database in doGet().
 */
public class HelloWorldExample extends HttpServlet {

    private Environment env = null;
    private Database db = null;

    public void init(ServletConfig config)
        throws ServletException {

        super.init(config);
        try {
            // Example location; use the environment home of your deployment.
            openEnv("c:/temp");
        } catch (DatabaseException dbe) {
            dbe.printStackTrace(System.out);
            throw new UnavailableException(dbe.toString());
        }
    }

    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
        throws IOException, ServletException {

        ResourceBundle rb =
            ResourceBundle.getBundle("LocalStrings", request.getLocale());
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        String title = rb.getString("helloworld.title");
        out.println("<title>" + title + "</title>");
        out.println("</head>");
        out.println("<body bgcolor=\"white\">");
        out.println("<a href=\"../helloworld.html\">");
        out.println("<img src=\"../images/code.gif\" height=24 " +
                    "width=24 align=right border=0 alt=\"view code\"></a>");
        out.println("<a href=\"../index.html\">");
        out.println("<img src=\"../images/return.gif\" height=24 " +
                    "width=24 align=right border=0 alt=\"return\"></a>");
        out.println("<h1>" + title + "</h1>");
        dumpData(out);
        out.println("</body>");
        out.println("</html>");
    }

    public void destroy() {
        closeEnv();
    }

    /* Walks the whole database with a cursor and prints each key/data pair. */
    private void dumpData(PrintWriter out) {
        try {
            long startTime = System.currentTimeMillis();
            out.println("<pre>");
            Cursor cursor = db.openCursor(null, null);
            try {
                DatabaseEntry key = new DatabaseEntry();
                DatabaseEntry data = new DatabaseEntry();
                while (cursor.getNext(key, data, LockMode.DEFAULT) ==
                       OperationStatus.SUCCESS) {
                    out.println(new String(key.getData()) + "/" +
                                new String(data.getData()));
                }
            } finally {
                // Always release the cursor, even if the walk fails.
                cursor.close();
            }
            long endTime = System.currentTimeMillis();
            out.println("Time: " + (endTime - startTime));
            out.println("</pre>");
        } catch (DatabaseException dbe) {
            out.println("Caught exception: ");
            dbe.printStackTrace(out);
        }
    }

    private void openEnv(String envHome)
        throws DatabaseException {

        EnvironmentConfig envConf = new EnvironmentConfig();
        env = new Environment(new File(envHome), envConf);
        DatabaseConfig dbConfig = new DatabaseConfig();
        dbConfig.setReadOnly(true);
        db = env.openDatabase(null, "testdb", dbConfig);
    }

    private void closeEnv() {
        try {
            // Close the database before the environment that contains it.
            if (db != null) {
                db.close();
            }
            if (env != null) {
                env.close();
            }
        } catch (DatabaseException dbe) {
            dbe.printStackTrace(System.out);
        }
    }
}
How do I verify that the configuration
settings that I made in my je.properties file have taken effect?
You can use the Environment.getConfig()
API to retrieve configuration information after the Environment has
been created. For example:
import java.io.File;
import com.sleepycat.je.*;

public class GetParams {
    public static void main(String[] args)
        throws Exception {

        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setTransactional(true);
        envConfig.setAllowCreate(true);
        Environment env = new Environment(new File("/temp"), envConfig);
        // Ask the open environment for the configuration actually in effect.
        EnvironmentConfig newConfig = env.getConfig();
        System.out.println(newConfig.getCacheSize());
        env.close();
    }
}
Running this program displays the cache size in effect, for example:
> java GetParams
7331512
Note that you have to call getConfig(),
rather than query the EnvironmentConfig that
was used to create the Environment.
How does JE Concurrent Data Store
(CDS) differ from JE Transactional Data Store (TDS)?
Berkeley DB, Java Edition comes in two flavors,
Concurrent Data Store (CDS) and Transactional Data Store (TDS). The
difference between the two products lies in whether you use
transactions or not. Literally speaking, you are using TDS if you call
the public API method, EnvironmentConfig.setTransactional(true).
Both products support multiple concurrent reader
and writer threads, and both create durable, recoverable databases.
We're using "durability" in the database sense, which means that the
data is persisted to disk and will reappear if the application comes
back up after a crash. What transactions provide is the ability to
group multiple operations into a single, atomic element, the ability to
undo operations, and control over the granularity of durability.
For example, suppose your application has two
databases, Person and Company. To insert new data, your application
issues two operations, one to insert into Person, and another to insert
into Company. You need transactions if your application would like to
group those operations together so that the inserts only take effect if
both operations are successful.
Note that an additional issue is whether you use
secondary indices in JE. Suppose you have a secondary index on the
address field in Person. Although it only takes one method call into JE
to update both the Person database and its secondary index Address, the
application needs to use transactions to make the update atomic.
Otherwise, it's possible that if the system crashed at a given point,
Person could be updated but not Address.
Transactions also let you explicitly undo a set
of operations by calling Transaction.abort(). Without transactions, all
modifications are final after they return from the API call.
Lastly, transactions give you finer-grained
durability. After calling Transaction.commit, the modification is
guaranteed to be durable and recoverable. In CDS, without transactions,
the database is guaranteed to be durable and recoverable back to the
last Environment.sync() call, which can be an expensive operation.
Note that there are different flavors of
Transaction.commit that let you trade off levels of durability and
performance, explained in this FAQ
entry.
So in summary, choose CDS when:
- Transactional data protection is not required.
- The application does not need a guarantee that
secondary indices are consistent with primary indices.
- The application does not need fine grained
durability.
Choose TDS when:
- Full transactional semantics, including the
ability to transactionally protect groups of operations, is a
requirement.
- Recovery of committed data is a requirement.
- Secondary indices are used and must be
guaranteed to be in sync with primary indices.
There is a single download and jar file for both
products. Which one you use is a licensing issue, and has no
installation impact.
Is a Berkeley DB database the same as
a SQL "table"?
Yes; "tables" are databases, "rows" are key/data
pairs, and "columns" are application-encapsulated fields. The
application must provide its own methods for accessing a specific
field, or "column" within the data value.
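As a minimal sketch of what such application-provided accessors might look like, the class below packs three fields into the byte array that would be stored as the data value and reads each "column" back by offset. The record layout, field names, and class name are hypothetical; in practice the Java bindings or the DPL usually take care of this encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical fixed layout: 4-byte department id, 8-byte salary, UTF-8 name. */
final class EmployeeRecord {

    private static final int DEPARTMENT_OFFSET = 0;
    private static final int SALARY_OFFSET = 4;
    private static final int NAME_OFFSET = 12;

    /** Builds the byte array that would be stored as the data value. */
    static byte[] encode(int departmentId, long salaryCents, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(NAME_OFFSET + nameBytes.length)
            .putInt(departmentId)
            .putLong(salaryCents)
            .put(nameBytes)
            .array();
    }

    /** "Column" accessors: each reads one field without decoding the others. */
    static int departmentId(byte[] data) {
        return ByteBuffer.wrap(data).getInt(DEPARTMENT_OFFSET);
    }

    static long salaryCents(byte[] data) {
        return ByteBuffer.wrap(data).getLong(SALARY_OFFSET);
    }

    static String name(byte[] data) {
        return new String(data, NAME_OFFSET, data.length - NAME_OFFSET,
                          StandardCharsets.UTF_8);
    }
}
```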
Is it considered best practice to keep
databases closed when not in use?
The memory overhead for keeping a database open
is quite small. In general, it is expected that applications will keep
databases open for as long as the environment is open. The exception
may be an application that has a very large number of databases and
only needs to access a small subset of them at any one time.
If you notice that your application is short on
memory because you have too many databases open, then consider only
opening those you are using at any one time. Pooling open database
handles could be considered at that point, if the overhead of opening
databases each time they are used has a noticeable performance impact.
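One possible shape for such a pool is a small least-recently-used cache that opens a handle on demand and closes the handle that has gone unused the longest once a limit is reached. This is only a sketch under stated assumptions: HandlePool is a hypothetical class, the opener and closer callbacks stand in for whatever code opens and closes a database in your application, and it assumes an evicted handle is not still in use by another thread.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical LRU pool of open handles, keyed by database name. */
final class HandlePool<H> {

    private final int maxOpen;
    private final Function<String, H> opener;
    private final Consumer<H> closer;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, H> open =
        new LinkedHashMap<>(16, 0.75f, true);

    HandlePool(int maxOpen, Function<String, H> opener, Consumer<H> closer) {
        this.maxOpen = maxOpen;
        this.opener = opener;
        this.closer = closer;
    }

    /** Returns the open handle for name, opening it first if necessary. */
    synchronized H get(String name) {
        H handle = open.get(name);
        if (handle == null) {
            handle = opener.apply(name);
            open.put(name, handle);
            evictIfNeeded();
        }
        return handle;
    }

    synchronized int openCount() {
        return open.size();
    }

    /** Closes least recently used handles until the pool is within its limit. */
    private void evictIfNeeded() {
        Iterator<Map.Entry<String, H>> it = open.entrySet().iterator();
        while (open.size() > maxOpen && it.hasNext()) {
            H eldest = it.next().getValue();
            it.remove();
            closer.accept(eldest);
        }
    }
}
```

With JE, the opener would typically wrap a call that opens the named database and the closer would close it; exception handling for those calls is left out of the sketch.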
What is so different about JE log
files?
See
Six Things Everyone Should Know about JE Log Files.
You'll find more about log files in the Getting Started Guide.
What is the smallest cache size I can
set with JE?
The smallest cache size is 96KB (96
* 1024). You can set this by either calling EnvironmentConfig.setCacheSize(96
* 1024) on the EnvironmentConfig
instance that you use to create your environment, or by setting the je.maxMemory
property in your je.properties
file.
Why don’t Berkeley DB and
Berkeley DB Java Edition both implement a shared set of Java interfaces
for the API? Why are these two similar APIs in different Java packages?
In the past, we've discussed whether it makes
sense to provide a set of interfaces that are implemented by the
Berkeley DB JE API and the Java API for Berkeley DB. We looked into
this during the design of JE and decided against it because in general
it would complicate things for "ordinary" users of both JE and DB.
- There are architectural differences between
the two products that require that applications are cognizant of which
product is being used. For example, in DB database names consist of a
(filename, database name) pair; since JE doesn't have database files,
only a database name is required.
- The constants classes like LockMode and
OperationStatus could be faked in a common package, but there is no
guarantee that they will be the same between products and releases.
- Classes that applications construct directly
(like DatabaseEntry) are also problematic: we could have common
interfaces and a factory in a common package, but that doesn't allow
for subclassing and presents problems for callbacks like
SecondaryKeyCreator.
- The exception classes are more problematic:
even if we moved DatabaseException into the common package (breaking
applications in the process), applications using the common interfaces
would need to explicitly catch exceptions from both packages.
Otherwise, we would need to unify what exceptions are thrown from DB
and JE, and given that DB exceptions are generated based on C error
codes, there is no way we would ever get that right.
Does Berkeley DB Java Edition run
within J2ME?
JE requires Java SE 1.5 or later. There are no
plans to support J2ME at this time.
Where does the je.jar file belong when
loading within an application server?
It is important that je.jar and your application
jar files—in particular the classes that are being serialized
by SerialBinding—are loaded under the same class loader. For
running in a servlet, this typically means that you would place je.jar
and your application jars in the same directory.
Additionally, it is important to not place je.jar
in the extensions directory for your JVM. Instead place je.jar in the
same location as your application jars. The extensions directory is
reserved for privileged library code.
One user with a WebSphere Studio (WSAD)
application had a classloading problem because the je.jar was in both
the WEB-INF/lib and the ear project. Removing the je.jar from the ear
project resolved the problem.
How do I debug a lock timeout?
The common cause of a
com.sleepycat.je.LockConflictException is the
situation where 2 or more transactions are deadlocked because
they're waiting on locks that the other holds. For example:
- transaction 1 has a lock on record A, and
wants a (exclusive) write lock on record B.
- transaction 2 has a lock on record B, and
wants a (exclusive) write lock on record A.
The lock timeout message may give you insight into the nature of the
contention. Besides the default timeout message, which lists the
contending lockers, their transactions, and other waiters, it's also
possible to enable tracing that will display the stacktraces of where
locks were acquired.
Stacktraces can be added to a deadlock message by
setting the je.txn.deadlockStackTrace property through your
je.properties file or EnvironmentConfig. This should only be set during
debugging because of the added memory and processing cost.
Enabling stacktraces gives you more information
about the target of contention, but it may be necessary to also examine
what locks the offending transactions hold. That can be done through
your application's knowledge of current activity, or by setting the
je.txn.dumpLocks property. Setting je.txn.dumpLocks will make the
deadlock exception message include a dump of the entire lock table, for
debugging. The output of the entire lock table can be large, but is
useful for determining the locking relationships between records.
Another note, which doesn't impact the deadlock
itself, is that the default setting for lock timeouts (specified by
je.lock.timeout or EnvironmentConfig.setLockTimeout()) can be too long
for some applications with contention, and throughput improves when
this value is decreased. However, this issue only affects performance,
not true deadlocks.
NIO issues with JE
In JE 4.0 and later releases on the 4.x.y line,
and JE 3.3.92 and later
releases on the 3.3.x line, the NIO parameters je.log.useNIO,
je.log.directNIO,
and je.log.chunkedNIO are deprecated.
Setting them has no effect.
In JE 3.3.91 and earlier, the NIO parameters are functional,
but should never be used since they are now known to cause data corrupting bugs
in JE.
What is a safe way to stop threads in
a JE application?
Calling Thread.interrupt()
is not recommended for an active JE thread if the goal is to stop the
thread or do thread coordination. If you interrupt a thread which is
executing a JE operation, the state of the database will be undefined.
That's because JE might have been in the middle of I/O activity when
the operation was aborted midstream, and it becomes very difficult to
detect and handle all possible outcomes.
If JE can detect the interrupt, it will mark the
environment as unusable and will throw a RunRecoveryException.
This tells you that you must close the environment and re-open it
before using it again. If JE doesn't throw RunRecoveryException,
it is very likely that you would get some other exception that is less
meaningful, or simply see corrupted data.
Instead, applications should use other mechanisms
like Object.notify() and wait() to coordinate threads. For example, use
a "keepRunning" variable of some kind in each thread. Check this
variable in your threads, and return from the thread when it is false.
Set it to false when you want to stop the thread. If this thread is
waiting to be woken up to do another unit of work, use Object.notify to
wake it up. This is the recommended technique.
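The technique can be sketched as follows. StoppableWorker is a hypothetical class; the flag is guarded by a lock object rather than declared volatile so that the same monitor can be used for wait() and notify(), and this variant finishes any queued work before it stops.

```java
/** Hypothetical worker that stops via a flag instead of Thread.interrupt(). */
final class StoppableWorker implements Runnable {

    private final Object lock = new Object();
    private boolean keepRunning = true;   // guarded by lock
    private int pendingWork = 0;          // guarded by lock
    private int completed = 0;            // guarded by lock

    /** Hands the worker one unit of work and wakes it up. */
    void submit() {
        synchronized (lock) {
            pendingWork++;
            lock.notifyAll();
        }
    }

    /** Asks the worker to finish queued work and then return from run(). */
    void shutdown() {
        synchronized (lock) {
            keepRunning = false;
            lock.notifyAll();
        }
    }

    int completedCount() {
        synchronized (lock) {
            return completed;
        }
    }

    @Override
    public void run() {
        while (true) {
            synchronized (lock) {
                while (keepRunning && pendingWork == 0) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;   // not expected: nothing interrupts this thread
                    }
                }
                if (pendingWork == 0) {
                    return;       // keepRunning is false and nothing is queued
                }
                pendingWork--;
            }
            // A real application would run its JE operations here, outside
            // the lock, and always let them run to completion.
            synchronized (lock) {
                completed++;
            }
        }
    }

    /** Starts a worker, feeds it work, stops it, and returns units completed. */
    static int runDemo(int units) {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker, "worker");
        thread.start();
        for (int i = 0; i < units; i++) {
            worker.submit();
        }
        worker.shutdown();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return worker.completedCount();
    }
}
```

Because the thread only leaves run() between units of work, no JE operation is ever cut off midstream.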
If you absolutely must interrupt threads for some
reason, you should expect that you will see RunRecoveryException.
Each thread should treat this exception as an indication that it should
stop.
Why does my application sometimes get a DbChecksumException when running on Windows 7?
We believe that Windows 7 (as of build 7600) has an IO bug
which is triggered by JE 3.3.91 and earlier under certain conditions.
Unfortunately, this bug causes file corruption which is eventually detected
by JE's checksumming mechanism. JE 4.0 (and later) has a "write queue"
mechanism built into it which prevents this bug from being triggered. Because
JE 3.3.91 (and earlier releases) were shipped before Windows 7, we were not
aware of the bug and therefore not able to include preventive code. JE 3.3.92
detects if it is running on Windows 7 and prevents triggering the bug. We have
reported this bug to Microsoft. [#17865]
JE 2.0 has support for XA transactions
in a J2EE app server environment. Can I use XA transactions (2 phase
commit) in a non-J2EE environment?
Yes. The com.sleepycat.je.XAEnvironment
class implements the javax.transaction.xa.XAResource interface, which
can be used to perform 2 phase commit transactions. The relevant
methods in this interface are start(), end(), prepare(),
commit(), and rollback(). The XAEnvironment.setXATransaction()
is an internal entrypoint that is only public for the unit tests.
The XA Specification has the concept of implicit
transactions (a transaction that is associated with a thread and does
not have to be passed to the JE API); this is supported in JE 2.0. You
can use the XAResource.start() method to
create a JE transaction and join it to the calling thread. To
disassociate a transaction from a thread, use the end()
method. When you use thread-implied transactions, you do not have to
pass in a Transaction argument to the JE API (e.g. through methods such
as get() and put()).
Instead, passing null in a thread-implied transaction environment tells
JE to use the implied transaction.
Here's a small example of how to use
XAEnvironment and 2 Phase Commit:
XAEnvironment env = new XAEnvironment(home, null);
Xid xid = [something...];
env.start(xid, 0); // creates a thread-implied transaction for you
// ... calls to get/put, etc. with null transaction arg will use the
// implicit transaction ...
env.end(xid, 0); // disassociate this thread from the implied transaction
env.prepare(xid);
if (commit) {
    env.commit(xid, false);
} else {
    env.rollback(xid);
}
Your application must create an Xid
class that implements javax.transaction.xa.Xid
in order to create a transaction identifier.
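As a minimal sketch of such a class: the name SimpleXid, its constructor, and the choice to copy the byte arrays defensively are all illustrative assumptions, not part of JE. Only the javax.transaction.xa.Xid interface and its MAXGTRIDSIZE and MAXBQUALSIZE constants come from the JDK. Value-based equals() and hashCode() are included on the assumption that a resource manager may look transactions up by their Xid.

```java
import java.util.Arrays;
import javax.transaction.xa.Xid;

/** Hypothetical minimal transaction identifier; not part of JE. */
final class SimpleXid implements Xid {
    private final int formatId;
    private final byte[] gtrid; // global transaction id
    private final byte[] bqual; // branch qualifier

    SimpleXid(int formatId, byte[] gtrid, byte[] bqual) {
        if (gtrid.length > Xid.MAXGTRIDSIZE || bqual.length > Xid.MAXBQUALSIZE) {
            throw new IllegalArgumentException("identifier too long");
        }
        this.formatId = formatId;
        this.gtrid = gtrid.clone();
        this.bqual = bqual.clone();
    }

    @Override public int getFormatId() { return formatId; }
    @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    @Override public byte[] getBranchQualifier() { return bqual.clone(); }

    // Two Xids are equal when all three components match.
    @Override public boolean equals(Object o) {
        if (!(o instanceof Xid)) {
            return false;
        }
        Xid other = (Xid) o;
        return formatId == other.getFormatId()
            && Arrays.equals(gtrid, other.getGlobalTransactionId())
            && Arrays.equals(bqual, other.getBranchQualifier());
    }

    @Override public int hashCode() {
        return 31 * (31 * formatId + Arrays.hashCode(gtrid)) + Arrays.hashCode(bqual);
    }
}
```

An instance such as new SimpleXid(1, "txn-42".getBytes(), "branch-1".getBytes()) could then be passed to the start(), end(), prepare(), commit(), and rollback() calls; the identifier values here are made up for illustration.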
How can a join be combined with a
range query?
Imagine an application where a single primary
employee database has three fields that are indexed by secondary
databases: status, department, salary. The user wishes to query for a
specific status, a specific department, and range of salary values.
Berkeley DB supports joins, and the join API can be used to select the
AND (intersection) of a specific status and a specific department.
However, the join API cannot be used to select a range of salaries.
Berkeley DB also supports range searches, making it possible to iterate
over a range of values using a secondary index such as a salary index.
However, there is no way to automatically combine a range search and a
join.
To combine a range search and a join you'll need
to first perform one of the two using a Berkeley DB API, and then
perform the other manually as a "filter" on the results of the first.
So you have two choices:
- Perform the range query using the Berkeley DB
API. Iterate through the range, and manually select (filter) the
records that meet your join qualifications.
- Perform the join using the Berkeley DB API.
Iterate through the results of the join, and manually select the
records that meet your range search qualifications.
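The first choice can be sketched without any JE calls. In the sketch below a java.util.TreeMap stands in for the salary secondary index, and the Employee class, field names, and sample values are hypothetical. With JE the outer loop would be a cursor walk over the salary index, but the filtering step has the same shape.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class RangeThenFilter {
    /** Hypothetical employee record with the three indexed fields. */
    static final class Employee {
        final String name;
        final String status;
        final String department;
        final int salary;

        Employee(String name, String status, String department, int salary) {
            this.name = name;
            this.status = status;
            this.department = department;
            this.salary = salary;
        }
    }

    /** Builds an in-memory stand-in for a salary index that allows duplicates. */
    static NavigableMap<Integer, List<Employee>> index(Employee... employees) {
        NavigableMap<Integer, List<Employee>> idx = new TreeMap<>();
        for (Employee e : employees) {
            idx.computeIfAbsent(e.salary, k -> new ArrayList<>()).add(e);
        }
        return idx;
    }

    /** Walks the salary range first, then filters on the join criteria. */
    static List<String> query(NavigableMap<Integer, List<Employee>> salaryIndex,
                              int minSalary, int maxSalary,
                              String status, String department) {
        List<String> names = new ArrayList<>();
        for (List<Employee> sameSalary
                 : salaryIndex.subMap(minSalary, true, maxSalary, true).values()) {
            for (Employee e : sameSalary) {
                // Manual "join": keep only records matching both fixed values.
                if (e.status.equals(status) && e.department.equals(department)) {
                    names.add(e.name);
                }
            }
        }
        return names;
    }
}
```

The second choice is the mirror image: iterate the join results and keep only the records whose salary falls inside the requested range.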
Which option performs best depends on whether the
join or the range query will produce a smaller result set, on average.
If the join produces a smaller result set, use option 2; otherwise, use
option 1. There is a third option to consider if this particular query is
performance critical. You could create a secondary index on preassigned
ranges of salary. For example, assign the secondary key 1 for salaries
between $10,000 and $19,999, key 2 for salaries between $20,000 and
$29,999, etc.
If a query specifies only one such salary range,
you can perform a join using all three of your secondary indices, with
no filtering after the join. If the query spans ranges, you'll have to
do multiple joins and then union the results. If the query specifies a
partial range, you'll have to filter out the non-matching results. This
may be quite complex, but it can be done if necessary. Before
performing any such optimization, be sure to measure performance of
your queries to make sure the optimization is worthwhile.
If you can limit the specified ranges to only
those that you've predefined, that will actually simplify things rather
than make them more complex, and will perform very well also. In this
case, you can always perform a single join with no filtering. Whether
this is practical depends on whether you can constrain the queries to
use predefined ranges.
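The bucket arithmetic behind the preassigned ranges can be sketched as follows. The class and method names are hypothetical, the $10,000 bucket width follows the example above, and salaries are assumed to be non-negative integers with min no greater than max.

```java
final class SalaryBuckets {
    static final int BUCKET_WIDTH = 10_000;

    /** Secondary key for a salary: 1 for 10,000..19,999, 2 for 20,000..29,999, etc. */
    static int bucketKey(int salary) {
        return salary / BUCKET_WIDTH;
    }

    /** Bucket keys touched by the query range [min, max]: one join per key. */
    static int[] bucketsFor(int min, int max) {
        int first = bucketKey(min);
        int last = bucketKey(max);
        int[] keys = new int[last - first + 1];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = first + i;
        }
        return keys;
    }

    /** True when the range covers whole buckets only, so no filtering is needed. */
    static boolean alignedToBuckets(int min, int max) {
        return min % BUCKET_WIDTH == 0 && (max + 1) % BUCKET_WIDTH == 0;
    }
}
```

A query for salaries 15,000 through 32,000 touches buckets 1, 2, and 3 and is not bucket-aligned, so it would need three joins, a union, and a filter on the results. A query for 10,000 through 19,999 needs a single join and no filter.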
On range searches in general, they can be done
with Cursor.getSearchKeyRange or with the SortedSet.subSet and
SortedMap.subMap methods, depending on whether you are using the base
API or the Collections API. It is up to you which to use.
If you use Cursor.getSearchKeyRange you'll need
to call getNext to iterate through the results. You'll have to watch
for the end range yourself by checking the key returned by getNext.
This API does not have a way to enforce range end values automatically.
If you use the Collections API you can call
subMap or subSet and get an Iterator on the resulting collection. That
iterator will enforce both the beginning and the end of the range
automatically.
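The difference between the two styles can be illustrated with plain java.util collections. This is only an analogy: TreeMap.ceilingEntry and higherEntry stand in for Cursor.getSearchKeyRange and getNext, and the class name and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class RangeEndCheck {
    /** Sample index with keys 10, 20, 30, 40. */
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> m = new TreeMap<>();
        for (int k = 10; k <= 40; k += 10) {
            m.put(k, "record-" + k);
        }
        return m;
    }

    /** Cursor style: position at the first key >= from, then stop manually. */
    static List<Integer> cursorStyle(TreeMap<Integer, String> index,
                                     int from, int toExclusive) {
        List<Integer> keys = new ArrayList<>();
        Map.Entry<Integer, String> e = index.ceilingEntry(from); // like getSearchKeyRange
        while (e != null) {
            if (e.getKey() >= toExclusive) {
                break; // the caller must detect the end of the range itself
            }
            keys.add(e.getKey());
            e = index.higherEntry(e.getKey()); // like getNext
        }
        return keys;
    }

    /** Collections style: subMap enforces both ends of the range. */
    static List<Integer> subMapStyle(SortedMap<Integer, String> index,
                                     int from, int toExclusive) {
        return new ArrayList<>(index.subMap(from, toExclusive).keySet());
    }
}
```

Both methods return the same keys for the same range; the only difference is who is responsible for the end-of-range check.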
How do I perform a custom sort of
secondary duplicates?
If you have a secondary database with sorted
duplicates configured, you may wish to sort the duplicates according to
some other field in the primary record. Let's say your secondary key is
F1 and you have another field in your primary record, F2, that you wish
to use for ordering duplicates. You would like to use F1 as your
secondary key, with duplicates ordered by F2.
In Berkeley DB, the "data" for a secondary
database is the primary key. When duplicates are allowed in a
secondary, the duplicate comparison function simply compares those
primary key values. Therefore, a duplicate comparison function cannot
be used to sort by F2, since the primary record is not available to the
comparison function.
The purpose of key and duplicate comparison
functions in Berkeley DB is to allow sorting values in some way other
than simple byte-by-byte comparison. In general it is not intended to
provide a way to order keys or duplicates using record data that is not
present in the key or duplicate entry. Note that the comparison
functions are called very often—whenever any Btree operation
is performed—so it is important that the comparison be fast.
There are two ways you can accomplish sorting by
F2:
- Instead of using F1 as the secondary key, use a
concatenated key F1+F2 as the secondary key. When you wish to do a
lookup by F1, use a range search (Cursor.getSearchKeyRange).
- Use F1 as the secondary key, as you are already
doing. When you query the duplicates for F1, sort them manually by F2
after you query them. Since when you query the secondary you will have
the primary record in hand, F2 will be available for sorting.
Option #1 has the advantage of automatically
sorting by F2. However, you will never be able to do a join (via the
Database.join method) on the F1 key alone. You will be able to do a
join on the F1+F2 value, but it seems unlikely that will be useful.
Secondaries are often used for joins. Therefore,
we recommend option #2 unless you are quite sure that you won't need to
do a join on F1.
The trade-offs are:
- Option #1 does not allow performing a join on F1.
- Option #1 has larger secondary keys (more
overhead).
- Option #2 requires programming the sort by F2
manually.
- Option #2 requires enough memory to sort all
duplicates for a given key F1.
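Option #1 can be sketched with a sorted in-memory map standing in for the secondary database. The class name, the NUL separator, and the sample values are assumptions made for illustration. The sketch also assumes that the separator never occurs inside F1 or F2 and that each (F1, F2) pair is unique.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CompositeSecondaryKey {
    private static final char SEP = '\u0000'; // assumed absent from F1 and F2

    /** Concatenated secondary key F1+F2. */
    static String key(String f1, String f2) {
        return f1 + SEP + f2;
    }

    /** Sample secondary: composite key -> primary key. */
    static TreeMap<String, String> sample() {
        TreeMap<String, String> secondary = new TreeMap<>();
        secondary.put(key("red", "b"), "pk2");
        secondary.put(key("red", "a"), "pk1");
        secondary.put(key("redder", "a"), "pk3");
        secondary.put(key("blue", "z"), "pk4");
        return secondary;
    }

    /** Lookup by F1 alone: range search on the prefix, results ordered by F2. */
    static List<String> primaryKeysForF1(TreeMap<String, String> secondary, String f1) {
        String prefix = f1 + SEP;
        List<String> primaryKeys = new ArrayList<>();
        for (Map.Entry<String, String> e : secondary.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // walked past the last key for this F1
            }
            primaryKeys.add(e.getValue());
        }
        return primaryKeys;
    }
}
```

The separator keeps "red" from matching "redder", and the results come back already ordered by F2, which is the benefit this option buys at the cost of larger keys.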
What is the best way to access
duplicate records when not using collections?
Duplicate records are records that are in a single database and
have the same key. Since there is more than one record per key,
a simple lookup by key is not sufficient to find all duplicate
records for that key.
When using the DPL, the SecondaryIndex.subIndex
method is the simplest way to access duplicate records.
When using the Base API, you need to position a cursor at the
desired key, and then retrieve all the subsequent duplicate
records. The Getting Started Guide has a good section on how to
position your cursor:
Search For Records
and then how to retrieve the rest of the duplicates:
Working with Duplicate Records.
What's the best way to get a count of
either all objects in a database, or all objects that match a join
criteria?
As with most Btree based data stores, Berkeley DB
Java Edition does not store record counts for non-duplicate records, so
some form of internal or application based traversal is required to get
the size of the result set. This is in general true of relational
databases too; it's just that the count is done for you internally when
the SQL count statement is executed. Berkeley DB Java Edition version
3.1.0 introduced a Database.count() method, which returns the number of
all key/data pairs in the database. This method does an optimized,
internal traversal, does not impact the working set in the cache, but
may not be accurate in the face of concurrent modifications in the
database.
To get a transactionally current count, or to
count the result of a join, do this:
cursor = db.openCursor(...); // or db.join(someCursors, ...) to count a join
count = 0;
while (cursor.getNext(...) == OperationStatus.SUCCESS) {
    count++;
}
There are a few ways to optimize an
application-implemented count:
- Counts are stored for duplicate record sets.
One can optimize counts for databases with duplicate records by using
Cursor.count()
which returns the number of records that share the current record's key
value. Suppose you want to count all the records in a database that
supports duplicates and contains 3000 records, but only 3 distinct
keys. In that case, it would be far more efficient to do:
count = 0;
while (cursor.getNextNoDup(...) == OperationStatus.SUCCESS) {
    count += cursor.count();
}
because you will only look up 3 records (one for each key value), not
3000 records.
- If your database has large records, another
option when counting is to set
DatabaseEntry.setPartial(0, 0,
true) on the key and data DatabaseEntry to reduce the
overhead of returning large records.
- If you do not need a transactional count, using
LockMode.READ_UNCOMMITTED will be faster,
especially in combination with calling
DatabaseEntry.setPartial(0, 0, true) for the data
entry. When read-uncommitted is used and no data is returned,
only the key is read and much less I/O is performed. When
using the DPL, the same thing can be accomplished by using
LockMode.READ_UNCOMMITTED in combination with
EntityIndex.keys(). The DPL keys
method calls DatabaseEntry.setPartial(0, 0, true)
for you.
- If you are counting the entire database and
Database.count() is not supported in your JE version, you can also use
the value of
BtreeStats.getLeafNodeCount(),
obtained from Database.getStats() under
certain circumstances. This returns a valid count of the number of
records in the database, but because it is obtained without locks or
transactions the count is only correct when the database is quiescent.
In addition, although stats generation takes advantage of some internal
code paths, it may consume more memory when analyzing large databases.
Which are better: Private vs Shared
Database instances?
Using a single Database instance for multiple threads is
supported and, as of JE 4.0, has no performance drawbacks.
In JE 3.3 and earlier, using a single Database instance for
multiple threads presented a minor bottleneck. The issue is that
the Database object maintains a set of Cursors open against it.
This set is used to check that all Cursors against the Database
are closed when close() is called, but JE has to synchronize on
the set before updating it. So if multiple threads share the same
Database handle, the set becomes a synchronization bottleneck.
With those releases, in a multi-threaded application it is probably
better to use a separate handle for each thread unless there is a
good reason to share one.
Are there any locking configuration
options?
JE 2.1.30 introduced two new performance-motivated
locking options.
No-locking mode is on one end of the spectrum.
When EnvironmentConfig.setLocking(false) is
specified, all locking is disabled, which relieves the application of
locking overhead. No-locking should be used with care. It's only valid
in a non-transactional environment and the application must ensure that
there is no concurrent activity on the database. Concurrent activity
while in no-locking mode can lead to database corruption. In addition,
log cleaning is disabled in no-locking mode, so the application is
responsible for managing log cleaning through explicit calls to the Environment.cleanLog()
method.
On the other end of the spectrum is the je.lock.nLockTables
property, which can specify the number of lock tables. While the
default is 1, increasing this number can improve multithreaded
concurrency. The value of this property should be a prime number, and
should ideally be the nearest prime that is not greater than the number
of concurrent threads.
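The "nearest prime that is not greater than the number of concurrent threads" rule can be computed with a few lines of code. This is a sketch; the class and method names are hypothetical, and falling back to 1 (the default) for fewer than two threads is an assumption.

```java
final class LockTableCount {
    /** Trial-division primality test; adequate for thread-count-sized inputs. */
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    /** Largest prime <= threads, or 1 (the default) when there is none. */
    static int suggestedLockTables(int threads) {
        for (int n = threads; n >= 2; n--) {
            if (isPrime(n)) {
                return n;
            }
        }
        return 1;
    }

    public static void main(String[] args) {
        int threads = 16; // example thread count
        System.out.println("je.lock.nLockTables=" + suggestedLockTables(threads));
    }
}
```

For 16 concurrent threads this suggests 13 lock tables. As with any tuning parameter, the suggested value is a starting point to be confirmed by measurement.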
How can I estimate my application's
optimal cache size?
The
com.sleepycat.je.util.DbCacheSize utility takes
record size and number of records for a given database and provides an
estimate of its in-memory footprint.
Even though DbCacheSize does not
yet support the use of duplicate records in a database, it is still
a useful aid for general estimates.
A good starting point is to invoke DbCacheSize with the parameters:
-records <count> # Total records (key/data pairs); required
-key <bytes> # Average key bytes per record; required
[-data <bytes>] # Average data bytes per record; if omitted no leaf
# node sizes are included in the output
Note that DbPrintLog -S gives the average record size under
Log statistics, in the LN (leaf node) row, at the avg bytes column.
To measure the cache size for a 64-bit JVM, DbCacheSize needs to be
run on the 64-bit JVM.
How can I tune JE's cache management
policy for my application's access pattern?
JE, like most databases, performs best when
database objects are found in its cache. The cache eviction algorithm
is the way in which JE decides to remove objects from the cache and can
be a useful policy to tune. The default cache eviction policy is LRU
(least recently used) based. Database objects that are accessed most
recently are kept within cache, while older database objects are
evicted when the cache is full. LRU suits applications where the
working set can stay in cache and/or some data records are
used more frequently than others.
An alternative cache eviction policy was added in
JE 2.0.83 that is instead primarily based on the level of the node in
the Btree. This level based algorithm can improve performance for some
applications with both of the following characteristics:
- Access by key is mostly random. In other words,
there are few or no "hot" sets of records that are accessed more than
others, and most record access is not in sequential key order.
- The memory size of the active record set is
significantly larger than the configured JE cache size, causing lots of
I/O as records are fetched from disk.
The alternative cache eviction policy is specified by adding two
configuration parameters to your je.properties file or
EnvironmentConfig object:
je.evictor.lruOnly=false
je.evictor.nodesPerScan=100
The level based algorithm works by evicting the lowest level nodes of
the Btree first, even if higher level nodes are less recently used. In
addition, dirty nodes are evicted after non-dirty nodes. This algorithm
can benefit random access applications because it keeps higher level
Btree nodes in the tree for as long as possible, which for a random
key, can increase the likelihood that the relevant Btree internal nodes
will be in the cache.
We recommend that you also change the
nodesPerScan property when you set lruOnly to false. This setting
controls the number of Btree nodes that are considered, or sampled,
each time a node is evicted. We have found in our tests that a setting
of 100 produces good results. The larger the nodesPerScan, the more
accurate the algorithm. However, don't set it too high. When
considering larger numbers of nodes for each eviction, the evictor may
delay the completion of a given database operation, which impacts the
response time of the application thread.
How do I begin tuning performance?
Gathering environment statistics is a useful
first step in JE performance tuning. Execute the following code
snippet periodically to display statistics for the past period and
to reset the statistics counters for the next display.
StatsConfig config = new StatsConfig();
config.setClear(true);
System.err.println(env.getStats(config));
The Javadoc for com.sleepycat.je.EnvironmentStats
describes each field. Cache behavior can have a major effect on
performance, and nCacheMiss is an indicator of how hot the cache is.
You may want to adjust the cache size, data access pattern, or cache
eviction policy and monitor nCacheMiss.
Applications which use transactions may want to
check nFSyncs to see how many of these costly system calls have been
issued. Experimenting with other flavors of commit durability, like
TxnWriteNoSync and TxnNoSync can improve performance.
nCleanerRuns and cleanerBacklog are indicators of
log cleaning activity. Adjusting the property je.cleaner.minUtilization
can increase or decrease log cleaning. The user may also elect to do
batch log cleaning, as described in the Javadoc for
Environment.cleanLog(), to control when log cleaning occurs.
High values for nRepeatFaultReads and
nRepeatIteratorReads may indicate non-optimal read buffer sizes. See
the FAQ entry on configuring read buffers.
In JE, what are the performance
tradeoffs when storing to more than one database?
A user posted a question about the pros and cons
of using multiple databases. The question was:
We are
designing a application where each system could handle at least 100
accounts. We currently need about 10 databases. We have three options
we are considering.
Option 1) Put all databases in the same environment. This would mean at
least 1000 databases in one environment. Does this cause a problem for
BDB JE? And how far is this scalable?
Option 2) Give each account their own environment. This would limit the
amount of databases per environment, but we would have 100+
environments on one system. Would 100+ environments consume to much of
the Java resources?
Option 3) One environment, 10 databases, and prefix all entry keys with
an account id. We prefer to avoid this option due to some of the
functionality we gain by having separate databases or environments.
With option 1/2 we can restore/move/rename one account easily. Don't
have to worry about keeping data separate since each accounts have
their own databases. However, if this provides the best performance we
would look at implementing it.
All three options are practical solutions using JE. Which
option is best depends on a number of trade-offs.
- Option 1 -- separate databases per account in a single
environment.
The data for each account is kept logically separate and
easy to manage. Databases can be efficiently renamed, truncated
and removed (see Environment.renameDatabase,
truncateDatabase and removeDatabase),
although this is not as efficient as directly managing the log
files, as with a separate environment (option 2). Copying a
database can be done with the DbDump and
DbLoad utilities, or with a custom utility.
With this option a single transaction can be used
for records in multiple accounts, since transactions may span
databases and a single environment is used for all accounts.
Secondary indices cannot span accounts, since
secondaries cannot span databases.
The cost of opening and closing accounts is mid-way between
option 2 and 3. Opening and closing databases is less
expensive than opening and closing environments.
The per-account overhead is lower than option 2, but higher
than option 3. The per-database disk overhead is about 3 to 5
KB. The per-database memory overhead is about the same but is
only incurred for open databases, so it can be minimized by
closing databases that are not in active use. Note that prior
to JE 3.3.62 this memory overhead was not reclaimed when a
database was closed. For this reason, if large numbers of
databases are used then option 1 is not recommended with
releases prior to JE 3.3.62.
The checkpoint overhead is higher than option 3, because the
number of databases is larger. How much this overhead matters
depends on the data access pattern and checkpoint frequency.
This tradeoff is described following option 3 below.
If database names are used to identify accounts, another
issue is that Database.getDatabaseName does a
linear search of the records in the mapping tree and is slow. A
workaround is to store the name in your own application, with
the reference to the Database.
- Option 2 -- a separate environment per account.
The data for each account is kept physically separate and
easy to manage, since a separate environment directory exists
for each account. Deleting and copying accounts can be
performed as file system operations.
With this option a single transaction cannot be
used for records in multiple accounts, since transactions may
not span environments. Secondary indices cannot span
accounts, since secondaries cannot span databases or
environments.
The cost of opening and closing accounts is highest, because
opening and closing databases is less expensive than opening
and closing environments. Be sure to close environments
cleanly to minimize recovery time.
This option has the highest overhead per account, because of
the per-environment memory and disk space overhead, as well as
the background threads for each environment. In JE 3.3.62 and
later, a shared cache may be used for all environments. With
this option it is important to configure a shared cache (see
EnvironmentConfig.setSharedCache) to avoid a
multiplying effect on the cache size of all open environments.
For this reason, this option is not recommended with releases
prior to JE 3.3.62.
The checkpoint overhead is higher than option 3, because the
number of environments and databases is larger. How much this
overhead matters depends on the data access pattern and
checkpoint frequency. This tradeoff is described following
option 3 below.
- Option 3 -- a single environment, with accounts stored
together in the same database(s) with an account key prefix.
The data for each account is kept logically separate using
the key prefix, but accounts must be managed using custom
utilities that take this prefix into account. The DPL (see
com.sleepycat.persist) may be useful for this
option, since it makes it easy to use key ranges that are based
on a key prefix.
With this option a single transaction can be used
for records in multiple accounts, since a single environment is
used for all accounts. Secondary indices can also
span accounts, since the same database(s) are used for all
accounts.
The cost of opening and closing accounts is lowest (close to
zero), since neither databases nor environments are opened or
closed.
Because the number of databases and environments is
smallest, the per-account memory and disk overhead is lowest
with this option. Key prefixing should normally be configured
to avoid redundant storage of the account key prefix (see
DatabaseConfig.setKeyPrefixing).
The checkpoint overhead is lowest with this option, because
the number of environments and databases is smallest. How much
this overhead matters depends on the data access pattern and
checkpoint frequency. This tradeoff is described below.
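For option 3, the account key prefix might be built as in the following sketch. The class name, the 4-byte big-endian account id, and the UTF-8 record key are illustrative assumptions. Keys are assumed to be ordered by unsigned byte-by-byte comparison, which is why the comparison is written out explicitly here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class AccountKeys {
    /** Key layout: 4-byte big-endian account id followed by the record key. */
    static byte[] key(int accountId, String recordKey) {
        byte[] rec = recordKey.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + rec.length).putInt(accountId).put(rec).array();
    }

    /** Unsigned lexicographic byte comparison. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /** True if the key carries the given account prefix. */
    static boolean belongsTo(int accountId, byte[] key) {
        return key.length >= 4 && ByteBuffer.wrap(key).getInt() == accountId;
    }
}
```

Because the fixed-width prefix sorts first, all records for one account are adjacent in key order, so a per-account utility can do a range scan that starts at the account's prefix and stops at the first key for which belongsTo returns false.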
It costs more to checkpoint the root of a database than other
portions. Whether this matters depends on how the application
accesses the database. For example, it costs marginally less to
insert 1000 records into 1 database than to insert 1 record into
1000 databases. However, it costs much less to checkpoint the
former rather than the latter. Suppose we update 1 record in 1000
databases. In a small test program, the checkpoint takes around 730
ms. Suppose we update 1000 records in 1 database. In the same
test, the checkpoint takes around 15 ms.
In addition, each environment has information that must be
checkpointed, which will make the total checkpoint overhead
somewhat larger in option 2, since each environment must be
checkpointed separately.
Is a larger cache always better for
JE?
In general, JE performs best when its working set
fits within cache. But due to the interaction of Java garbage
collection and JE, there can be scenarios when JE actually performs
better with a smaller cache.
JE caches items by keeping references to database
objects. To keep within the memory budget mandated by the cache size,
JE will release references to those objects and they will be garbage
collected by the JVM. Many JVMs use an approach called generational
garbage collection. Objects are categorized by age in order to apply
different collection heuristics. Garbage collecting items from the
younger space is cheaper and is done with a "partial GC" pass while
longer-lived items require a more expensive "Full GC".
If the application tends to access data records
that are rarely re-used, and the JE cache has
excessive capacity, the JE cache will become populated with data
records that are no longer needed by the application. These data
records will eventually age and the JVM will re-categorize them as
older objects, which then provokes more Full GC. If the JE cache is
smaller, JE itself will tend to dereference, or evict these once-used
records more frequently and the JVM will have younger objects to
garbage collect.
Garbage collection is really only an issue when
the application is CPU bound. To find this point of equilibrium, the
user can monitor EnvironmentStats.nCacheMiss and the application's
throughput. Reducing the cache to the smallest size where nCacheMiss
is 0 will show the optimal performance. Enabling GC statistics in the
JVM can help too. (In the Java SE 5 JVM this is enabled with
"-verbose:gc", "-XX:+PrintGCDetails", and "-XX:+PrintGCTimeStamps".)
What are JE read buffers and when
should I change their size?
JE follows two patterns when reading items from
disk. In one mode a single database object, which might be a Btree node
or a single data record, is faulted in because the application is
executing a Database or Cursor operation and cannot find the item in
cache. In a second mode, JE will read large sequential portions of the
log on behalf of activities like environment startup or log cleaning,
and will read in one or multiple objects.
Single object reads use temporary buffers of a
size specified by je.log.faultReadSize while sequential reads use
temporary buffers of a size specified by je.log.iteratorReadSize. The
defaults for these properties are listed in
<jeHome>/example.properties, and are currently 2K and 8K
respectively.
The ideal read buffer size is as small as
possible to reduce memory consumption but is also large enough to
adequately fit in most database objects. Because JE must fit the whole
database object into a buffer when doing a single object read, a
too-small read buffer for single object reads can result in wasted,
repeated read calls. When doing sequential reading, JE can piece
together parts of a database object, but a too-small read buffer for
sequential reads may result in excessive copying of data. The
nRepeatFaultReads and nRepeatIteratorReads fields in EnvironmentStats
show the number of wasted reads for single and sequential object
reading.
If nRepeatFaultReads is greater than 0, the
application may try increasing the value of je.log.faultReadSize. If
nRepeatIteratorReads is greater than 0, the application may want to
adjust je.log.iteratorReadSize and je.log.iteratorMaxSize.
What are JE write buffers and when
should I change their size?
JE log files are append only, and all record
insertions, deletions, and modifications are added to the end of the
current log file. See the FAQ on What is so different about JE log files?
for more information.
New data is buffered in write log buffers before
being flushed to disk. As each log buffer is filled, a write system
call is issued. As each .jdb file reaches its maximum size, an fsync
system call is issued and a new .jdb file is created.
Increasing the write log buffer size and the JE
log file size can improve write performance by decreasing the number of
write and fsync calls. However, write log buffer size has to be
balanced against the total JE memory budget, which is represented by
the je.maxMemory, or EnvironmentConfig.getCacheSize(). It may be more
productive to use available memory to cache database objects rather
than write log buffers. Likewise, increasing the JE log file size can
make it harder for the log cleaner to effectively compress the log.
The number and size of the write log buffers is
determined by je.log.bufferSize, je.log.numBuffers, and
je.log.totalBufferBytes. By default, there are 3 write log buffers and
they consume 7% of je.maxMemory. The nLogBuffers and bufferBytes fields
in EnvironmentStats will show what the current settings are.
An application can experiment with the impact of
changing the number and size of write log buffers. A non-transactional
system may benefit by reducing the number of buffers to 2. Any write
intensive application may benefit by increasing the log buffer sizes.
That's done by setting je.log.totalBufferBytes to the desired value and
setting je.log.bufferSize to the total buffer size/number of buffers.
Note that JE will restrict write buffers to half of je.maxMemory, so it
may be necessary to increase the cache size to grow the write buffers
to the desired degree.
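The sizing arithmetic described above can be sketched as a small calculator. The class and method names are hypothetical. The 7% default and the half-of-je.maxMemory cap are taken from the text above, and any rounding JE applies internally is not modeled.

```java
final class LogBufferSizing {
    /** Default total write-buffer bytes: 7% of je.maxMemory. */
    static long defaultTotal(long maxMemory) {
        return maxMemory * 7 / 100;
    }

    /** Requested total, limited to half of je.maxMemory. */
    static long cappedTotal(long requestedTotal, long maxMemory) {
        return Math.min(requestedTotal, maxMemory / 2);
    }

    /** je.log.bufferSize = total buffer bytes / number of buffers. */
    static long bufferSize(long totalBufferBytes, int numBuffers) {
        return totalBufferBytes / numBuffers;
    }

    public static void main(String[] args) {
        long maxMemory = 64L * 1024 * 1024; // example cache size
        long desired = 48L * 1024 * 1024;   // example desired buffer total
        int numBuffers = 3;
        long total = cappedTotal(desired, maxMemory);
        System.out.println("je.log.totalBufferBytes=" + total);
        System.out.println("je.log.numBuffers=" + numBuffers);
        System.out.println("je.log.bufferSize=" + bufferSize(total, numBuffers));
    }
}
```

In this example the desired 48 MB exceeds half of the 64 MB cache, so the total is capped at 32 MB. Reaching 48 MB of write buffers would require a cache of at least 96 MB.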
Why is my application performance
slower with transactions?
Many users see a large performance difference
when they enable or disable transactions in their application, without
doing any tuning or special configuration.
The performance difference is the result of the
durability (the D in ACID) of transactions. When transactions are
configured, the default configuration is full durability: at each
transaction commit, the transaction data is flushed to disk. This
guarantees that the data is recoverable in the event of an application
crash or an OS crash; however, it comes with a large performance
penalty because the data is written physically to disk.
If you need transactions (for atomicity, for
example, the A in ACID) but you don't need full durability, you can
relax the durability requirement. When using transactions there are
three durability options:
- commitSync—flush
all the way to disk. This provides durability in the face of an OS or
application crash.
- commitWriteNoSync—write
to the file system buffers but do not force a sync to disk. This
provides durability in the face of an application crash, but data may
be lost in the event of an OS crash.
- commitNoSync—do
not write or sync to disk. This provides no durability guarantees in
the event of a crash.
You can call these specific Transaction methods, or you can call commit
and change the default using an environment parameter.
Without transactions, JE provides the equivalent
of commitNoSync durability.
Note that the performance of commitSync can vary
widely by OS/disk/driver combination. Some systems are configured to
buffer writes rather than flush all the way to disk, even when the
application requests an fsync. If you need full durability guarantees,
you must use an OS/disk/driver configuration that supports this.
The different durability options raise the
question: How can changes be explicitly flushed to disk at a specific
point in time, in a non-transactional application or when commitNoSync
or commitWriteNoSync is used?
In a transactional application, you have three
options for forcing changes to disk:
- You can call Transaction.commitSync explicitly.
This causes the current transaction, as well as all other transactions
already committed at that point in time, to be flushed to disk.
- You can perform a checkpoint explicitly by
calling Environment.checkpoint. A checkpoint writes all internal Btree
nodes to disk and then flushes all outstanding changes to disk. At the
end of the checkpoint, all transactions previously committed will be
durable.
- You can rely on the fact that JE performs
checkpoints periodically in a background thread. Checkpoints are
performed once for every 20 MB of log written, by default. The
frequency of checkpoints can be configured using the
je.checkpointer.bytesInterval environment parameter—see the
example.properties file in the distribution.
In a non-transactional application, you have two options for forcing
changes to disk:
- You can call Environment.sync explicitly. This
provides the same durability guarantees as a checkpoint.
- As for a transactional application, you can
rely on JE background checkpoints as described above.
How can I improve performance of a cursor walk over a database?
Berkeley DB Java Edition (JE) appends records to the log, so they are
stored in the order they are written, that is in time or "temporal"
order. But if the records are written in non-sequential key order,
so that the "spatial" ordering differs from the "temporal" ordering,
then reading them in key order will not read them in log (disk) order. Reading
a disk in sequential order is faster than reading in random order.
When key order is not equal to disk order, and neither the operating
system's file system cache nor the JE cache is "hot", a database scan
will generally be slower because the disk head may have to move on
every read.
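The effect can be illustrated with a toy model. The sketch below is not JE code: it assumes a simulated append-only log in which each unique key occupies one slot, and the class and method names are hypothetical. It counts how many reads in a key-order scan do not land on the next log slot.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model (not JE code) of key order versus log order. */
final class LogOrderDemo {

    /**
     * Appends unique keys to a simulated log in the given write order, then
     * scans them in key order and counts the reads that are not sequential.
     */
    static int seeksForKeyOrderScan(List<Integer> writeOrder) {
        // key -> slot in the simulated append-only log
        Map<Integer, Integer> keyToLogPos = new TreeMap<>();
        int pos = 0;
        for (int key : writeOrder) {
            keyToLogPos.put(key, pos++);
        }
        int seeks = 0;
        int expected = 0; // slot a purely sequential read would hit next
        for (int logPos : keyToLogPos.values()) { // TreeMap iterates in key order
            if (logPos != expected) {
                seeks++;
            }
            expected = logPos + 1;
        }
        return seeks;
    }

    public static void main(String[] args) {
        // Written in key order: the scan is fully sequential.
        System.out.println(seeksForKeyOrderScan(Arrays.asList(1, 2, 3, 4, 5))); // prints 0
        // Written out of key order: every read needs a seek.
        System.out.println(seeksForKeyOrderScan(Arrays.asList(3, 1, 5, 2, 4))); // prints 5
    }
}
```

A real disk and the JE log are far more complicated than this, but the model shows why writing (or rewriting) records in key order helps a cursor scan.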
One way to improve read performance when key (spatial) order is not
the same as disk (temporal) order is to preload the database into the
JE cache using the Database.preload() method.
preload() is optimized to read records in disk order, not
key order. See the documentation at:
http://www.oracle.com/technology/documentation/berkeley-db/je/java/com/sleepycat/je/Database.html#preload(com.sleepycat.je.PreloadConfig)
The JE cache should be sized large enough to hold the pre-loaded data
or you may actually see a negative performance impact.
Another alternative is to change the application that writes the
records to write them in key order. If records are rewritten in key
order then a cursor scan will cause the records to be read in a
disk-sorted order. The disk head will be moved a minimum number of
times during the scan, and when it does, it will always move in the
same direction.
There are different ways to reorder the records. If the application
can be taken off-line, DbDump/DbLoad can be used to reorder the
records. See the documentation at:
http://www.oracle.com/technology/documentation/berkeley-db/je/java/com/sleepycat/je/util/DbDump.html
and
http://www.oracle.com/technology/documentation/berkeley-db/je/java/com/sleepycat/je/util/DbLoad.html
If the application cannot be taken off-line, the reordering can be
accomplished by reading keys via a Cursor in either of the following
ways:
- Use a Cursor to iterate over all records and call
Cursor.putCurrent() to re-write each one. When finished,
although the total log size will be doubled, the log cleaner will
eventually clean and delete the old (obsolete) records and hence some
of the log files. If the rewrite (using a Cursor) is done gradually
during the normal operation of the application, this gives the log
cleaner a chance to delete the old files, reducing the impact on disk
space consumption.
- Iterate with a Cursor and use
Database.put() to write
the records to a different environment. When the rewriting is
finished, switch to using the new environment, close the old
environment, and remove its entire directory (delete all its log
files).
Back
to top
What JVM parameters should I consider when tuning an application with a large cache size?
If your application has a large cache size, tuning the Java GC
may be necessary. You will almost certainly be using a 64-bit JVM (i.e.
-d64), the -server option, and setting your maximum and initial heap
sizes with -Xmx and -Xms. Be sure that you
don't set the cache size too close to the heap size, so that your application has
plenty of room for its data and to avoid excessive full GCs.
We have found that the Concurrent Mark Sweep GC is generally the best in this
environment since it yields more predictable GC results. This can be
enabled with -XX:+UseConcMarkSweepGC.
Best practice dictates that you disable System.gc() calls with
-XX:+DisableExplicitGC.
Other JVM options which may prove useful
are -XX:NewSize (start with 512m or 1024m as a value),
-XX:MaxNewSize (try 1024m as a value), and
-XX:CMSInitiatingOccupancyFraction=55. NewSize is typically
tuned in relationship to the overall heap size so if you specify this parameter
you will also need to provide a -Xmx value. A convenient way
of specifying this in relative terms is to use -XX:NewRatio.
The values we've suggested are only starting points. The actual values
will vary depending on the runtime characteristics of the application.
You may also want to refer to the following articles:
Back
to top
Earlier versions of the Java
collections API required that iterators be explicitly closed. How can
they be used with other components that do not close iterators?
As of Berkeley DB Java Edition 3.0, the
com.sleepycat.collections package is fully compatible with the Java
Collections framework. In previous releases, Collection.size() was not
supported, and collection iterators had to be explicitly closed. These
incompatibilities have been addressed to provide full interoperability
with other Java libraries that use the Java Collections Framework
interfaces.
In earlier versions, the user had to consider the
context any time a StoredCollection is passed to a component that will
call its iterator() method. If that component is not aware of the
StoredIterator class, naturally it will not call the
StoredIterator.close() method. This will cause a leak of unclosed
Cursors, which can cause performance problems. For more on this topic,
see the FAQ entry "In earlier versions of the Java collections API, why did
iterators need to be explicitly closed by the caller?"
If the component cannot be modified to call
StoredIterator.close(), the only solution is to copy the elements from
the StoredCollection to a standard Java collection, and pass the
resulting copy to the component. The simplest solution is to call the
StoredCollection.toList() method to copy the StoredCollection to a
standard Java ArrayList.
If the StoredCollection is large and only some of
the elements need to be passed to the component, it may be undesirable
to call the toList() method on the entire collection since that will
copy all elements from the database into memory. There are two ways to
create a standard Java collection containing only the needed elements:
- A submap or subset of the StoredCollection can
be created, and then the toList() method can be called on that submap
or subset. The standard Java SortedMap and SortedSet methods may be
used for creating a submap or a subset. The resulting collection will
be a StoredCollection, and its toList() method can then be called.
- The elements of the StoredCollection can be
explicitly iterated, selected and copied to an ArrayList or another
standard Java collection. There are many ways to implement such
application specific filtering using the collections API.
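The second option can be sketched with plain java.util types. This is a minimal illustration, not JE code: any java.util.Collection stands in for the StoredCollection, and the class name FilteredCopy and the predicate are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Copies only the wanted elements of a collection into a standard ArrayList. */
final class FilteredCopy {

    static <E> List<E> copyMatching(Collection<E> source, Predicate<? super E> wanted) {
        List<E> copy = new ArrayList<>();
        // With a pre-3.0 StoredCollection, this loop would be wrapped in
        // try/finally so that StoredIterator.close(i) is always called.
        Iterator<E> i = source.iterator();
        while (i.hasNext()) {
            E element = i.next();
            if (wanted.test(element)) {
                copy.add(element);
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        // A plain list stands in for the stored collection here.
        List<Integer> evens = copyMatching(Arrays.asList(1, 2, 3, 4, 5, 6), n -> n % 2 == 0);
        System.out.println(evens); // prints [2, 4, 6]
    }
}
```

The resulting ArrayList holds no cursors or locks, so it can be passed safely to a component that never closes its iterators.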
Back
to top
How do I access duplicates using the
Java collections API?
Let's say you have two databases: a primary
database with the key {ID, attribute} and a secondary database with the
key {ID}. How can you access all the attributes for a given ID? In
other words, how can you access all the duplicate records in the
secondary database for a given key?
We must admit at this point that the example
given is partially a trick question. In this particular case you can
get the attributes for a given ID without the overhead of creating and
maintaining a secondary database! If you call
SortedMap.subMap(fromKey,toKey) for a given ID in the primary database
the resulting map will contain only the attributes you're interested
in. For the fromKey pass the ID you're interested in and a zero (or
lowest valued) attribute. For the toKey pass an ID one greater than the
ID you're interested in, again with the lowest valued attribute; the toKey
is exclusive, so no records for the next ID are included. Also see the
extension method StoredSortedMap.subMap(Object,boolean,Object,boolean)
if you would like more control over the subMap operation.
Note that this technique works only if the {ID,
attribute} key is ordered. Because the ID is the first field in the
key, the map is sorted primarily by ID and secondarily by attribute
within ID. This type of sorting works with tuple keys, but not with
serial keys. In general serial keys do not provide a deterministic sort
order. To use a tuple key, use a TupleBinding or another tuple binding
class in the com.sleepycat.bind.tuple package.
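The subMap technique can be sketched with a java.util.TreeMap standing in for the stored map. This is an illustration under stated assumptions, not JE code: the Key class, its integer fields and the comparator are hypothetical, and the comparator plays the role that tuple key ordering plays in JE.

```java
import java.util.Comparator;
import java.util.SortedMap;
import java.util.TreeMap;

/** Range lookup of all attributes for one ID in a map keyed by {ID, attribute}. */
final class SubMapByIdDemo {

    /** Hypothetical composite key, sorted by id and then by attribute. */
    static final class Key {
        final int id;
        final int attribute;

        Key(int id, int attribute) {
            this.id = id;
            this.attribute = attribute;
        }
    }

    static final Comparator<Key> ORDER =
        Comparator.<Key>comparingInt(k -> k.id).thenComparingInt(k -> k.attribute);

    /** Returns a view of the entries for the given id (id must be below Integer.MAX_VALUE). */
    static SortedMap<Key, String> attributesFor(SortedMap<Key, String> map, int id) {
        Key fromKey = new Key(id, Integer.MIN_VALUE);   // inclusive: lowest attribute of this id
        Key toKey = new Key(id + 1, Integer.MIN_VALUE); // exclusive: lowest attribute of the next id
        return map.subMap(fromKey, toKey);
    }

    public static void main(String[] args) {
        TreeMap<Key, String> map = new TreeMap<>(ORDER);
        map.put(new Key(1, 7), "a");
        map.put(new Key(2, 0), "b");
        map.put(new Key(2, 9), "c");
        map.put(new Key(3, 0), "d");
        System.out.println(attributesFor(map, 2).values()); // prints [b, c]
    }
}
```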
But getting back to the general question of how
to access duplicates, if you have a database with duplicates you can
access the duplicates in a number of ways using the collections API: by
creating a subMap or subSet for the key in question, as described
above, or by using the extension method StoredMap.duplicates(). This is
described here:
http://www.oracle.com/technology/documentation/berkeley-db/je/java/com/sleepycat/collections/StoredMap.html#duplicates(java.lang.Object)
http://www.oracle.com/technology/documentation/berkeley-db/je/collections/tutorial/retrievingbyindexkey.html
http://www.oracle.com/technology/documentation/berkeley-db/je/collections/tutorial/UsingStoredCollections.html
Back
to top
In earlier versions of the Java collections API, why did iterators need to be explicitly closed by the caller?
As of Berkeley DB Java Edition 3.0, the
com.sleepycat.collections package is fully compatible with the Java
Collections framework. In previous releases, Collection.size() was not
supported, and collection iterators had to be explicitly closed. These
incompatibilities have been addressed to provide full interoperability
with other Java libraries that use the Java Collections Framework
interfaces.
Using earlier releases, if you obtain an Iterator
from a StoredCollection, it is always a
StoredIterator and must be explicitly closed. Closing the iterator is
necessary to release the locks held by the underlying cursor. To avoid
performance problems, it is important to close the iterator as soon as it is
no longer needed.
Since the Java Iterator interface has no close()
method, the close() method on the StoredIterator
class must be used. Alternatively, to avoid casting the Iterator
to a StoredIterator, you can call the StoredIterator.close(Iterator)
static method; this method will do nothing if the argument given is not
a StoredIterator.
To ensure that an Iterator is always closed, even
if an exception is thrown, use a finally clause. For example:
Iterator i = myStoredCollection.iterator();
try {
    while (i.hasNext()) {
        Object o = i.next();
        // do some work
    }
} finally {
    StoredIterator.close(i);
}
Back
to top
How can I check the disk space
utilization of my log files?
This FAQ
explained the basics of JE log files and the concept of obsolete data. The JE
package includes a utility that can be used to measure the utilization level of
your log files. DbSpace gives you information about how packed
your log files are. DbSpace can be used this way:
$ java -jar je.jar DbSpace
usage: java { com.sleepycat.je.util.DbSpace | -jar je.jar DbSpace }
  -h <dir>  # environment home directory
  [-q]      # quiet, print grand totals only
  [-u]      # sort by utilization
  [-d]      # dump file summary details
  [-V]      # print JE version number
For example, you might see this output:
$ java -jar je.jar DbSpace -h <environment directory> -q
  File    Size (KB)  % Used
--------  ---------  ------
 TOTALS       18167      60
which says that in this 18 MB environment, 60% of the disk space used
is taken by active JE log entries, and 40% is obsolete (not utilized).
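As a rough worked example, the TOTALS line can be converted into approximate sizes. The sketch below is only arithmetic on the two printed numbers; the class and method names are hypothetical, and because DbSpace prints a rounded percentage the results are estimates.

```java
/** Back-of-the-envelope arithmetic on a DbSpace TOTALS line. */
final class UtilizationMath {

    /** Approximate KB occupied by active log entries, rounded to the nearest KB. */
    static long activeKb(long totalKb, int percentUsed) {
        return (totalKb * percentUsed + 50) / 100;
    }

    /** Approximate KB occupied by obsolete data that the cleaner may reclaim. */
    static long obsoleteKb(long totalKb, int percentUsed) {
        return totalKb - activeKb(totalKb, percentUsed);
    }

    public static void main(String[] args) {
        System.out.println(activeKb(18167, 60));   // prints 10900
        System.out.println(obsoleteKb(18167, 60)); // prints 7267
    }
}
```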
Back
to top
How can I find the location of
checkpoints within the log files?
Checkpoints serve to limit the time it takes to
re-open a JE environment, and also
enable log cleaning, as described in this FAQ. By default, JE executes
checkpoints transparently to
the application, but it can be useful when troubleshooting to find the
location of the checkpoints. For example, lack of checkpointing could
explain a lack of log file cleaning.
The following unadvertised option to the
DbPrintLog utility provides a summary of
checkpoint locations, at the end of the utility's output. For example,
this command:
java com.sleepycat.je.util.DbPrintLog -h <environment directory> -S
will result in output of this type:
<..snip..>
Per checkpoint interval info:
 lnTxn  ln  mapLNTxn  mapLN     end-end   end-start  start-end  maxLNReplay  ckptEnd
<..snip..>
16,911   0         0      2   2,155,691   2,152,649      3,042       16,911  0x2/0x87ed74
83,089   0         0      2  10,586,792  10,581,048      5,744       83,090  0x3/0x90e19c
     0   0         0      0           0           0          0            0  0x3/0x90e19c
The last column indicates the location of the last checkpoint.
The first number is the .jdb file
and the second number is the offset in the file. In this case, the last
checkpoint is in file 00000003.jdb
at offset 0x90e19c. The log cleaner cannot reclaim space in any .jdb
files that follow that location.
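Reading the ckptEnd column can be sketched as a small parser. This is a hypothetical helper, not part of JE: it assumes the column always has the form 0x<file>/0x<offset> and that log files are named with eight hexadecimal digits, as the 00000003.jdb example above suggests.

```java
/** Splits a ckptEnd value such as 0x3/0x90e19c into file number and offset. */
final class CkptLocation {
    final long fileNumber;
    final long offset;

    private CkptLocation(long fileNumber, long offset) {
        this.fileNumber = fileNumber;
        this.offset = offset;
    }

    static CkptLocation parse(String ckptEnd) {
        String[] parts = ckptEnd.trim().split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected <file>/<offset>: " + ckptEnd);
        }
        // Long.decode understands the 0x prefix used in the DbPrintLog output.
        return new CkptLocation(Long.decode(parts[0]), Long.decode(parts[1]));
    }

    /** Assumed naming scheme: eight hex digits followed by .jdb. */
    String fileName() {
        return String.format("%08x.jdb", fileNumber);
    }

    public static void main(String[] args) {
        CkptLocation last = parse("0x3/0x90e19c");
        System.out.println(last.fileName()); // prints 00000003.jdb
        System.out.println("offset 0x" + Long.toHexString(last.offset)); // prints offset 0x90e19c
    }
}
```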
Note that DbPrintLog is simply looking at the CkptStart and
CkptEnd entries in the log and attempting to derive the checkpoint intervals
from those entries. CkptStart can be missing because it was part of a file
that was cleaned and deleted. CkptEnd can be missing for the same reason, or
because the checkpoint never finished because of an abnormal shutdown.
Note that DbPrintLog -S can consume significant
resources. If desired, the -s option can be used to restrict the amount
of log analyzed by the utility.
Back
to top
What is Carbonado and when should
I use it instead of the Direct Persistence Layer (DPL)?
Carbonado is an open source Java persistence framework. It is published by
Amazon on SourceForge:
http://carbonado.sourceforge.net/
Carbonado allows using Berkeley DB C Edition, Berkeley DB Java Edition, or
an SQL database as an underlying repository. This is extremely useful when an
abstraction that supports an SQL-based backend is a requirement, or when there
is a need to synchronize data between Berkeley DB and SQL databases.
Because it supports SQL databases and the relational model, Carbonado
provides a different set of features than the DPL. The following
feature set comparison may be useful in deciding which API to use.
Feature | Carbonado | Direct Persistence Layer (DPL) |
SQL and relational database support | Supports the relational model. The underlying repository may be an SQL database accessed via JDBC, or a Berkeley DB repository, both accessed using the same Carbonado API. Queries are performed by a simple query language that is not SQL but follows the relational model. | Does not support SQL databases or provide a query language. The relational model can be used, but is not enforced by the API. Queries are performed by directly accessing primary and secondary indexes. |
Synchronization of two repositories | Supported for any two repositories, including Berkeley DB and SQL repositories. This allows using Berkeley DB as a front-end cache for an SQL database, for example. | Not supported. |
CRUD operations | CRUD operations are performed using the ActiveRecord design pattern. CRUD methods are on the persistent object. | CRUD operations are performed by directly accessing primary and secondary indexes as in the Data Access Object design pattern, not by methods on the persistent objects. |
Relationships | Supports 1:1, 1:many, many:1, many:many. Relationships are traversed using methods on the persistent object. For X:1 relationships the method fetches the object. For X:many relationships the method returns a query that can be executed or composed with other queries. | Supports 1:1, 1:many, many:1, many:many. Relationships are traversed by directly accessing primary and secondary indexes. |
Transactions | Supports transactional use and all four ANSI isolation levels. Transactions are implicit per thread. | Supports transactional or non-transactional use and all four ANSI isolation levels. Transactions are explicitly passed as parameters and M:N transactions:threads are allowed. |
Persistent object model | JavaBean properties are persistent. Interfaces or abstract classes are defined to represent relational entities. The Storable interface must be extended by the interface or implemented by the abstract class. Instances are created using a Storage factory class. To conform to the relational model, nested objects are not supported and object graphs are not preserved. | Provides POJO persistence. All non-transient instance fields are persistent. Implementing interfaces or extending a base class is not required. Instances are created using ordinary constructors. Arbitrarily nested objects and arrays are supported and object graphs are preserved. |
Metadata description | Annotations are used to describe keys and other metadata. | Annotations are used to describe keys and other metadata. Alternatively, metadata may be supplied externally to avoid the use of annotations. |
Class evolution | Adding and dropping fields and indexes is supported. | Adding, deleting, renaming, widening and converting fields and classes is supported, as is adding and dropping indexes. |
Optimistic locking | Supported via a Version property. | Not supported. |
Triggers | Supported for all write (create, update, delete) operations. | Not supported. |
Large objects | Supported but implemented for Berkeley DB using partial get/put operations, which do not provide the same level of performance as true LOB support. | Not supported. |
Sequences | Supported. | Supported. |
Support | Provided by the Carbonado open source community. | Provided by Oracle and the Berkeley DB open source community. |
Back to top
How do I prevent "phantoms" when
not using transactions?
Phantoms are records that can appear in the course of performing operations
in one thread when records are inserted by other threads. For example, if you
perform a key lookup and the record is not found, and then you later perform a
lookup with the same key and the record is found, then the record was inserted
by another thread and is called a phantom.
Phantoms and how to prevent them in transactional applications are described
in Writing Transactional Applications under Configuring
Serializable Isolation.
However, you may wish to prevent phantoms but you cannot use transactions.
For example, if you are using Deferred Write, then you cannot use transactions.
For phantoms that appear after a search by key, another technique for
preventing them is to use a loop that tries to insert with putNoOverwrite, and
if the insert fails then does a search by key.
Here is a code sketch using the base API:
Cursor cursor = ...;
DatabaseEntry key = ...;
DatabaseEntry insertData = ...;
DatabaseEntry foundData = ...;
boolean exists = false;
boolean done = false;
while (!done) {
    OperationStatus status = cursor.putNoOverwrite(key, insertData);
    if (status == OperationStatus.SUCCESS) {
        /* A new record is inserted */
        exists = false;
        done = true;
    } else {
        status = cursor.getSearchKey(key, foundData, LockMode.RMW);
        if (status == OperationStatus.SUCCESS) {
            /* An existing record is found */
            exists = true;
            done = true;
        }
        /* else continue loop */
    }
}
If the putNoOverwrite succeeds, the cursor holds the write lock on the
inserted record and no other thread can change that record. If the
putNoOverwrite fails, then the record must exist so we search by key to lock
it. If the search succeeds, then the cursor holds the write lock on the
existing record and no other thread can modify that record. If the search
fails, then another thread must have deleted the record and we loop again.
This technique relies on a property of JE cursors called "cursor stability".
When a cursor is positioned on a record, the cursor maintains its position
regardless of the actions of other threads. The record at the cursor position
is locked and no other thread may modify it. This is true whether transactions
are used or not, and when deferred write is used.
With this technique it is necessary to use a cursor in order to hold the
lock. Database.get, Database.putNoOverwrite and other Database methods do not
hold a lock when used without an explicit transaction. The same is true of
the corresponding DPL methods: EntityIndex.get, PrimaryIndex.put, etc.
Using this technique is recommended rather than using your own locking (with
synchronized or java.util.concurrent). Custom locking is error prone and almost
always unnecessary.
As an aside, JE cursor stability is slightly different when "dirty read"
(LockMode.READ_UNCOMMITTED) is used. In this case, the cursor will remain
positioned on the record, but another thread may change the record or even
delete it.
Back to top
How do I dump and load the data in
an EntityStore?
The DbDump and DbLoad utilities can be used to dump and load all Databases
in an EntityStore. They can be used to copy, back up or restore the data in
a store when you don't wish to copy the entire set of log files in the
environment.
The Database Names section of the EntityStore
documentation describes how to determine which database names to dump or
load.
In addition, it is useful to understand the process for dumping and loading
databases in general. Two points are important to keep in mind:
- When loading you will need to specify the names of the databases, so you
should stash this information somewhere when you dump the databases.
- Before loading the databases, you should remove all databases for the store
from the target environment or start with an empty environment.
Here is an example of the steps to dump a store from one environment and
load the dumped output in a second environment.
1. In the first environment, get a list of the database names for the store.
This can be done with the DbDump -l command line option or using
the Environment.getDatabaseNames method. Write this list of
database names to a file. See the Database Names section mentioned above for
information on how to identify all databases for a store.
2. For each database name in the list from step 1, dump the database,
appending the output to a dump file using the DbDump command line
program or the DbDump class.
3. In the second environment, if you are not starting with an empty
environment or an environment for which the target store has never been opened,
you must first delete all existing databases for the target store:
a) Open the Environment but be sure not to open the target EntityStore.
b) Get a list of database names for the target store as described in step
1 above.
c) Call Environment.removeDatabase for each existing database.
4. For each database name in the list from step 1, load the database from the
dump file using the DbLoad command line program or the
DbLoad class. Do not open the target EntityStore until all
databases are loaded.
5. Now the Environment and target EntityStore can be opened and used to access
the loaded data.
Back to top
How can I perform wildcard queries or non-key queries?
Berkeley DB does not have a query language. It has API methods for
performing queries that can be implemented as lookups in a primary or secondary
index. Wildcard queries and non-key queries must be performed by scanning an entire
index and examining each key and/or value.
In an SQL database or another database product with a query language, a full
index scan is executed when you perform a wildcard query or a non-key query. In
Berkeley DB you write a loop that scans the index. While you have to write the
loop, you'll see better performance than in an SQL database because there is no
SQL processing involved.
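The shape of such a scan loop can be sketched with a java.util.SortedMap standing in for the index. This is an illustration only, not JE code: in a real application the loop would walk a Cursor or a stored collection, and the class name and the regular expression are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Full index scan that keeps the keys whose value matches a pattern. */
final class IndexScanDemo {

    static List<String> keysWithValueMatching(SortedMap<String, String> index, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        // Every entry is examined; there is no index on the value to narrow the search.
        for (Map.Entry<String, String> entry : index.entrySet()) {
            if (pattern.matcher(entry.getValue()).find()) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedMap<String, String> index = new TreeMap<>();
        index.put("k1", "alice@example.com");
        index.put("k2", "bob@example.org");
        index.put("k3", "carol@example.com");
        System.out.println(keysWithValueMatching(index, "@example\\.com$")); // prints [k1, k3]
    }
}
```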
Berkeley DB supports simple key lookups as well as prefix or range queries.
Range queries allow you to search for all keys in a range of keys or for all
keys starting with a given prefix value. For more information on range queries
in the Base API, see:
For more information on range queries in the DPL API, see:
Back to top
How can I join two or more primary
databases or indexes?
Berkeley DB has direct support only for intersection (AND) joins across the
secondary keys of a single primary database. You cannot join more than one
primary database.
If you are using the DPL, the same is true: you can only join across the
secondary keys of a single primary index.
For example, imagine a primary database (index) called Person that has two
secondary keys: birthdate and favorite color. Using the Join API, you can find
all Person records that have a given birthdate AND a given favorite color.
When using the Base API, see the Database.join
method. When using the DPL, see the EntityJoin
class.
To perform a join across more than one primary database (index), you must write a
loop that iterates over the records of one database (index) and does a lookup
in one or more additional databases (indexes).
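A minimal sketch of such a loop follows, with two java.util maps standing in for the two primary databases. It is not JE code: the employee and department data, the class name and the output format are all hypothetical, and a real application would iterate with a Cursor and look up with Database.get or the corresponding DPL index methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hand-written join: iterate one "database" and look up each row in another. */
final class LoopJoinDemo {

    /**
     * @param employeeToDept employee name -> department key (first database)
     * @param deptToLocation department key -> location (second database)
     * @return one "employee:location" string per employee whose department exists
     */
    static List<String> join(Map<String, String> employeeToDept,
                             Map<String, String> deptToLocation) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> employee : employeeToDept.entrySet()) {
            // Lookup in the second database by the key stored in the first.
            String location = deptToLocation.get(employee.getValue());
            if (location != null) {
                result.add(employee.getKey() + ":" + location);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> employees = new TreeMap<>();
        employees.put("ann", "eng");
        employees.put("bob", "ops");
        employees.put("cy", "hr"); // no matching department record
        Map<String, String> departments = new TreeMap<>();
        departments.put("eng", "Berlin");
        departments.put("ops", "Austin");
        System.out.println(join(employees, departments)); // prints [ann:Berlin, bob:Austin]
    }
}
```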
In an SQL database or another database product with a query language,
similar iterative processing takes place when the join is executed. In Berkeley
DB while you must write the code that iteratively performs the join, you'll see
better performance than in an SQL database because there is no SQL processing
involved.
Back to top
What is the complete definition of
object persistence?
Persistence is defined in the documentation for the Entity
annotation and the following topics are discussed:
- Entity Subclasses and Superclasses
- Persistent Fields and Types
- Simple Types
- Complex and Proxy Types
- Other Type Restrictions
- Object Graphs
Back to top
How do I define primary keys,
secondary keys and composite keys?
Primary keys and sequences are discussed in the documentation for the PrimaryKey
annotation and the following general topics about keys are also discussed:
- Key Field Types
- Key Sort Order
For information about secondary keys and composite keys, see the SecondaryKey
annotation and the KeyField
annotation.
Back to top
How do I store and query data in
an EntityStore?
Entities are stored by primary key in a PrimaryIndex; how to store
entities is described in the PrimaryIndex class documentation.
Entities may also be queried and deleted by secondary key using a
SecondaryIndex; this is described in the SecondaryIndex
class documentation. The SecondaryIndex
documentation also discusses the four mappings provided by primary and
secondary indexes:
- The mapping from primary key to entity
- The mapping from secondary key to entity
- The mapping from secondary key to primary key
- The mapping from primary key to entity for the subset of entities
having a given secondary key
Both PrimaryIndex and SecondaryIndex implement the EntityIndex interface.
The EntityIndex
documentation has more information about the four mappings along with example
data. It also discusses the following general data access topics:
- Accessing the Index
- Deleting from the Index
- Transactions
- Transactions and Cursors
- Locking and Lock Modes
- Low Level Access
Back to top
How are relationships defined and
accessed?
Relationships in the DPL are defined using secondary keys. A secondary key
provides an alternate way to look up entities, and also defines a relationship
between the entity and the key. For example, a Person entity with a secondary
key field consisting of a set of email addresses defines a One-to-Many
relationship between the Person entity and the email addresses. The Person
entity has multiple email addresses and can be accessed by any of them.
Optionally a secondary key can define a relationship with another entity.
For example, an Employee entity may have a secondary key called departmentName
that is the primary key of a Department entity. This defines a Many-to-One
relationship between the Employee and the Department entities. In this case we
call the departmentName a foreign key and foreign key constraints are
used to ensure that the departmentName key is valid.
Both simple key relationships and relationships between entities are defined
using the SecondaryKey
annotation. This annotation has properties for defining the type of
relationship, the related entity (if any), and what action to take when the
related entity is deleted.
For more information about defining relationships and to understand how
relationships are accessed, see the SecondaryIndex
documentation which includes the following topics:
- One-to-One Relationships
- Many-to-One Relationships
- One-to-Many Relationships
- Many-to-Many Relationships
- Foreign Key Constraints for Related Entities
- One-to-Many versus Many-to-One for Related Entities
- Key Placement with Many-to-Many for Related Entities
- Many-to-Many Versus a Relationship Entity
Back to top
What is the difference between
embedding a persistent object and a relationship with another entity
object?
There are two ways that an entity object may refer to another object. In
the first approach, called embedding, the referenced object is simply defined
as a field of the entity class. The only requirement is that the class of the
embedded object be @Persistent. For example:
@Entity
class Person {
    @PrimaryKey
    long id;
    String name;
    Address address;
    private Person() {}
}
@Persistent
class Address {
    String street;
    String city;
    String state;
    String country;
    int postalCode;
    private Address() {}
}
The embedded object is stored in the same record as the entity, in this case
in the Person PrimaryIndex. There is no way to access the Address object
except by looking up the Person object in its PrimaryIndex and examining the
Person.address field. The Address object cannot be accessed independently. If
the same Address object is stored in more than one Person entity, a separate
copy of the address will be stored in each Person record.
You may wish to access the address independently or to share the address
information among multiple Person objects. To do that you must define the
Address class as an @Entity and define a relationship between the Person and
the Address entities using a secondary key. For example:
@Entity
class Person {
    @PrimaryKey
    long id;
    String name;
    @SecondaryKey(relate=MANY_TO_ONE, relatedEntity=Address.class)
    long addressId;
    private Person() {}
}
@Entity
class Address {
    @PrimaryKey
    long id;
    String street;
    String city;
    String state;
    String country;
    int postalCode;
    private Address() {}
}
With this second approach, the Address is an entity with a primary key. It
is stored separately from the Person entity and can be accessed independently.
After getting a Person entity, you must look up the Address by the
Person.addressId value in the Address PrimaryIndex. If more than one Person
entity has the same addressId value, the referenced Address is effectively
shared.
Back to top
Why must all persistent classes
(superclasses, subclasses and embedded classes) be annotated?
The requirement to make all classes persistent (subclasses and superclasses
as well as embedded classes) is documented under "Complex and Proxy Types" in
the documentation for the Entity
annotation.
There are a couple of reasons for requiring that all persistent classes be
annotated with @Persistent when using the annotation entity model (the default
entity model).
First, bytecode enhancement of persistent classes, to optimize marshaling and
un-marshaling, is performed based on the presence of the annotation when the
class is loaded. In order for this to work efficiently and correctly, all
classes must be annotated.
Second, the annotation is used for version numbering when classes are
evolved. For example, if a superclass's instance fields change and that source
code is not under your control, you will have difficulty in controlling evolution
of its instance fields. It is for this reason that a PersistentProxy is
required when using external classes that cannot be annotated or
controlled.
If the source code for a superclass or embedded class is not under your
control, you should consider copying the data into your persistent class, or
referencing it and using a PersistentProxy to translate the external class to a
persistent class that is under your control.
If using JE annotations is undesirable then you may wish to subclass EntityModel
and implement a different source of metadata. While annotations are very
unobtrusive compared to other techniques such as implementing interfaces or
subclassing from a common base class, we understand that they may be
undesirable. Note, however, that if you don't use annotations then you won't
get the performance benefits of bytecode enhancement.
Back to top
Why doesn't the DPL use the
standard annotations defined by the Java Persistence API?
The EJB3-oriented Java Persistence API is SQL-oriented, and could not be
fully implemented with Berkeley DB except by adding SQL or some other query
language first. There was a public
discussion on TheServerSide of this issue during the design phase for the
DPL.
Note that it is not just the annotation syntax that is different between DPL
and the EJB3 Java Persistence API:
- The APIs themselves are different.
- The DPL, like other Berkeley DB APIs, accesses objects by value, while the
EJB3 Java Persistence API treats objects by reference and maintains a
persistence context. This is discussed in TheServerSide thread above.
In general, Berkeley DB's advantages are that it provides better performance
and a simpler usage model than is possible with an SQL database. Implementing
the EJB3 Java Persistence API for Berkeley DB would negate both of these
advantages to some degree.
How should I set directory permissions on the JE environment directory?
If you want to read and write to the JE environment, grant the JE process read/write permission on the environment directory, the je.lck file, and the *.jdb files.
If you want read-only access to JE, then you should either:
- Set r/o access on the JE environment directory and open the Environment
for read-only, or
- Set r/w access on the JE environment directory, allow creation and writing
to the <envdir>/je.lck file, and open the Environment in
read-only mode.
If JE finds that the JE environment directory is writable, it will attempt to
write to the je.lck file. If it finds that the JE environment directory is
not writable, it will verify that the Environment is opened for
read-only.
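As a rough pre-flight check, an application can inspect the directory itself before deciding how to open the Environment. The sketch below uses only the standard java.nio.file API. The class name EnvDirCheck and the three result strings are assumptions made for illustration. It mirrors the rule described above but is not JE's own logic, and it does not inspect je.lck or the *.jdb files individually.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class EnvDirCheck {

    /**
     * Classifies a candidate environment directory as seen by the current
     * process: "READ_WRITE", "READ_ONLY", or "NONE" (missing or unreadable).
     */
    static String accessMode(Path envDir) {
        if (!Files.isDirectory(envDir) || !Files.isReadable(envDir)) {
            return "NONE";
        }
        // A writable directory means JE will try to write je.lck there;
        // otherwise the Environment has to be opened read-only.
        return Files.isWritable(envDir) ? "READ_WRITE" : "READ_ONLY";
    }

    public static void main(String[] args) {
        Path envDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(envDir.toAbsolutePath() + ": " + accessMode(envDir));
    }
}
```

If the result is "READ_ONLY", the Environment would need to be opened in read-only mode. If it is "NONE", the directory permissions need to be fixed first.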
Can I perform an efficient "keys only" query?
As of JE 4.0, key-only queries may be performed, and I/O is significantly
reduced if ReadUncommitted isolation is configured. Because JE stores data records
separately from keys, the I/O to read the data record is avoided when the data
record is not already in cache. Note that if other isolation levels are used,
the I/O cannot be avoided, because the data record must be read in order to
lock the record.
To perform a ReadUncommitted key-only query using the Base API, use any
Database or Cursor method to perform the query and specify the following:
- Specify ReadUncommitted isolation using LockMode.READ_UNCOMMITTED or
CursorConfig.READ_UNCOMMITTED.
- Call DatabaseEntry.setPartial(0, 0, true) on the data DatabaseEntry, so that
JE will not fetch and return the record data.
To perform a ReadUncommitted key-only query using the DPL, use any
EntityIndex to perform the query and specify the following:
- Specify ReadUncommitted isolation using LockMode.READ_UNCOMMITTED or
CursorConfig.READ_UNCOMMITTED.
- Call EntityIndex.keys to obtain a key-only cursor, so that JE will not fetch
and return the record data.
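The two recipes above might look like the following in code. This is a minimal sketch that simply counts keys. The class and method names (KeyOnlyQueries, countKeysBaseApi, countKeysDpl) are invented for illustration, the Database and PrimaryIndex are assumed to be already open, and je.jar is required on the classpath.

```java
import com.sleepycat.je.Cursor;
import com.sleepycat.je.CursorConfig;
import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.DatabaseException;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.persist.EntityCursor;
import com.sleepycat.persist.PrimaryIndex;

final class KeyOnlyQueries {

    /** Base API: walks every key without fetching the record data. */
    static long countKeysBaseApi(Database db) throws DatabaseException {
        DatabaseEntry key = new DatabaseEntry();
        DatabaseEntry data = new DatabaseEntry();
        // Zero-length partial entry: tells JE not to fetch or return the data.
        data.setPartial(0, 0, true);

        long count = 0;
        Cursor cursor = db.openCursor(null, CursorConfig.READ_UNCOMMITTED);
        try {
            while (cursor.getNext(key, data, LockMode.READ_UNCOMMITTED)
                    == OperationStatus.SUCCESS) {
                // key.getData() holds the key bytes here; data stays empty.
                count++;
            }
        } finally {
            cursor.close();
        }
        return count;
    }

    /** DPL: EntityIndex.keys returns a cursor over keys only. */
    static <PK, E> long countKeysDpl(PrimaryIndex<PK, E> index)
            throws DatabaseException {
        long count = 0;
        EntityCursor<PK> keys = index.keys(null, CursorConfig.READ_UNCOMMITTED);
        try {
            // next() returns null once the cursor is exhausted.
            while (keys.next(LockMode.READ_UNCOMMITTED) != null) {
                count++;
            }
        } finally {
            keys.close();
        }
        return count;
    }
}
```

Both methods pass null for the transaction, which is an assumption that suits a simple read-uncommitted scan; an application that already has a Transaction could pass it instead.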
Checklist for JE HA
JE HA replicated applications are by their nature distributed
applications that communicate over the network. Attention to proper
configuration will help prevent problems that may prove hard to
diagnose and rectify in a running distributed environment. The
checklist below is intended to help ensure that key configuration
issues have been considered from the outset. It is also a good list to
revisit if you encounter unusual problems after your
application has been deployed, since they may be caused by inadvertent
configuration changes.
- Hostname configuration: When configuring the hostname component of
the NODE_HOST_PORT configuration parameter, avoid hostnames that resolve to
loopback addresses, except perhaps during development or testing.
Loopback addresses should only be used when all the nodes
associated with the replication group reside on the same machine. If
any single node in a group uses a loopback address, all the other nodes
must as well, so that they can all communicate with each other.
Also, if multiple machines are in use, check that there is a network
path between any two machines. Use a network utility
like ping to verify that a network path exists.
- Port configuration: When configuring the port component of
the NODE_HOST_PORT configuration parameter, check that the port is
available on that machine and is not also used by some other service.
The check can be done using network utilities like netstat to list the
services running on each port.
Also, check that the port is reachable from every other host
configured for use by the other nodes in the replication group. In
particular, ensure that a firewall is not blocking use of that port. A
tool like telnet can be used for this purpose.
In large or even medium-sized production networks, it's not unusual to
have a person or group of network administrators who are responsible
for the allocation of ports, to help minimize port conflicts, and to
ensure that they are used in a consistent way across the
enterprise. In such situations, it's best to consult with the
networking group while making these network-level configuration
decisions.
- Clock synchronization: Check that the system clocks on machines
hosting replication nodes are reasonably synchronized. Verify that
clocks will continue to remain synchronized through use of a
synchronization daemon like ntpd. Use the time
synchronization mechanism that's best suited to your particular
platform. See Time Synchronization for further details.
- Helper configuration: The HELPER_HOSTS
configuration is used when a new replication node is created. When a
node is configured with itself as the sole helper (that is, when
HELPER_HOSTS is the same as NODE_HOST_PORT and the environment
directory is empty), JE HA creates a brand new replication group with
this node as its first and only member. There must be exactly one node
(the initial node in the group) that's configured in this way. All
other nodes must specify helper nodes that are already members of the
replication group. Doing so permits the addition of the node to the
group. See Replication Group Startup
for information on starting a replication group.
It's good practice to verify that the replication group is composed of
exactly the nodes that you expect. The JE HA
utility DbDumpGroup can be used to display the nodes that
are currently members of the replication group. For example, the
following invocation of DbDumpGroup will print out all the nodes
present in the replicated environment that's stored in the directory
/tmp/env:
java -jar ~/work/je/build/lib/je.jar DbDumpGroup -h /tmp/env
- Logging configuration: JE HA uses java.util.logging with the default
logging level set to INFO. The logging output is available in the
je.info file created in the environment's home directory. The
information generated at the INFO level should be comprehensible to
the user familiar with the replication group life cycle and can be
extremely useful when monitoring your application's activities. Please
check that it remains set at this level or higher.
See Logging for further details.
- JConsole: If your security environment permits it, consider
setting the system property -DJEMonitor=true when starting up your
replicated application. This setting will permit you to use JConsole
in conjunction with the JEJConsole.jar plugin to monitor your
application at runtime, making it easier to diagnose issues that may
turn up. The JConsole how-to contains details about this tool.
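The hostname and port items in the checklist above can also be checked programmatically, for example from a deployment script. The following sketch uses only the standard java.net classes. The class name HaNetCheck and its method names are invented for illustration, and the checks only approximate what ping, netstat, and telnet report (for instance, binding a port below 1024 may fail for lack of privileges rather than because the port is in use).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.UnknownHostException;

final class HaNetCheck {

    /** True if the host name resolves to a loopback address such as 127.0.0.1. */
    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if this process can bind the port, i.e. no other local service holds it. */
    static boolean isLocalPortFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** True if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: HaNetCheck <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println("resolves to loopback: " + resolvesToLoopback(host));
        System.out.println("port free locally:    " + isLocalPortFree(port));
        System.out.println("reachable over TCP:   " + isReachable(host, port, 2000));
    }
}
```

Run isLocalPortFree on the machine that will host the node before the node starts, and run isReachable from each of the other hosts once the node is up; a false result from a remote host while the node is running often points to a firewall.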