FAQ - Oracle NoSQL Database


updated: April 7, 2014

General

API


Troubleshooting

 

General


What is the difference between CE and EE?


Oracle NoSQL Database 12cR1 Enterprise Edition includes all features available in Community Edition, plus these additional features:

  • Simple Network Management Protocol (SNMP) adapter for monitoring
  • Oracle Database Integration via External Tables: Access data through SQL queries
  • Integration with Oracle Event Processing (OEP) engine using NoSQL database cartridge: Access data through CQL queries
  • Integration with Oracle Coherence to permit Oracle NoSQL Database to be used as a cache for Oracle Coherence applications and to allow applications to directly access cached data from Oracle NoSQL Database.
  • A Java-based interface to store and query semantic data through support for the Resource Description Framework (RDF), the SPARQL query language, and a subset of the Web Ontology Language (OWL). These capabilities are referred to as the RDF Graph feature of Oracle NoSQL Database.
  • Enterprise Grade Support (24x7) for Enterprise Edition (EE). Community Edition (CE) support on subscription is available only from 9:00 am - 5:00 pm local time.

  
For more information, check the Data Sheet available on the overview page.
 

Back to top

 

What are all these ports? When do I use which one?


A store deployment consists of many components, each running in its own process, which communicate with each other. The system administrator must define the following TCP/IP ports for each type of process on each Storage Node (SN).

  • The main port on each SN is the "registry port", which is owned by the Storage Node Agent (SNA). This is the port that is named by the "-port" argument to the java -jar kvstore.jar makebootconfig command. It is also used in the deploy-sn plan during configuration. The documentation examples use 5000 as the registry port. Also, the java -jar kvstore.jar kvlite command uses port 5000 as a default value.
  •  Each Admin process must have a port (the "Admin Console port") on which it listens for HTTP connections. This is specified in the "-admin" argument to the java -jar kvstore.jar makebootconfig command, and in the deploy-admin plan during configuration.

    The documentation examples use 5001.

  •  Each SN must be given a range of ports (the "HA Range") to be used by any Replication Nodes and Admin services hosted within the SN for the replication of persistent data. The SNA manages this allotment of ports, reserves one for an Admin service, and uses the rest to allocate one per RN. The range should be sized so it is equal to the maximum number of RNs that this SN may ever host, plus 1.
  • If you would like to enable SNMP monitoring, you must configure ports as described in the Admin Guide, Standardized Monitoring.

You should use the registry port in the following situations:

  • any use of the admin Command Line Interface (the java -jar kvstore.jar runadmin command), including initial configuration of the store. Note that although you're ultimately communicating with an Admin process, it is the Registry Port, not the Admin Console port, that you use in this situation.
  • as part of the helperHostPort argument to the KVStoreConfig constructor in application code that will read/write to the store.
  • as the value passed on the command line with "-port", to invoke utilities such as Load, Ping, and KVLite.
  • if JMX monitoring is enabled, as the port number for the JMX service.

The same applies to example programs such as HelloBigDataWorld and SchemaExample.
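
For example, here is a minimal sketch of opening a store handle from application code using the registry port. The store name and host are placeholders; 5000 is the registry port used in the documentation examples:

import oracle.kv.KVStore;
import oracle.kv.KVStoreConfig;
import oracle.kv.KVStoreFactory;

public class OpenStoreExample {
    public static void main(String[] args) {
        // "mystore" and node1.example.com are hypothetical; 5000 is the
        // registry port from the examples above.
        KVStoreConfig config = new KVStoreConfig("mystore", "node1.example.com:5000");
        KVStore store = KVStoreFactory.getStore(config);
        System.out.println("Connected to store: " + config.getStoreName());
        store.close();
    }
}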

You should use the Admin Console port in the target URL for the admin web UI:

    http://node1.example.com:5001/


Generally you can specify a port on any SN that hosts a component of the type you are trying to reach, and if necessary the system will redirect your request automatically and transparently. In the case of the Admin Console web UI, and the java -jar kvstore.jar runadmin command, this works as long as you specify an SN that is hosting an admin process.

Other than during configuration of an SN, there is normally no need to refer directly to ports in the HA Range, although it is certainly helpful to understand the HA Range when examining Ping results and perusing system log files.

Back to top

 

How do I restrict the ports used by NoSQL DB?


It is sometimes necessary for deployments to constrain a store to a limited set of ports, usually for security or data center policy reasons. There are 2 port ranges that can be specified on a Storage Node to restrict use to those ports:

  • the haPortRange parameter, specified in makebootconfig using -harange <start,end>. This range is used for replication when the replication factor is >1, and must contain enough ports to satisfy the capacity of the storage node plus one.
  • the servicePortRange parameter, specified in makebootconfig using -servicerange <start,end>. This range is used for internal (administrative), low-bandwidth server-to-server communication and is sized based on the number of services hosted by a storage node, including the storage node agent process itself. This parameter is optional; if it is not specified, anonymous ports will be used.

The ranges above are inclusive.

The size of the haPortRange is described above. The servicePortRange sizing is as follows:

  • 3 ports for the Storage Node process
  • 3 * capacity for replication nodes (each replication node consumes 3 ports)
  • 2 if the storage node is hosting an admin process

Using the information above, a storage node with capacity 2 and hosting an admin would need a range of size 3 + (2 * 3) + 2 = 11. The storage node will enforce the minimum size based on this formula. It is recommended that the range be slightly oversized.
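
As a small illustrative sketch of this sizing formula (the method name is made up for illustration):

public class ServicePortRangeSizing {

    /** Minimum servicePortRange size per the formula described above. */
    static int minimumRangeSize(int capacity, boolean hostsAdmin) {
        return 3                        // Storage Node process
             + 3 * capacity             // 3 ports per Replication Node
             + (hostsAdmin ? 2 : 0);    // 2 more if an Admin is hosted
    }

    public static void main(String[] args) {
        // Capacity 2, hosting an admin: 3 + (2 * 3) + 2 = 11
        System.out.println(minimumRangeSize(2, true));
    }
}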

SNMP ports and the Storage Node's registry port are specified independently. They can be taken from the servicePortRange, but the range must then be sized larger to compensate, since in-use ports within the range are skipped during allocation.

Back to top 

How do I specify the network interface used by NoSQL DB?


The -hostname parameter to makebootconfig influences the network interface used by each Storage Node. It's possible to use the optional -hahost flag for makebootconfig to set the haHostname Storage Node parameter. This is used to specify an additional network interface for the server-server communication channels that carry the most traffic. -hahost defaults to the same value as -hostname.

How does NoSQL DB budget memory?


The Replication Node manages the data in a NoSQL DB store and is the main consumer of memory. The Java heap and cache size used by the Replication Node can be important performance factors. By default, the Replication Node heap and cache are calculated by NoSQL DB based on the amount of memory available to the Storage Node.

We recommend that you specify the available memory for a Storage Node using the -memory_mb flag for makebootconfig, or the memory_mb Storage Node parameter. If you do not define memory_mb, it will default to the memory available on the node. NoSQL DB will then use 85% of memory_mb as the heap for the Replication Node processes hosted by that Storage Node. If the Storage Node hosts more than one Replication Node, the memory will be divided evenly between all RNs. If the number of Replication Nodes on a Storage Node changes, the per-RN memory will be recalculated dynamically. The percentage used for heap is controlled by the rnHeapPercent Storage Node parameter. You can choose to override the default value of 85%.

Each Replication Node uses a cache, and the size of that cache defaults to 70% of the Replication Node heap. You can override the 70% default by setting the rnCachePercent Replication Node parameter.

The Replication Node heap can also be specified directly by setting the -Xmx in the Replication Node javaMiscParams parameter. Likewise, the Replication Node cache can be set directly with the cacheSize Replication Node parameter. While that's possible, it's advisable to use the Storage Node memory_mb setting.

As an example, suppose you specify that a Storage Node may use 3000 MB of memory, by setting memory_mb to 3000. If that Storage Node hosts two Replication Nodes, the heap for each RN will be (3000 * .85)/2 = 1275MB. Each RN cache will be (1275 * .70) = 892MB.  
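
The same calculation expressed as a sketch, using the default percentages described above:

public class MemoryBudgetExample {
    public static void main(String[] args) {
        int memoryMb = 3000;        // memory_mb for the Storage Node
        int rnHeapPercent = 85;     // default rnHeapPercent
        int rnCachePercent = 70;    // default rnCachePercent
        int numRNs = 2;             // Replication Nodes hosted by this Storage Node

        int heapPerRnMb = (memoryMb * rnHeapPercent / 100) / numRNs;   // 1275 MB
        int cachePerRnMb = heapPerRnMb * rnCachePercent / 100;         // 892 MB

        System.out.println("heap per RN = " + heapPerRnMb
            + "MB, cache per RN = " + cachePerRnMb + "MB");
    }
}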

Back to top

 

How many Admin services should I deploy?


NoSQL DB Administrative processes are replicated in order to enhance reliability. It's important to continue to have administrative functionality available even in the event of node failure, so that you can continue to monitor, troubleshoot, and repair problems.

Each Admin service is created and added to NoSQL DB through the deploy-admin plan. Since the Admin persists metadata into a replicated store, the information provided in the Admin Guide's "Identify your Replication Factor" section also applies to the number of Admin services that should be deployed. For durability and availability, we strongly suggest that you deploy at least three Admin services. If your deployment has only one or two Admin services, and one service goes down, you will be unable to execute many types of administrative commands.

If you have at least three Admin services running, you can easily move one of them from one Storage Node to another by first deploying a fourth Admin service on the target Storage Node, and then eliminating one of the original three using the remove-admin plan. 

Back to top

 

Should I ever create multiple Storage Nodes on the same node?


We strongly recommend that Storage Nodes (SNs) be allocated one per node in the cluster for availability and performance reasons. If you believe that a given node has the I/O and CPU resources to host multiple Replication Nodes, the Storage Node's capacity parameter can be set to a value greater than one, and the system will know that multiple RNs may be hosted at this SN. This way, the system can:

  • ensure that each Replication Node in a shard is hosted on a different Storage Node, reducing a shard's vulnerability to failure
  • dynamically divide up memory and other hardware resources among the Replication Nodes
  • ensure that the master Replication Nodes, which are the ones which service write operations in a store, are distributed evenly among the Storage Nodes both at startup, and after any failovers.

If more than one SN is hosted on the same node, multiple SNs are lost if that node fails, and data may become inaccessible.

You can set the capacity parameter for a Storage Node several ways:

  • when using the makebootconfig command
  • with the change-policy command
  • with the plan change-params command.

In very limited situations, such as for early prototyping and experimentation, it can be useful to create multiple SNs on the same node.

On a single machine a Storage Node is uniquely identified by its root directory (KVROOT) plus a configuration file name, which defaults to "config.xml". This means you can create multiple SNs this way:

Create a unique KVROOT for each SN. Usually, these would be on different nodes, but it's also possible to have them on a single node. For example, if you decide to put all SNs in the directory /var/kv/stores you might create and start more than one this way:

mkdir /var/kv/stores/root1
mkdir /var/kv/stores/root2
mkdir /var/kv/stores/root3
java -jar kvstore.jar makebootconfig -root /var/kv/stores/root1 -host fooHost 
  -port 5000 -admin 5001 -harange 5005,5010 -capacity 1
java -jar kvstore.jar makebootconfig -root /var/kv/stores/root2 -host fooHost 
  -port 5020 -harange 5025,5030 -capacity 1
java -jar kvstore.jar makebootconfig -root /var/kv/stores/root3 -host fooHost 
  -port 5040 -harange 5045,5050 -capacity 1
java -jar kvstore.jar start -root /var/kv/stores/root1
java -jar kvstore.jar start -root /var/kv/stores/root2
java -jar kvstore.jar start -root /var/kv/stores/root3
 
Back to top  


How do I script NoSQL DB configuration?


You may find that you want to build the same NoSQL DB configuration repeatedly for testing purposes. The Admin CLI commands can be scripted in several ways.

Many uses of the Admin CLI are simple commands, such as java -jar kvstore.jar makebootconfig to initially configure a Storage Node, as shown above. These are as amenable to scripting as any other UNIX commands and will not be discussed further here.

The interactive commands available in java -jar kvstore.jar runadmin, among which are those used to create and execute plans, can be scripted in two ways. You can create a file containing the sequence of commands that you want to run, and run them in a batch using java -jar kvstore.jar runadmin load -file <script>.

For example, a script file named deploy.kvs could contain commands such as the following:

  configure -name mystore
  plan deploy-datacenter -name boston -rf 3 -wait
  plan deploy-sn -dcname boston -host localhost -port 5000 -wait
  plan deploy-admin -sn sn1 -port 5001 -wait

You could execute this script by issuing the command

  java -jar kvstore.jar runadmin -host localhost -port 5000 load -file 
    deploy.kvs

Another way to script commands is to run individual CLI commands as separate shell command lines. The trailing arguments on such a command line are treated as a single CLI command.

This usage mode lets you use features of a more capable scripting language, such as a UNIX shell, and provides more flexibility for integrating NoSQL DB commands with other commands that are not available in the interactive java -jar kvstore.jar runadmin environment.

The same sequence of commands as those from the example above could be couched in a shell script this way:

  #!/bin/sh
 
  HOST=localhost
  PORT=5000
  HTTPPORT=5001
  KVADMIN="java -jar lib/kvstore.jar runadmin -host $HOST -port $PORT"
 
  # Each CLI command that follows "$KVADMIN" is executed in a new invocation of runadmin
  $KVADMIN configure -name mystore
  $KVADMIN plan deploy-datacenter -name boston -rf 3 -wait
  $KVADMIN plan deploy-sn -dcname boston -host localhost -port $PORT -wait
  $KVADMIN plan deploy-admin -sn sn1 -port $HTTPPORT -wait

  Back to top 

Does NoSQL Database Interact With Oracle Database?


NoSQL Database supports retrieving records through the Oracle Database External Table functions. This makes it possible to perform some queries from Oracle Database and retrieve records from NoSQL Database. For more information, refer to the Javadoc for the oracle.kv.exttab package and the example program.


API


How to efficiently delete a subtree of keys using multiple major keys


The KVStore.multiDelete() methods can delete entire subtrees, but only on one (fully-specified) major key at a time.

In order to delete larger subtrees, rooted higher up within the major-key portion of the hierarchy, you can use a storeKeysIterator to find each full major key to be deleted, and then call multiDelete() on each one.

If the storeKeysIterator were allowed to finish, it would iterate over all keys in the desired subtree range, including minor keys within each major key. Therefore, the trick is to take just the first result key from the iterator, extract the major key portion for the call to multiDelete(), and then abandon the iterator and start over with a new one; rather than cycling all the way through a single iterator.

This is illustrated in the following sample code.

Key partialKey = ...;     // major key prefix
 
/*
 * Since we never take more than one key per iterator, a larger 
 * batch size would be a waste.
 */
final int batchSize = 1;
for (;;) {
    Iterator<Key> i = kv.storeKeysIterator
        (Direction.UNORDERED, batchSize, partialKey,
         null, Depth.DESCENDANTS_ONLY);
    if (!i.hasNext()) {
        break;
    }
    Key descendant = Key.createKey(i.next().getMajorPath());
    kv.multiDelete(descendant, null,
                   Depth.PARENT_AND_DESCENDANTS);
}
 
Back to top 

Best practices when generating sequence values


A NoSQL DB application can implement sequences in a fairly simplistic way by designating a record in the store to be the sequence. Simply pick a single key that identifies the sequence. The key can be anything the application chooses, for example, "app-sequence-1". The value for this record will be a long integer, which is the next available value in the sequence.

Use a read-modify-write operation to read the next available sequence, and increment it. Use KVStore.get() to read the value and putIfVersion() to write it. Like any read-modify-write operation, if putIfVersion() fails this means another client updated it, and the read-modify-write cycle should be retried.

Rather than incrementing the sequence by 1, increment it by a much larger value, say 500. This effectively assigns a block or range of sequence values to the client. The client should then "cache" this range of sequence numbers and assign record keys from this range. When the cached values have all been assigned, another read-modify-write operation should be performed.

Although the single sequence record will be a single point of contention among clients, this potential performance issue can be addressed by assigning sequences in large blocks as described. You could also allocate multiple independent sequences for different purposes, such as for different types of records or for key ranges.
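
The following sketch illustrates the read-modify-write cycle described above. The key name, block size, and helper class are hypothetical, and the sequence record is assumed to have been initialized beforehand:

import java.nio.charset.Charset;
import oracle.kv.KVStore;
import oracle.kv.Key;
import oracle.kv.Value;
import oracle.kv.ValueVersion;
import oracle.kv.Version;

public class SequenceBlock {

    private static final Charset UTF8 = Charset.forName("UTF-8");
    private static final long BLOCK_SIZE = 500;   // values reserved per read-modify-write

    /**
     * Reserves a block of sequence values and returns the first value of the
     * block; the caller may hand out values in [result, result + BLOCK_SIZE).
     */
    public static long reserveBlock(KVStore store) {
        final Key seqKey = Key.createKey("app-sequence-1");
        while (true) {
            // Read the current value of the sequence record and remember its Version.
            ValueVersion vv = store.get(seqKey);
            long next = Long.parseLong(new String(vv.getValue().getValue(), UTF8));

            // Advance the sequence by a whole block rather than by 1.
            Value newValue =
                Value.createValue(Long.toString(next + BLOCK_SIZE).getBytes(UTF8));

            // putIfVersion succeeds only if no other client updated the record in between.
            Version updated = store.putIfVersion(seqKey, newValue, vv.getVersion());
            if (updated != null) {
                return next;
            }
            // Another client won the race; retry the read-modify-write cycle.
        }
    }
}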
 

Back to top

 

How to use Avro unions to encode optional fields


To declare an optional field in Avro, use an Avro union. For example:

{
     "name":"data",
     "type":["null","string"],
     "default":"null"
}

For the above example, the type for the "data" field can be either "string" or "null". That the type can be "null" makes the field optional.

When you use unions in this way, be aware that the union is encoded in JSON in one of two ways:

  • If the type is "null" then the field is ecoded as JSON null.
  • Otherwise, it is encoded as a JSON object with one name/value pair whose name is the type's name, and whose value is the recursively encoded value. For Avro's named types (record, fixed, or enum), the user-specified name is used. For any other type, the type name is used.

For example, to use the union schema:

["null", "string", "Foo"]

where "Foo" is a record name, you would encode:

  • "null" as "null"
  • the string "a" as {"string" : "a"}
  • a Foo instance as {"Foo" : {...}}, where {...} indicates the JSON encoding of a Foo record instance.

That is, to encode the previously mentioned "data" field as a string, you would use this:

{"data":{"string":"My Data String"}}

To encode the same field as a Foo record instance, you would use:

{"data":{"Foo":{...}}   

(you would fill in the '{...}' with the actual Foo record encoding)

And to encode the same field as a null field, simply use:

{"data":null}
 
Back to top 

Runtime versus checked exceptions


Most of the exceptions defined in the client API are runtime exceptions. Are there any design policies or reasons we don't use checked exceptions?

Checked exceptions are only appropriate when they must always be handled by the immediate caller because they are in a sense a different type of return value. These are called "contingencies". For example, oracle.kv.OperationExecutionException contains information about the return values of the operations in a multi-operation request. Contingency exceptions are checked exceptions, in order to require the caller to handle them explicitly, just as the caller would handle a return value.

Other exceptions, called "faults", are typically the result of a problem that cannot be handled directly by the caller, and are normally caused by an unusual event. They may cause operation retries, but are often handled by higher levels (not the immediate caller). Because they are not handled by the immediate caller, they are runtime exceptions. This avoids cluttering all methods of the client application with throws declarations for exceptions that are not the concern of these methods.

Currently, the only contingency (checked) exception in NoSQL DB is OperationExecutionException; all other exceptions are fault (runtime) exceptions. Each exception extends either oracle.kv.FaultException or oracle.kv.ContingencyException to make this characteristic clear.

Some general information on contingency and fault exceptions is explained here.

Note that in recent years runtime exceptions have come to be preferred by most programmers, and most recently designed programming languages do not provide checked exceptions, so the current trend is toward runtime exceptions. Of course, not all programmers or programming language designers agree on this topic. NoSQL DB reserves the option of adding additional contingency exceptions, if appropriate.
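
As an illustration of the difference, here is a sketch of how an application might handle the two kinds of exceptions around a multi-operation request (the handling shown is purely illustrative):

import java.util.List;
import oracle.kv.FaultException;
import oracle.kv.KVStore;
import oracle.kv.Operation;
import oracle.kv.OperationExecutionException;

public class ExceptionHandlingExample {

    static void runBatch(KVStore store, List<Operation> ops) {
        try {
            store.execute(ops);
        } catch (OperationExecutionException contingency) {
            // Checked "contingency": the immediate caller is expected to handle
            // it, much like a return value (for example, decide whether to retry).
            System.err.println("Batch rejected: " + contingency.getMessage());
        } catch (FaultException fault) {
            // Runtime "fault": usually dealt with at a higher level, so it is
            // simply propagated here.
            throw fault;
        }
    }
}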

 

Back to top

 

How do I implement numeric keys and have them sort properly?


Oracle NoSQL Database treats all keys as Strings and therefore sorts them as strings. This can present problems for applications that have numeric keys and want them to sort properly. The simple string representation of numbers (integer, long, double, etc.) will not sort correctly. For example, consider these integer values:

{1, 2, 10, 12, 1000}

That is the proper ascending sort order for these values, but their simple string representations sort in this way:

{"1", "10", "1000", "12", "2"}

One simple-minded approach to fixing this is to left pad the string values so they are the same length. For example:

{"0001", "0002", "0010", "0012", "1000"}

That works, but has several features which cause problems for some applications:

  1. It is not particularly compact (compactness is a valuable property for keys)
  2. It does not handle negative numbers
  3. It does not handle floating point numbers
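
For non-negative integer keys, a minimal sketch of the padding approach looks like this (the width and helper name are chosen here only for illustration):

public class PaddedNumericKeys {

    // 19 digits is enough to hold any non-negative Java long.
    private static final int WIDTH = 19;

    static String encode(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("negative values are not handled by this scheme");
        }
        return String.format("%0" + WIDTH + "d", n);
    }

    public static void main(String[] args) {
        // The padded strings sort lexicographically in the same order as the numbers.
        for (long n : new long[] {1, 2, 10, 12, 1000}) {
            System.out.println(encode(n));
        }
    }
}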

Another approach that can work if negative and floating point numbers are not required is to encode in base32hex, which sorts properly and is somewhat more compact than manual padding. There are no standard implementations so one must be found. It is worth mentioning that Base64, while compact, does not sort properly and should not be used as an encoding if proper sorting is required.

There are at least a few implementations of more general mechanisms that are somewhat compact and work for negative numbers, as well as floating points. One is built into Lucene:

http://lucene.apache.org/core/3_0_3/api/core/index.html?org/apache/lucene/util/NumericUtils.html

Another is discussed here:

http://jasonfager.com/770-lexi-sortable-number-strings/

and implemented here:

https://gist.github.com/jfager/490993#file_lexi_sortable.java

The discussion mentions that an even more compact representation can be used but must be coded by the application.

 

Back to top

 

Troubleshooting


Which JVMs can I use to run Oracle NoSQL Database?


Oracle NoSQL Database Releases 1 and 2 require a minimum of Java SE 6 (JDK 1.6.0 u25). Release 3 requires a minimum of Java SE 7, and has been tested and certified against Oracle JDK 7u51. We strongly encourage you to upgrade to the latest Java releases to take advantage of the latest bug fixes and performance improvements.

Why am I seeing an UnsupportedClassVersionError exception?

Oracle NoSQL Database Releases 1 and 2 require a Java SE 6 or later runtime. Release 3 requires Java SE 7 or later. If you see an error message like the following, you are probably trying to run the product with an earlier version of Java.

Exception in thread "main" java.lang.UnsupportedClassVersionError: Bad version number in .class file

or

Exception in thread "main" java.lang.UnsupportedClassVersionError:
  oracle/kv/impl/util/KVStoreMain : Unsupported major.minor version 51.0 

 

Back to top

 

Getting feedback while an administrative plan is running

There are several ways to track the progress of an administrative command.

  • The show plan -id <id> command will display the latest status of the command.
  • The show topology command will display the current layout of the store, and the NoSQL DB services that currently exist.
  • The Topology tab of the Admin Console will refresh as NoSQL DB services are created and brought online.
  • You can issue the verify command via the Topology tab or the Admin CLI while plans are executing. The verify command will provide service status information as services come up.
  • You can follow the storewide log via the Logs tab of the Admin Console or the CLI's logtail command.

 

Back to top

 

Where to find error information


Error information for administrative plans is available as the plan executes and finishes. Errors are reported in the plan history each time an attempt to run the plan fails. The plan history can be seen via the command line interface KVAdmin (also known as "java -jar kvstore.jar runadmin" or simply "the CLI") show plan -id <id> command, or in the Plans And Configuration tab of the Admin Console. Using the -verbose flag will add more detail.

Other problems may occur asynchronously. You can learn about unexpected failures, service downtime, and performance issues through the critical events display in the Logs tab of the Admin Console, or through the CLI's show events command. Events come with a time stamp, and the description may contain enough information to diagnose the issue. In other cases, more context may be needed, and the administrator may want to see what else happened around that time.

The store wide log consolidates logging output from all services. Browsing this file may give the administrator a more complete view of activity during the problem period. It can be viewed via the Logs tab of the Admin Console or the CLI's logtail command, or by directly viewing the <storename>_N.log file in the <KVHOME>/<storename>/log directory. It's also possible to download the store wide log file using the Logs tab of the Admin Console.

 

Back to top

 

What is the format of the .log files?


The .log files in the KVROOT/log directory contain trace messages from each NoSQL DB component. Each line is prefixed with the date of the message, its severity, and the name of the component which issued it. For example:

10-01-11 08:39:43:548 UTC INFO [admin1] Initializing Admin for store: kvstore

The line format is MM-dd-yy HH:mm:ss:SSS <java.util.logging.Level> [Component name] message. When looking for more context for events at a given time, use the timestamp and component name to narrow down the section of log to peruse. Serious exceptions are logged with the SEVERE log level, as well as being available in the Admin Console and CLI (java -jar kvstore.jar runadmin) show events command.
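
For example, a small sketch that pulls the prefix fields out of a log line. The regular expression is an approximation based on the format above, including the timezone token shown in the example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogPrefixExample {

    // date-time (with timezone token), severity, [component name], then the message
    private static final Pattern PREFIX = Pattern.compile(
        "^(\\d{2}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}:\\d{3} \\S+) (\\S+) \\[([^\\]]+)\\] (.*)$");

    public static void main(String[] args) {
        String line =
            "10-01-11 08:39:43:548 UTC INFO [admin1] Initializing Admin for store: kvstore";
        Matcher m = PREFIX.matcher(line);
        if (m.matches()) {
            System.out.println("time=" + m.group(1) + " level=" + m.group(2)
                + " component=" + m.group(3) + " message=" + m.group(4));
        }
    }
}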

 

Back to top

 

What is the format of the .perf files?


The administrative service collects server side throughput and latency statistics on a per-replication node basis. These statistics can be seen via the Admin Console's topology tab, the command line interface (CLI) show perf command, and in the <storename>.perf files available in the master Admin service's KVROOT log directory. The .perf files can also be located and downloaded via the Admin Console.

Throughput and latency are calculated for an interval of time determined by the statsInterval replication node parameter, which defaults to 60 seconds. Each line of the .perf file is for a single stats interval of a single replication node. Each line also applies to a single optype category.

The two optypes, Single and Multi, are used to understand the performance characteristics of different data operations. Some client API data operations pertain to a single data record, while others pertain to one or multiple records. For example, the KVStore.get() method fetches a single data record, while KVStore.multiGet() fetches one or more records. All single record operations are aggregated and recorded as optype Single, while all multi record operations are reported as optype Multi.

As of NoSQL 2.0.25, latency information is calculated per request. Since multi operations can amortize the per-record overhead over many records, they can be significantly more efficient than single record operations. Statistics are kept per optype to help illuminate that difference.

For example, if the application issues KVStore.get(), KVStore.put(), KVStore.putIfVersion() and KVStore.multiGet() operations in one interval, all the multiGets are recorded as one Multi statistic line, and all the other operations are recorded together as one Single statistic line.
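
In API terms, the first call in the following sketch is counted under the Single optype and the second under Multi (the keys shown are hypothetical):

import java.util.SortedMap;
import oracle.kv.KVStore;
import oracle.kv.Key;
import oracle.kv.ValueVersion;

public class OpTypeExample {

    static void readExamples(KVStore store) {
        // Single-record operation: one request, one record.
        ValueVersion one = store.get(Key.createKey("user", "42"));

        // Multi-record operation: one request that may return many records
        // sharing the major key path "user".
        SortedMap<Key, ValueVersion> many =
            store.multiGet(Key.createKey("user"), null, null);
    }
}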

The following columns are present on each line:

Resource: the name of the replication node
Time: the start of the stats interval, formatted as MM/HH/YY hh:mm:ss
OpType: Single or Multi

Interval performance stats:

Total Ops: number of data records for that optype executed during the stats interval
PerSec: number of data records per second for that interval
Total Req: For Single optypes, the number of operations and the number of requests is the same for each interval. For Multi optypes, the number of operations may be equal or greater than the number of requests, because each request may return multiple records.
Min: minimum request latency, in milliseconds for that interval
Max: maximum request latency, in milliseconds for that interval
Avg: average request latency, in milliseconds for that interval
95th: latency of requests in the 95th percentile, for that interval
99th: latency of requests in the 99th percentile, for that interval

Cumulative performance stats:

Total Ops: number of operations of that optype executed during the lifetime of the replication node process. Cumulative values are reset any time the replication node restarts
PerSec: number of operations per second for the RN process lifetime
Total Req: For Single optypes, the number of operations and the number of requests is the same. For Multi optypes, the number of operations may be equal or greater than the number of requests, because each request may return multiple records.
Min: minimum latency, in milliseconds for the RN process lifetime
Max: maximum latency, in milliseconds for the RN process lifetime
Avg: average latency, in milliseconds for the RN process lifetime
95th: latency of operations in the 95th percentile, for the RN process lifetime
99th: latency of operations in the 99th percentile, for the RN process lifetime

 

Back to top

 

What are the NoSQL DB service states?


There are three types of NoSQL DB services: Admin, Storage Node, and Replication Node. Each service has a status that can be seen via the Topology tab in the Admin Console, the show topology command in the CLI (java -jar kvstore.jar runadmin), or java -jar kvstore.jar ping.

The status values are:

STARTING: Service is coming up
RUNNING: Service is running normally
STOPPING: Service is stopping. This may take some time. For example, a Replication Node may be performing a checkpoint, or a Storage Node may be shutting down managed services
WAITING_FOR_DEPLOY: The service is waiting for commands or acknowledgements from other services during its start up processing. If it is a Storage Node, it is waiting for the initial deploy-SN command. Other services should transition out of this phase without any administrative intervention from the user
STOPPED: An intentional, clean shutdown
ERROR_RESTARTING: Service is in an error state and a restart will be attempted
ERROR_NO_RESTART: Service is in an error state and will not be automatically restarted. Administrative intervention from the user is required
UNREACHABLE: Service is not reachable by the Admin. If the status was seen via a command issued by the Admin, this state may mask a STOPPED or ERROR state.

A healthy service begins in STARTING. It may transition to WAITING_FOR_DEPLOY for a short period before going on to RUNNING. ERROR_RESTARTING and ERROR_NO_RESTART indicate that there has been a problem that should be investigated. An UNREACHABLE service may only be in that state temporarily, although if that state persists, the service may be truly in an ERROR_RESTARTING or ERROR_NO_RESTART state.

Note that the Topology tab only shows abnormal service statuses. Service statuses for healthy components that are RUNNING are left blank.
 

Back to top

 

Why is my plan still running after an error has been reported?


Administrative plans may invoke remote services which execute asynchronously. Each plan step, or task, can run for a maximum amount of time which is configured by the task_timeout parameter, which defaults to 5 minutes. The task ends when it determines a success or error outcome, or if the timeout period is exceeded.

The asynchronous remote services may encounter errors which are reported directly back to the Admin Console and displayed in an error dialog before the task has determined the outcome. Because of that, the user may learn of the error while the Admin service still considers the plan to be RUNNING and active. The plan will eventually see the error and will transition to an ERROR state.

Back to top

 

I configured my Storage Node registry port incorrectly


If you have specified an invalid value for a Storage Node (SN) registry port, the Storage Node Agent will not start up properly. You'll be unable to see the SNA through the jps -m command, and it will not respond to the java -jar kvstore.jar ping command. The snaboot_0.log file in the SN's root directory will display error information. For example, if the registry port was already in use, the log might show:

10-03-11 22:47:59:525 EDT SEVERE [snaService] Failed to start SNA: 
    Port already in use: 1000; nested exception is: 
	java.net.BindException: Permission denied

You should delete the KVROOT directory and repeat the makebootconfig installation steps.

 

Back to top

 

I configured my Storage Node HA port range incorrectly


If you have specified invalid values for the HA Port Range (described above in the FAQ), you will be unable to deploy a Replication Node (RN) or a secondary Administration process (Admin) on this SN. The problem will be discovered when you first attempt to deploy a store or an Admin Replica on that faulty SN. You will see these indications that the RN did not come up on this Storage Node:

  • The Admin Console will display an error dialog that warns that this RN is in the ERROR_RESTARTING state. The Topology tab will also show this state in red, and after a number of retries, will indicate that the RN is in ERROR_NO_RESTART.
  • The plan will go into ERROR state, and its detailed history, available by clicking on the plan in the Plans and Configuration tab of the Admin Console or through the java -jar kvstore.jar runadmin "show plan <planId>" command, will show an error message like this:
       Attempt 1
            state: ERROR
            start time: 10-03-11 22:06:12
            end time: 10-03-11 22:08:12
            DeployOneRepNode of rg1-rn3 on sn3/farley:5200 [RUNNING] failed. 
           ....  Failed to attach to RepNodeService for rg1-rn3, see log, 
             /KVRT3/<storename>/log/rg1-rn3*.log, on host farley for more 
             information.
    
  • The critical events mechanism, accessible via the Admin Console or CLI, will show an alert that contains the same error information as the plan history.
  • An examination of the specified .log file or the storewide log displayed in the Log tab of the Admin Console will show a specific error message, such as:
    [rg1-rn3] Process exiting
    java.lang.IllegalArgumentException: Port number 1 is invalid because the 
      port must be outside the range of "well known" ports
    

The misconfiguration may be addressed with these steps. Some steps must be executed on the physical node which hosts the NoSQL DB Storage Node, while others can be done from any node which can access the Admin Console or Admin CLI.

  1. Using the Admin Console or Admin CLI, cancel the deploy-store or deploy-admin plan which ran afoul of the misconfiguration.
  2. On the SN node, kill the existing, misconfigured StorageNodeAgentImpl process and all its ManagedProcesses. You can distinguish them from other processes because they will have the parameter "-root <KVROOT>".
  3. On the SN node, remove all files from the KVROOT directory.
  4. On the SN node, recreate the storage node bootstrap configuration file in the KVROOT directory using the java -jar kvstore.jar makebootconfig command.
  5. On the SN node, restart the storage node using java -jar kvstore.jar start.
  6. Using the Admin Console or Admin CLI, re-deploy the storage node using the deploy-sn plan.

You have now returned to the same point where you previously experienced the error, and you can create and execute a deploy-store or deploy-admin plan which uses the same parameters as the initial attempt.

 

Back to top

 


java.net.NoRouteToHostException thrown during configuration


Some users have seen java.net.NoRouteToHostException during configuration. Even though it is possible to ping (using the Unix or Windows CLI ping command, not the Oracle NoSQL Database "ping" command) and ssh to the target machine, configuring Oracle NoSQL Database may still fail. A typical exception might look like this:

[nosql@nosql0 kv-1.2.123]$ java -jar ./lib/kvstore-1.2.123.jar ping -port 5000 
  -host nosql1.example.com
Exception in thread "main" java.rmi.ConnectIOException: Exception creating 
 connection to: nosql1.example.com; nested exception is:
        java.net.NoRouteToHostException: No route to host
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:632)
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
        at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:340)
        at sun.rmi.registry.RegistryImpl_Stub.list(Unknown Source)
        at oracle.kv.util.Ping.getTopology(Ping.java:332)
        at oracle.kv.util.Ping.main(Ping.java:104)
        at oracle.kv.impl.util.KVStoreMain$8.run(KVStoreMain.java:218)
        at oracle.kv.impl.util.KVStoreMain.main(KVStoreMain.java:319)
Caused by: java.net.NoRouteToHostException: No route to host
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect
         (AbstractPlainSocketImpl.java:327)
        at java.net.AbstractPlainSocketImpl.connectToAddress
         (AbstractPlainSocketImpl.java:193)
        at java.net.AbstractPlainSocketImpl.connect
         (AbstractPlainSocketImpl.java:180)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
        at java.net.Socket.connect(Socket.java:546)
        at java.net.Socket.connect(Socket.java:495)
        at java.net.Socket.(Socket.java:392)
        at java.net.Socket.(Socket.java:206)
        at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket
         (RMIDirectSocketFactory.java:40)
        at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket
         (RMIMasterSocketFactory.java:146)
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
        ... 8 more
[nosql@nosql0 kv-1.2.123]$

Generally, this has been due to iptables running on one or more of the machines being used. To check your iptables settings, you can do iptables -L -n. This might produce output like this:

[root@nosql1 init.d]# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,
 ESTABLISHED
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           state NEW tcp 
 dpt:22
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with 
 icmp-host-prohibited
 
Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with 
 icmp-host-prohibited
 
Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
[root@nosql1 init.d]#

To get around this problem, you should create rules that allow communication over the ports that you have configured for Oracle NoSQL Database. As a simple workaround to test whether iptables is the cause, you can shut down iptables temporarily (but note that this disables your firewall rules, which is a potential security issue).

Thanks to Johan Louwers for writing this up.

 

Back to top

 

What does java.rmi.ConnectException: Connection refused to host mean?


The show plan -id <id> command displays plan status and any errors that may have occurred. If you see this exception listed in the error section, it means that the Admin service was unable to reach one of the NoSQL DB components while it was trying to execute a command.

The first step is to check the error information displayed by the show plan -id <id> command to see the error reported by the component that failed to start. This problem is most often caused by a Replication Node that could not be started. Some common issues are insufficient memory for the Replication Node process, a mistyped custom JVM configuration, or a port that was already in use.

If this first step does not show the problem, the next step is to check the overall status of the store. One way to do that is to use the show topology command, followed by the verify command. The show topology command will display the layout of the store, while verify will check the status of each component. A component that can't be reached will display a status of UNREACHABLE.

The next step would be to look in the NoSQL log files for more detailed error information about the unreachable component. Look first in the aggregated store wide log, which can be found in the node that is hosting the admin service, under the KVROOT/<storename>/logs/<storename>*.log. This shows information from all the different components in the store.

Suppose Replication Node rg1-rn3, on Storage Node sn3, is not responsive. Look through the <storename>.log for entries made by those components. Each log entry is prefixed with the name of the component that issued the log message. Sometimes the aggregated store wide log has too much information, or sometimes information from a component was not transmitted to the Admin, and therefore isn't included in the aggregated log. In that case, it can be more helpful to look at the Replication Node or Storage Node logs directly, which can be found on their host, in the <KVROOT>/<storename>/logs directory.

If the problem occurs during an initial deployment, it can be particularly helpful to review the Storage Node logs to make sure that the Storage Node Agent on that node was created correctly, and that the process came up as expected, according to the installation directions.

 

Back to top

 

I see a JVM warning: Failed to reserve shared memory in my logs


By default, NoSQL DB starts the Replication Node processes with the JVM option "-XX:+UseLargePages" to enable the large page OS option for these processes. Some JVMs do not support large pages, and will issue this warning. The warning is advisory, and the store will be operational.

Why don't I see NoSQL Database's MBeans in jconsole?


First, make sure that you have enabled the JMX agent as described in Chapter 8 of the Admin Guide.

A storage node's JMX agent makes MBeans available at the Storage Node Agent's main registry port number, which is also used for all other RMI services managed by the storage node. To see the MBeans in jconsole, you have to connect jconsole to the host and port where the SNA's registry is listening. For example you could start jconsole like this:

jconsole node1.example.com:5000

Or you could start jconsole with no arguments, then choose "Remote Process" in the "New Connection" dialog, and enter "node1.example.com:5000" in the address box.

 

Back to top

 

What does javax.net.ssl.SSLHandshakeException mean?


These exceptions indicate a configuration problem with a security-enabled deployment. You might see these errors in the output of the show plan command for a deploy-sn plan. An exception like:

javax.net.ssl.SSLHandshakeException: Remote host closed connection during
 handshake 

indicates that the target Storage Node was not configured to be security enabled, and is not running SSL. An exception like:

javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException:
 Certificate signature validation failed

indicates an incompatible SSL certificate.

To enable security, you must use the -store-security enable or -store-security configure flag for the makebootconfig utility when initially configuring the Storage Node, or by using the "security-config add-security" command to add security to an insecure installation. See the Security Guide for details.

 

Back to top

 
