Configuring Oracle Elastic Data

Overview

    Purpose

    Oracle Coherence is an in-memory caching solution, often referred to as an in-memory data grid. However, what do you do when core memory is still insufficient? Oracle Coherence addresses this problem by taking advantage of high-performing secondary storage such as solid state disks (SSDs).

    Solid state or flash disks differ from traditional disk media in that they have no moving parts, relying on DRAM- or EEPROM-style memory to store data. As a result, SSD-based devices are much faster than traditional disks, with speeds that approach those of traditional memory. More information on SSD can be found here.

    Oracle Coherence can take advantage of SSD and similar devices to greatly expand the size of cache data with little or no sacrifice in speed. This feature is provided by Oracle Coherence Elastic Data, which allows SSD devices to be specified as the backing storage for a cache, thereby vastly expanding the available storage. Using Elastic Data, cache data can "spill over" to an SSD device as required and return to core memory when usage drops. In this way, the size of the core memory becomes elastic.

    This tutorial covers the basic architecture of Elastic Data, how to configure Elastic Data–aware caches, and how to control Elastic Data behavior using operational configuration overrides. The tutorial uses a number of scripts and simulation Java classes to generate and interact with data in multiple caches to simulate actual program activity. The simulation is not intended to mimic any particular business vertical but rather to provide simple interaction with Coherence caches.

    Time to Complete

    Approximately 45 minutes

    Introduction

    Oracle Coherence Elastic Data requires both cache-level and operational-level configuration as well as an SSD device. A variety of optional configuration settings is also supported.

    Scenario

    Coherence Elastic Data configuration requires three things:

    • A Solid State Disk used for journal data 
      Note that while any disk device can be used to test Elastic Data configuration, only SSD devices should be used in production. Configuring Elastic Data to spill over to a non-SSD device could have serious effects on Oracle Coherence performance.

    • Cache configuration, which defines, as part of its scheme element, a backing map that includes a flash or RAM journal definition

    • Operational configuration, which defines the location and other behavior of the flash or RAM journal backing the cache


    Elastic Data Overview


    Hardware and Software Requirements

    The following is a list of hardware and software requirements:

    • Coherence 3.7.1, which can be downloaded here
    • Coherence ED OBE sample code, which can be downloaded here

    Prerequisites

    Before starting this tutorial, you should: 

    • Have a working knowledge of Java and XML syntax
    • Have access to and have already installed Coherence 3.7.1, preferably in /opt/coherence
    • Have downloaded and unpacked the ED OBE sample code, using the link above
    • Have downloaded and installed an Oracle Java JDK version 1.6.0_26 or later. The Oracle JDK kits and associated products can be found here.
      The installation of Java is outside the scope of this OBE. For more information, see the JDK download and installation pages. 

    Before running this tutorial, you should download and install Oracle Coherence 3.7.1 and the associated example source code. The following steps review the installation process of Coherence 3.7.1, as well as the expected installation locations.  Note that if you have previously installed Coherence 3.7.1, you may skip this step.

    Using the link above, accept the license agreement and download Coherence 3.7.1. Note that for this tutorial, it is assumed that files are downloaded to /tmp and installed into /opt.

    Open a command prompt and change directory to the location where Oracle Coherence will be installed.

    Unzip the contents of the Coherence zip file using a command that is similar to:

    unzip /tmp/coherence-java-3.7.1...zip 

    Note that the exact name of the file may vary.

    For this tutorial, all files will be unzipped to /opt/coherence


    Unzip the contents of the OBE example code using a command that is similar to:
    unzip /tmp/ED.OBE.src.zip


Configuring Oracle Coherence Elastic Data

    To configure Oracle Coherence Elastic Data you must:

    1. Create a directory on an appropriate SSD device for Elastic Data use.  This directory can be monitored and examined but no files should be added or deleted from the directory while Oracle Coherence is running. 
    2. Configure Coherence caches to use a backing map scheme based on a RAM or flash journal. The backing map scheme defines how Coherence caches use Elastic Data.
    3. Configure Elastic Data operational information, such as file sizes, directories, and other operational characteristics, which controls how caches use Elastic Data. This is covered in the next section.
    4. Configure the environment. This step applies only to the OBE: it ensures that the environment variables for the OBE and Coherence home directories are correct so that the OBE scripts run correctly.
    5. Test RAM journal basics. This step tests that the RAM journal, in its default condition, functions as expected.

    Create a data directory

      Open a command prompt and change directory to the location of the installed Elastic Data OBE.
      For this tutorial, we will use /opt/ED.OBE/data for data. However, any directory on an appropriate SSD device would be acceptable.



      Note that although a directory under /opt was used for this tutorial, in production an actual SSD device should be used.
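
      The directory can be created with standard shell commands. The following is a minimal sketch, not part of the OBE scripts: the function name create_journal_dir is hypothetical, and it simply creates the directory and confirms that the current user can write to it.

```shell
#!/bin/bash
# Sketch only: create a directory for Elastic Data journal files and
# confirm that the current user can write to it.
# create_journal_dir is a hypothetical helper, not an OBE script.
create_journal_dir() {
    local dir="$1"
    mkdir -p "$dir" || return 1          # create the directory and any missing parents
    if [ ! -w "$dir" ]; then             # journal files cannot be written otherwise
        echo "not writable: $dir" >&2
        return 1
    fi
    echo "journal directory ready: $dir"
}
```

      For the location used in this tutorial, the call would be create_journal_dir /opt/ED.OBE/data.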

    Configure Coherence to use a RAM journal


      Journaling refers to the technique of recording state changes in a sequence of modifications called a journal. As changes occur, the journal records each value for a specific key, and a tree structure stored in memory keeps track of which journal entry contains the most current value for that key.
      To find the value for an entry, you find the key in the tree, which includes a pointer to the journal entry that contains the latest value. As new values are written for a key, the older values in the journal become obsolete. These stale values accumulate and are removed at regular intervals.
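
      The idea can be illustrated with a toy model. The sketch below is not how Coherence implements its journal: it assumes bash 4 or later, uses an associative array where Coherence uses a tree, and all names (JOURNAL, INDEX, journal_put, journal_get, journal_stale_count) are hypothetical. It only shows an append-only sequence of values plus an index that points to the latest value for each key.

```shell
#!/bin/bash
# Toy journal, for illustration only (requires bash 4+ for associative arrays).
JOURNAL=()            # append-only sequence of recorded values
declare -A INDEX      # key -> position in JOURNAL of the latest value

# journal_put KEY VALUE: append the value and point the key at the new entry
journal_put() {
    JOURNAL+=("$2")
    INDEX["$1"]=$(( ${#JOURNAL[@]} - 1 ))
}

# journal_get KEY: follow the index to the latest value; fail if key is unknown
journal_get() {
    local pos="${INDEX[$1]:-}"
    if [ -z "$pos" ]; then
        return 1
    fi
    echo "${JOURNAL[$pos]}"
}

# journal_stale_count: entries that no key points to any more
journal_stale_count() {
    echo $(( ${#JOURNAL[@]} - ${#INDEX[@]} ))
}
```

      Writing a key twice leaves the first value in the journal with nothing pointing at it. That unreferenced entry is the kind of stale value that a real journal must periodically clean up.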


      Elastic Data includes both RAM and flash journal implementations, designed to work seamlessly together. For example, if a RAM journal runs out of memory, a flash journal automatically accepts the overflow, allowing caches to expand far beyond the size of RAM.

      Elastic Data is configured by adding a RAM or flash journal scheme to a backing map scheme. A flashjournal-scheme element uses a flash journal manager to determine how to write data to disk. A ramjournal-scheme element is similar, but writes to memory and then expands to disk when memory is exhausted.

      To implement Elastic Data, add either a ramjournal-scheme element (for a RAM journal) or a flashjournal-scheme element (for a flash journal) to the backing-map-scheme element.

      For example:

           <distributed-scheme>
             <scheme-name>journaled-scheme</scheme-name>
             <backing-map-scheme>
                 <ramjournal-scheme/>
             </backing-map-scheme>
             <autostart>true</autostart>
           </distributed-scheme>

      To add journaling to an existing configuration:

      Open a command prompt and change directory to the location where the Elastic Data OBE was unpackaged.

      cd /opt/ED.OBE

      In your favorite editor, open config/cache-configuration.xml.

      gedit config/cache-configuration.xml

      Find the distributed-scheme element, which resembles:

        <distributed-scheme>
           <scheme-name>example-scheme</scheme-name>
           <service-name>DistributedCache</service-name>
           <backing-map-scheme>
               <local-scheme>
                   <scheme-ref>example-binary-backing-map</scheme-ref>
               </local-scheme>
           </backing-map-scheme>
           <autostart>true</autostart>
        </distributed-scheme>

      Locate the backing-map-scheme element and delete the <local-scheme>...</local-scheme> element and all its nested contents. Normally, you might consider copying the existing scheme rather than changing it.

      Note: In the example provided the scheme being used is a modified version of the example distributed scheme packaged in coherence.jar.
      Inside the backing-map-scheme element, add a ramjournal-scheme element.

      The new element should resemble:
      <backing-map-scheme>
          <ramjournal-scheme/>
      </backing-map-scheme>


      The completed element should now resemble:
        <distributed-scheme>
          <scheme-name>example-scheme</scheme-name>
          <service-name>DistributedCache</service-name>
          <backing-map-scheme>
              <ramjournal-scheme/>
          </backing-map-scheme>
          <autostart>true</autostart>
        </distributed-scheme>

      Save your changes and exit the editor.
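
      As a quick check that the edit took effect, the file can be searched for the journal element. This is a minimal sketch under stated assumptions: backing_map_type is a hypothetical helper, and it performs a plain text search rather than parsing the XML, so it would also match an element inside a comment.

```shell
#!/bin/bash
# Sketch only: report which journal scheme, if any, appears in a cache
# configuration file. Plain text search; does not parse or validate the XML.
# backing_map_type is a hypothetical helper, not an OBE script.
backing_map_type() {
    local file="$1"
    if grep -q '<ramjournal-scheme' "$file"; then
        echo "ramjournal"
    elif grep -q '<flashjournal-scheme' "$file"; then
        echo "flashjournal"
    else
        echo "other"
    fi
}
```

      For this tutorial, backing_map_type config/cache-configuration.xml should print ramjournal once the change has been saved.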

    Configure the environment

      A variety of scripts have been provided to simplify the testing of Elastic Data.  These scripts depend on certain environment variable settings.
      Appropriate defaults have been provided for these variables, which if correct need not be changed.

      The scripts assume:
      • Coherence has been installed at /opt/coherence
      • The OBE source has been unzipped to /opt/ED.OBE.

      If either of these values is incorrect, follow the instructions below.

      In a command prompt window, ensure that you are in the directory where the OBE source was unpacked.

      cd /opt/{obe source directory}

      In your favorite editor, open the bin/set-env.sh script.

      gedit bin/set-env.sh

      Near the top of the script are two export statements specifying the locations of both the OBE source and Coherence.
      Un-comment the two variables and set them to the appropriate values for your environment.

      Note that if the default values of /opt/coherence and /opt/ED.OBE are used, no changes are required.

      #!/bin/bash

      export OBE_HOME=/opt/my.ed.obe.location
      export COH_HOME=/opt/my.coherence.location

      Save your changes and exit the editor.
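
      Before running the other scripts, it can help to confirm that both variables point to real directories. The following is a minimal sketch, assuming the variables have been set in the current shell. check_obe_env is a hypothetical helper and is not one of the provided OBE scripts.

```shell
#!/bin/bash
# Sketch only: verify that OBE_HOME and COH_HOME are set and refer to
# existing directories. check_obe_env is a hypothetical helper.
check_obe_env() {
    local status=0 name value
    for name in OBE_HOME COH_HOME; do
        value="${!name:-}"               # indirect expansion: value of the named variable
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name does not point to a directory: $value" >&2
            status=1
        fi
    done
    return "$status"
}
```

      The function returns a non-zero status and prints a message for each variable that is missing or incorrect.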

    Test the RAM journal

      A start server script has been provided, bin/start-server.sh, which can be used to start an instance of Coherence. To start a Coherence server, ensure that you are in the root of the OBE install.

      $ pwd

      /opt/ED.OBE

      Start the server instance in the background.

      $ bin/start-server.sh > server.log 2>&1 &

      This command starts the server instance in the background, redirecting both standard out and standard error to the same log file.

      Track the server start process.
      Using a command similar to tail -f {filename}, track the starting of the background Coherence instance.

      $ tail -f server.log

      This command displays the last few lines of the server log as it changes. Wait until the log shows Started DefaultCacheServer, then exit the tail command (Ctrl+C) and continue.
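
      The same wait can be scripted instead of watched by hand. The sketch below is an assumption-laden alternative to tail -f, not part of the OBE: wait_for_start is a hypothetical helper that polls the log once per second for the Started DefaultCacheServer message and gives up after a timeout given in seconds.

```shell
#!/bin/bash
# Sketch only: poll a log file until the server start message appears.
# wait_for_start LOGFILE [TIMEOUT_SECONDS] is a hypothetical helper.
wait_for_start() {
    local log="$1" timeout="${2:-60}" waited=0
    while true; do
        if grep -q 'Started DefaultCacheServer' "$log" 2>/dev/null; then
            return 0                     # message found: server has started
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $log" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

      With the log file used in this tutorial, the call would be wait_for_start server.log 120.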



      A simulation is provided which performs operations on several caches, adding and updating cache entities. This simulation can be used to put the newly created Elastic Data configuration into action. To manage the simulation, the bin/start-sim.sh and bin/stop-sim.sh scripts have been provided. These scripts make use of the code provided in the src/java directory. Although beyond the scope of this tutorial, interested students are invited to examine the sources.

      Start the simulation by executing the bin/start-sim.sh command:

      $ bin/start-sim.sh

      This command starts the simulation and displays the process IDs of the simulation processes.

      Note that three processes are started, each routing its output to sim.logs/sim*.log.
      Additionally, the simulations execute based on definition files provided in the sim/ directory which control duration, number of objects generated, key prefixes and the like. By default, the simulations start, generate 50,000 objects each, and run for 2 minutes before exiting.

      Examine the result of accessing the newly defined Elastic Data cache.
      Many aspects of Elastic Data, controlled by Journal Managers, can be overridden. Most common is the directory Coherence uses for Elastic Data journal files. However, if no override is provided, Coherence uses a default. The default is defined within the core Coherence operational configuration and logged as the server runs.

      In a command window, enter the following grep command:

      grep Manager server.log


      Examine the result. You will see several messages referring to default values, which we will override in the next section. 
      Of particular note is the directory used by the flash journal. Note that the directory is randomly generated.

      Before continuing, we need to stop the existing Coherence instance.
      In the command window where the original server instance was started,
      use the jobs command to determine the job ID of the running instance and use the kill command to stop the process.


      Note that the kill command did not specify a specific signal such as -9. Without a signal, the kill command signals the process to exit, but allows for cleanup. In this particular case, Coherence clears any created journal files before exiting.
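
      A graceful stop of this kind can be sketched as follows. stop_gracefully is a hypothetical helper, not an OBE script. It sends the default signal (TERM) rather than -9 and then waits for the process to finish, assuming the process was started from the same shell.

```shell
#!/bin/bash
# Sketch only: stop a background process with the default TERM signal so it
# can clean up, then confirm it has exited.
# stop_gracefully PID is a hypothetical helper.
stop_gracefully() {
    local pid="$1"
    kill "$pid" 2>/dev/null || return 1  # no signal given, so TERM is sent
    wait "$pid" 2>/dev/null || true      # wait for exit if it is a child of this shell
    ! kill -0 "$pid" 2>/dev/null         # succeed only if the process is gone
}
```

      Interactively, the equivalent is to run jobs to list the background jobs and then kill %1, where 1 is the job ID reported for the server.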

Testing Oracle Coherence Elastic Data

    To truly test elastic data, we must be able to control aspects of journaling behavior including directory, write size, journal file size, and similar aspects.

    In the following sections, we configure and test the flash journal manager settings, specified via a Coherence operational override, which control journaling behavior.

    Configure the Journal Manager

      Journal managers control a variety of aspects of Elastic Data. Some of the most common elements of journaling-config include:
      • directory: The location used by Elastic Data to store journal files.
      • maximum-value-size: The maximum size, in bytes, of binary values that are to be stored in the flash journal. The value cannot exceed 64 MB. Default = 64 MB.
      • block-size: The size of the write buffers used for underlying writes to the disk. It should be a multiple of the disk's physical block size to optimize writes, and must be a power of two between 4 KB and 1 MB.
      • maximum-file-size: The maximum size of journal files, between 1 MB and 4 GB. To ensure performance, it should be a power of two and a multiple of the block size.

      RAM journal managers offer similar but more limited options, including maximum-value-size and maximum-size. None of the file-based options apply to RAM journals.
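
      The size rules for the flash journal can be checked before editing the override file. The sketch below is illustrative only: the function names are hypothetical, sizes are given in bytes, the limits are those listed above, and sizes are assumed to be powers of two.

```shell
#!/bin/bash
# Sketch only: sanity-check flash journal sizes (in bytes).
# Function names are hypothetical; limits follow the list in this tutorial.
KB=1024
MB=$((1024 * KB))
GB=$((1024 * MB))

# True when N is a positive power of two
is_power_of_two() {
    local n="$1"
    (( n > 0 && (n & (n - 1)) == 0 ))
}

# block-size: power of two between 4 KB and 1 MB
valid_block_size() {
    local n="$1"
    is_power_of_two "$n" && (( n >= 4 * KB && n <= MB ))
}

# maximum-file-size: power of two between 1 MB and 4 GB, multiple of block size
valid_max_file_size() {
    local n="$1" block="$2"
    is_power_of_two "$n" && (( n >= MB && n <= 4 * GB && n % block == 0 ))
}
```

      For example, valid_block_size $((256 * KB)) and valid_max_file_size $((2 * MB)) $((256 * KB)) both succeed, while a 3 MB file size is rejected because it is not a power of two.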

      In the command window at the root of the OBE source code, copy the partially configured override file from the src/config directory to the config directory.

      $ cp src/config/coherence-override-dev.xml config/

      Note that the start scripts are already configured to use this override file, if found.

      In your favorite editor, open the override file and find the "Enter journaling configuration here" comment.

      gedit config/coherence-override-dev.xml

      Beneath the comment, add a journaling-config element, which contains a flashjournal-manager element.
      The flashjournal-manager element and its counterpart, the ramjournal-manager element, contain override values for the different journal types.

      The new element should resemble:
      <!--
        Enter journaling configuration here
      -->
      <journaling-config>
          <flashjournal-manager>
          </flashjournal-manager>
      </journaling-config>

      Flash journal elements include a variety of overrides.  Enter the following overrides in the order shown:

      Note that if the OBE is not installed in /opt/ED.OBE, ensure that the directory element is correct.

      <maximum-value-size>64KB</maximum-value-size>
      <block-size>256KB</block-size>
      <maximum-file-size>2MB</maximum-file-size>
      <directory>/opt/ED.OBE/data</directory>
      <async-limit>1MB</async-limit>


      These elements define:
      • The maximum size of an element allowed in the journal
      • The write block size
      • The size of each journal file, made artificially small for example purposes
      • The directory to store journal files into
      • The asynchronous write size, set artificially small to force regular writes

      The completed element should resemble:
      <journaling-config>
          <flashjournal-manager>
              <maximum-value-size>64KB</maximum-value-size>
              <block-size>256KB</block-size>
              <maximum-file-size>2MB</maximum-file-size>
              <directory>/opt/ED.OBE/data</directory>
              <async-limit>1MB</async-limit>
          </flashjournal-manager>
      </journaling-config>

      Save your changes and exit the editor.

    Start a Coherence instance

      Start a Coherence server using a command similar to:

      $ bin/start-server.sh > server.log 2>&1 &

      Using tail, confirm that the server has started.

      $ tail -f server.log

      As before, watch for Started DefaultCacheServer in the tail output, which signals that the server has completed initialization. Then exit the tail command and continue.

      Examine the data directory.
      Use the ls command to examine the contents of the data directory. Note that no files exist in this directory.

      ls -lha /opt/ED.OBE/data


      Until the simulation starts and objects are added to the cache, there is no need for Coherence to spill objects over.

    Start a simulation and examine the results

      The simulation is designed to run multiple processes each of which writes a pre-configured number of entries to one of several named caches.
      This behavior will cause Coherence Elastic Data to quickly exhaust the small amount of memory allocated to the Coherence Server (256 MB).

      Start the simulation using a command similar to:

      bin/start-sim.sh

      The simulation will cause a variety of activity on several different caches. 

      Examine activity in the data subdirectory using the ls command or the bin/monitor-data.sh script.

      Note that the monitor script assumes you are using the previously named directory  (/opt/ED.OBE/data), executes ls -lha, waits 1 second and repeats, for a total of 90 seconds.

      $ bin/monitor-data.sh

      The monitor script will produce an output similar to:
      . . .
      -rw-r--r-- 1 oracle staff 2.0M Jul 16 13:09 coh8739135425693253626.tmp
      . . .
      -rw-r--r-- 1 oracle staff 2.0M Jul 16 13:10 coh956962025706288357.tmp
      . . .
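
      The monitoring behavior described above can be approximated with a short loop. This is a sketch based only on that description (list the directory, wait, repeat) and is not the contents of bin/monitor-data.sh. monitor_dir and its parameters are hypothetical.

```shell
#!/bin/bash
# Sketch only: list a directory repeatedly, pausing between listings.
# monitor_dir DIR [COUNT] [INTERVAL_SECONDS] is a hypothetical helper;
# the defaults mirror the 90 one-second iterations described in the text.
monitor_dir() {
    local dir="$1" count="${2:-90}" interval="${3:-1}" i
    for ((i = 0; i < count; i++)); do
        ls -lha "$dir"                   # show current journal files and sizes
        sleep "$interval"
    done
}
```

      For example, monitor_dir /opt/ED.OBE/data 90 1 lists the data directory once per second for 90 iterations.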
      In a command window, use a command similar to the one below to determine the IDs of the running jobs.

      jobs

      Identify the relevant job IDs and use the kill command to shut down each process.

      kill %1

      Experiment

      Different settings in the override file cause Coherence journaling to act differently. For this example, the amount of memory associated with the Coherence instance was set very low, at 256 MB. Additionally, the maximum file size for journal files was set very low. Experiment with the journal settings. Change the data directory to one which does not exist. What happens? Change the file size to a larger or smaller value. What happens to the number of files?
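
      To compare runs, it helps to count the journal files after each change. The following is a minimal sketch. count_journal_files is a hypothetical helper, and the coh*.tmp pattern is an assumption taken from the file names in the sample monitor output.

```shell
#!/bin/bash
# Sketch only: count journal files in a directory.
# The coh*.tmp pattern is assumed from the sample listing in this tutorial.
# count_journal_files DIR is a hypothetical helper.
count_journal_files() {
    local dir="$1" n=0 f
    for f in "$dir"/coh*.tmp; do
        if [ -e "$f" ]; then             # skip the unexpanded pattern when nothing matches
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

      Running count_journal_files /opt/ED.OBE/data while the simulation is active, once for each maximum-file-size setting, gives a simple number to compare.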

Summary

    In this tutorial, you have learned how to:

    • Configure caches to use journaling via flash and RAM journals
    • Fine-tune journaling behavior using journal managers
    • Test elastic data using simulations

    Resources

    • The Oracle Coherence 3.7.1 documentation, which is found here
    • Various Oracle Coherence Development and Administration classes and advanced training that can be found here
    • To learn more about Oracle Coherence, see additional OBEs in the Oracle Learning Library

    Credits

    • Lead Curriculum Developer: Al Saganich
    • Other Contributors: Jason Howes, Tim Middleton, Noah Arliss, and others
