Coherence Queries, Indexes and Explain Plans

Overview

Purpose

Prior to Coherence 3.6 developers were required to perform queries against caches using a filtering mechanism which, while concise and efficient, required a fair amount of developer effort to code correctly. Coherence 3.6, introduced the Coherence Query Language or CohQL, a lightweight SQL like language for performing select type operations on caches. Coherence 3.7 introduced additional functionality to support selections but focused on performance. The purpose of this tutorial is to examine and exercise:

Filters - How can we select a specific set of data from a cache based on a given criteria?
Coherence Query Language - How can we simplify Filters to use a select like syntax
Explain Plan - How can we estimate the cost of a filter or select?
Trace - How can we evaluate the actual cost of a filter or select?

Time to Complete

Approximately 45 minutes

Introduction

The Coherence Query Language or CohQL allows developers to create complex SQL like queries against cached data. Queries are performed in one of two ways:

Using Filters - Filters, based on the com.tangosol.util.Filter class, allow the development of queries using combinations of value extractors and combinations of and, or, not, like and similar filters to perform queries. Filters are then executed against a cache and return a result set.
Using Queries - Queries use and SQL like syntax and the import com.tangosol.util.QueryHelper class to create a Filter instance. Behind the scenes QueryHelper creates a set of Filters which match the provided select statement.

indexes

Prior to Coherence 3.7.1 there was no way to determine the performance of a query other then timing it in code or by hand, neither of which being optimal or even convenient. In addition it was impossible to determine which portions of the query had the most impact. For example did a specific and filter have more impact on the query performance then a like filter? Would an index on one of the fields changed the performance?

Coherence 3.7.1 introduced two new features to support static and dynamic evaluation of query performance. These two features are:

Explain plans - a static analysis of a query producing an estimated cost for the query
trace - a dynamic analysis of the query, producing an actual effectiveness rating for a query.

Using explain plans and trace statements developers can examine how queries are executed, in what order clauses are executed and the effect of a given index. Both of these features can be executed from both the command line, or from within an applications code. Providing an easy to use robust mechanism for examining the performance of a given query.

The purpose of this tutorial is to:

Use basic Filter and QueryHelper classes to query a cache.
Estimate the performance of a query using explain plans
Evaluate performance of a query and the impact of filters
Explore queries using the Coherence query application

Hardware and Software Requirements

The following is a list of hardware and software requirements:

Coherence 3.7.1, which can be downloaded here.
Coherence CQL sample code, which can be downloaded here.

Prerequisites

Before starting this tutorial, you should:

Have a basic working knowledge of Linux commands and command line syntax
Have access to and have already installed Coherence 3.7.1
Have downloaded and unpacked the OBE sample code. This code is designed to run on Linux and similar operating systems but can be adapted to Windows if required.
Have downloaded and installed an Oracle Java JDK version 1.6.0_26 or later. The Oracle JDK kits and associated products can be found here.
The installation of Java is outside the scope of this OBE. For more information, see the JDK download and installation pages.
Have a working knowledge of Java annotations.

Before running this tutorial, you should download and install Oracle Coherence 3.7.1 and the associated example source code. The following steps review the installation process of Coherence 3.7.1, as well as the expected installation locations. Note that if you have previously installed Coherence 3.7.1 you may skip this step.

Using the link above, accept the license agreement and download Coherence 3.7.1.
Note that for this tutorial, it is assumed that files are downloaded to/tmp/. and installed into /opt.

Open a command prompt and change directory to the location where Oracle Coherence will be installed.

Unzip the contents of Coherence zip using a command that is similar to:

unzip /tmp/coherence-java-3.7.1...zip

Note that the exact name of the file may vary.

For this tutorial, all files will be unzipped to /opt/coherence.

Unzip the contents of the OBE example code using a command similar to:
unzip /tmp/CQL.OBE.src.zip

Query a Coherence Cache using Filters and Selects

Filters

Coherence uses a filter mechanism to query cache data. Filters are similar to query statements in that a collection of operations, like, equals, greater, less and others, are combined with a set of logical operations to return a specific set of data in a cache. Filters are then used with the NamedCache.entrySet method to return results.

Examples of Filters might include (all of which can be found in the com.tangosol.util.filter.* package):

AndFilter - Returns the logical "and" of two other filters.
OrFilter - Returns the logical "or" of two other filters.
LikeFilter - Compares a value for pattern match. A pattern can include regular characters and wildcard characters '_' and '%'.
EqualsFilter - Compares a value with a constant for equality

Person

Filter greaterThen = new GreaterFilter( new ReflectionExtractor("getAge"), 16);
Filter lessThen = new LessFilter(new ReflectionExtractor("getAge"), 50));

Filter and = new AndFilter( greaterThen,lessThen);

Example

NamedCache cache = . . .;

cache.entrySet(filter,null);

null

A note on field/data access in objects

ReflectionExtractor

getFirstName

ReflectionExtractor("getAge")

Extractors package

Configure the environment

Coherence has been installed into /opt/coherence
The OBE source has been unzipped to /opt/CQL.OBE.

In a command prompt window ensure that you are in the directory where the OBE source was unpacked.

cd /opt/{obe source directory}

In your favorite editor open the bin/set-env.sh script.

gedit bin/set-env.sh

Near the top are two export statements specifying the locations of both the OBE source and Coherence.
Un-comment th two variables and set them to the appropriate values for your environment.

Note that if the default values of /opt/coherence and /opt/CQL.OBE are used no changes are required.

#!/bin/bash

export OBE_HOME=/opt/my.cql.obe.location
export COH_HOME=/opt/my.coherence.location

Save your changes, and exit the editor

Modify a class to support a Filter.

example.entity.Person.java - a representation of a person, defining characteristics of such as name, sex, age, and address
example.entity.Address.java - a representation of an address, including street, city and state

Serializable

Person

Comparable

example.test.RunQuery.java - This class provides some boiler plate code for accessing a cache and will be modified over the course of this tutorial.
example.simulation.Driver and Simulation- these two classes are used to simulation cache interaction and add, modify and delete data as they run from the 'dist-person' cache.

Open a command prompt and chance directory where the OBE source was unpacked.

Using gedit open the src/java/example/RunQuery.java class

gedit src/java/example/RunQuery.java class

Note that a completed version of this section can be found in: {OBE_HOME}/completed/1.RunQuery.java.

Find the getFilter method which resembles:

public Filter getFilter() {
    /*
     * Enter filter here
     */
    return null;
}

The filter is constructed of two numeric Filters, and an all filter which together result in the statement described above.
Directly below the comment enter greater and less filters resembling:

Filter lower = new GreaterFilter(new ReflectionExtractor("getAge"), 16);
Filter upper = new LessFilter(new ReflectionExtractor("getAge"), 50);

The partially completed code should resemble:

**Note** that the required imports have been added for you. All filters can be found in the com.tangosol.util.filter.* package.

Beneath the second filter statement add an and statement, combining the age limits. The new statement should resemble:

Filter and = new AndFilter(lower,upper);

Beneath the and filter, add a statement to filter by name. The statement should resemble:

Filter name = new LikeFilter("getFirstName", "Al%");

An AllFilter can be used to combine the entire result into a single statement. Add an AllFilter and return the final result from the method. The statements should resemble:

Filter all = new AllFilter(new Filter[] { and,name });
return all;

Save your changes and exit the editor.

Build and test the Filters

bin/build.sh- Recompiles and repackages all source code.
bin/start-sim.sh-Starts a simulation which generates and adds Person objects to a cache, then randomly changes the objects over time.
bin/stop-sim.sh- Stops the simulation scripts.
bin/check-sim.sh-Reports whether the PIDs associated with simulation processes are running..
bin/runquery.sh -Runs the query test application.

In the command window rebuild the sources using the provided build script. Correct any compile errors.
bin/build.sh
The command should produce output similar to:

Start the simulation using a command similar to:

bin/start-sim.sh
The command should produce output similar to:

Give the simulation at least one minute or more to generate data and then examine the output of the started simulation for errors using a command similar to:
tail -f sim.0.log
The output should resemble that shown below. Note that the simulation will randomly generate Person objects so the exact content will vary.

Run the updated query application.

bin/runquery.sh > query.log 2>&1

If the query has been correctly specified the command should produce output similar to those shown below. Use less, gedit or a similar tool to examine the results:

Modify query to use CohQL.

NamedCache

select - select {field set} from "{cache}" where {where clause}.
QueryHelper - com.tangosol.util.QueryHelper, a class which provides a set of factory methods which can be used to generate instances of Filter objects.
For more information on QueryHelper see here.

Open the query application.

gedit src/java/example/RunQuery.java class

Note that a completed version of this section can be found in:

completed/2.RunQuery.java

Find the previously updated getFilter method and delete the contents.

Create a where clause statement to search for any Person objects with first name starting with Al and age > 16 and < 50.

The updated method should resemble:

public Filter getFilter() {
String where = "firstName like 'Al%' and (age > 16 and age < 50)";

}

Complete the method by adding statement to return a filter based on the where clause using QueryHelper.
Note that the appropriate import com.tangosol.util.QueryHelper has been added for you.

Filter filter = QueryHelper.createFilter(where); return filter;

The completed method should resemble:

Save your changes and exit the editor

In the command window rebuild the sources using the provided build script.

bin/build.sh

Correct any compiler errors if required.

**[optional]** Confirm that the simulation is still running.
The simulation uses a set of configuration files which are define how the simulation runs, including how long to run.
By default the simulation runs for 60 minutes. You can confirm that the simulation is running using the check-sim.sh script which examines the set of know PIDs for the simulation and checks that they are executing. To execute the script enter:

bin/check-sim.sh

Which will produce output similar to:

If the simulations have completed the command will produce output similar to:

In which case simply restart the simulation using the **bin/start-sim.sh** command

Run the query application.

bin/runquery.sh > query.log 2>&1

If the query has been correctly specified the command should produce results virtually identical to the past section.
Remember that data is generated and constantly changing so the results may not match exactly.

Working with Explain plan and Trace Statements

Explain Plan and Trace

Applications can profile queries using the com.tangosol.util.aggregator.QueryRecorder class, documentation for which can be found here.

The process works by creating a QueryRecorder instance and then is submitted to a cache for execution using the NamedCache.aggregate(filter,agent) method.

The method examines the provided filter on all members of the cluster and returns a QueryResult record, which can be examined to understand how the query performed.

Example

QueryRecorder.RecordType

Imports

import com.tangosol.util.aggregator.QueryRecorder;
import static com.tangosol.util.aggregator.QueryRecorder.RecordType;

static

RecordType

Create and execute the recorder:

QueryRecorder agent = new QueryRecorder(RecordType.EXPLAIN);
Object resultsExplain = cache.aggregate(filter, agent);

Output the results:

System.out.println("\n" + resultsExplain + "\n");

Examine and evaluate an Explain Plan:

Name                                  Index        Cost      
======================================================================================
com.tangosol.util.filter.AndFilter  | ----       | 0         
  LikeFilter(.getFirstName(), Al%)  | 0          | 2300000         
  com.tangosol.util.filter.AndFilte | ----       | 0         
    GreaterFilter(.getAge(), 16)    | 1          | 2300000         
    LessFilter(.getAge(), 50)       | 1          | 2300000         


Index Lookups
Index  Description                               Extractor             Ordered   
======================================================================================
0      No index found                            .getFirstName()       false
1      No index found                            .getAge()             false

Name – This column shows the name of each filter in the query. Composite filters show information for each of the filters within the composite filter.
Index – This column shows whether or not an index can be used with the given filter. If an index is found, the number shown corresponds to the index number on the Index Lookups table shown beneath the cost table.
Cost – This column shows an estimated cost of applying the filter.
If an index can be used, the cost is given as 1 becayse the operation of applying the index requires just a single access to the index content.

In the example, there are 3 storage-enabled cluster members and thus the cost reflects accessing the index on each of the three members. If no index exists, the cost is calculated as EVAL_COST * number of keys where EVAL_COST value is a constant value= 1000. Each row is intended to show the relative cost of doing a full scan to reduce the key set using the filter.

In the example, there are ~1500 cache entries which need to be evaluated. Querying indexed entries is always relatively inexpensive as compared to non-indexed entries but does not necessarily guarantee effectiveness.

Examine and evaluate a Query Trace:

Trace differs from Explain Plans in that the query is run and timed for the provided dataset and each index evaluated for a relative effectiveness.

RecordType.TRACE

RecordType.EXPLAIN

QueryRecorder agent = new QueryRecorder(RecordType.TRACE);

Trace
Name                                  Index        Effectiveness          Duration  
======================================================================================
com.tangosol.util.filter.AndFilter  | ----       | 12852|1386(89%)      | 134       
  LikeFilter(.getFirstName(), Al%)  | 0          | 12852|2142(83%)      | 85        
  com.tangosol.util.filter.AndFilte | ----       | 2142|1386(35%)       | 42        
    GreaterFilter(.getAge(), 16)    | 1          | 2142|2058(3%)        | 17        
    LessFilter(.getAge(), 50)       | 1          | 2058|1386(32%)       | 24        


Index Lookups
Index  Description                               Extractor             Ordered   
======================================================================================
0      SimpleMapIndex: Extractor=.getFirstName(  .getFirstName()       true
1      SimpleMapIndex: Extractor=.getAge(), Ord  .getAge()             true

++++++++

Name – This column shows the name of each filter in the query. Composite filters show information for each of the filters within the composite filter.
Index – This column shows whether or not an index can be used with the given filter. If an index is found, the number shown corresponds to the index number on the Index Lookups table.
Effectiveness – This column shows the amount a key set was actually reduced as a result of each filter. The value is given as prefilter key set size | postfilter key set_size and is also presented as a percentage. The prefilter key set size value represents the key set size prior to evaluating the filter or applying an index. The post filter key set size value represents the size of the key set remaining after evaluating the filter or applying an index. For a composite filter entry, the value is the overall results for its contained filters. Once a key set size can no longer be reduced based on an index, the resulting key set is deserialized and any non index filters are applied.
Duration – This column shows the number of milliseconds spent evaluating the filter or applying an index. A value of 0 indicates that the time registered was below the reporting threshold.

Using Explain Plans

RunQuery.java

Open the query application.

gedit src/java/example/RunQuery.java class

Note that a completed version of this section can be found in:

completed/3.RunQuery.java

Find the executeQuery method which resembles:

/**
* Execute the provided query against the named cache.
*/
public void executeQuery() {

    dumpReturnedEntries(getNamedCache(), getFilter());

    /*
     * Enter trace and/or explain plan code here.
     */
}

Immediately following the comment add to obtain an instance of the cache and filter used in the query.

Note: The getNamedCache() and getFilter() were provided/previously created.

The additional code should resemble:

NamedCache cache = getNamedCache();
Filter filter = getFilter();

Add code to obtain an instance of a query recorder and execute it,

The additional code should resemble:

QueryRecorder agent = new QueryRecorder(RecordType.EXPLAIN);
Object results = cache.aggregate(filter, agent);

Note: The required imports have been added for you and are shown below.

import com.tangosol.util.aggregator.QueryRecorder;
import static com.tangosol.util.aggregator.QueryRecorder.RecordType;

Lastly add code to print the results: The statement should resemble:

System.out.println("\n" + results + " \n");

The completed method should resemble:

Build and execute the updates.
Check that the sim is running

bin/check-sim.sh

Build the code.

bin/build.sh

Correct any compiler errors and run the query.

bin/runquery.sh > query.log 2>&1

Using gedit or less examine the output, search for the **Explain Plan**.

The result should resemble:

Add indices and examine results

Open the query application.

gedit src/java/example/RunQuery.java class

Note that a completed version of this section can be found in:

completed/4.RunQuery.java

Find the executeQuery method and inside it, before the QueryRecorder add code to index the Age and FirstName fields.
The new code should resemble:

cache.addIndex(new ReflectionExtractor("getAge"), true, null);
cache.addIndex(new ReflectionExtractor("getFirstName"), true, null);

**[optional]** Add code to remove the indexes.

Following the System.out line, add code to remove the indexes, the remove index code should resemble:

cache.removeindex(new="" reflectionextractor("getfirstname"));
cache.removeIndex(new ReflectionExtractor("getAge"));
The updated method should resemble:

Save the changes and exit the editor.

Build and execute the updates
Check that the sim is running

bin/check-sim.sh

Build the code.

bin/build.sh

Correct any compiler errors and run the query.

bin/runquery.sh > query.log 2>&1

Examine the results.

Using gedit or less examine the output, search for the **Explain Plan**. Note that the cost of the query has dropped dramatically.

The result should resemble:

Using Trace vs Explain Plan

RunQuery.java

Open the query application.

gedit src/java/example/RunQuery.java class

Note that a completed version of this section can be found in:

completed/5.RunQuery.java

Find the executeQuery method and change the query recorder to type RecordType.TRACE
When complete the method should resemble:

QueryRecorder agent = new QueryRecorder(RecordType.TRACE);

Save your changes and exit the editor.

Build and execute the updates.
Check that the sim is running, ensure that the sim has been running for a few minutes.

bin/check-sim.sh

Build the code.

bin/build.sh

Correct any compiler errors and run the query.

bin/runquery.sh > query.log 2>&1

Examine the output of the TRACE, an example of which is shown below.

**Question**: How effective is the use of the age index?

[Optional] Remove the age index. Re-build and re-run the query. What was the result?

Using the Coherence Query Command Line Tool

com.tangosol.coherence.dslquery.QueryPlus

QueryPlus

select - select from a cache
insert - insert an element into a cache
delete - delete an element from a cache
create index - create an index on a cache
drop index - Remove an index from a cache
And others

here

In the command prompt start the query plus application using the provided bin/querytool.sh

bin/querytool.sh
**Note:** Use the bin/check-sim.sh script if needed to ensure the simulation is running
You should be presented with a **CohQL>** prompt similar to that shown below:

**Note:**Command line editing is enabled, so you can use the arrow keys to retrieve or edit prior commands.

At the CohQL> prompt enter:

select * from "dist-person" [CR]

Note: [CR] represents a carriage return and is not to be entered.

A set of Person objects will be returned.

Select only first name, last name and age from the cache

select FirstName, LastName, Age from "dist-person"

Add a where clause for age > 15 and < 50 and name like Al

select FirstName,LastName,Age from "dist-person" where age > 15 and age < 50 and FirstName like "Al%"

Using the trace command trace the execution

trace select FirstName,LastName,Age from "dist-person" where age > 15 and age < 50 and FirstName like "Al%"

Using the explain plan for command to examine the execution of the command

explain plan for select FirstName,LastName,Age from "dist-person" where age > 15 and age < 50 and FirstName like "Al%"

Using the create index add an index on first name

create index "dist-person" FirstName;

Rerun the trace and explain plan for commands and evaluate the addition of the index

trace select FirstName,LastName,Age from "dist-person" where age > 15 and age < 50 and FirstName like "Al%"

explain plan for select FirstName,LastName,Age from "dist-person" where age > 15 and age < 50 and FirstName like "Al%

Using the drop index drop theindex on first name

drop index "dist-person" FirstName

Exit Query Plus

quit

Summary

In this tutorial, you have learned:

Use basic Filter and QueryHelper classes to query a cache.
Estimate the performance of a query using Explain Plans
Evaluate performance of a query and the impact of filters
Explore queries using the Coherence query application

Resources

The Oracle Coherence 3.7.1 documentation, which is found here
Various Oracle Coherence Development and Administration classes and advance training that can be found here
To learn more about Oracle Coherence, additional OBEs in the Oracle Learning Library

Credits

Lead Curriculum Developer: Al Saganich
Other Contributors: Jason Howes, Tim Middleton, Noah Arliss, and others

To help navigate this Oracle by Example, note the following:

Hiding Header Buttons:: Click the Title to hide the buttons in the header. To show the buttons again, simply click the Title again.
Topic List Button:: A list of all the topics. Click one of the topics to navigate to that section.
Expand/Collapse All Topics:: To show/hide all the detail for all the sections. By default, all topics are collapsed
Show/Hide All Images:: To show/hide all the screenshots. By default, all images are displayed.
Print:: To print the content. The content currently displayed or hidden will be printed.

To navigate to a particular section in this tutorial, select the topic from the list.