Originally published in the July/August 2014 issue of Java Magazine.

Introduction to the Java Temporary Caching API

by Johan Vos

Published July 2014

Use a caching strategy without worrying about implementation details.

Most enterprise and web applications use some method of caching. If an application receives a large number of requests, it is often beneficial to store some of the responses in a cache. Then, under certain conditions, the same response can be sent to answer similar requests. Rather than redoing all the processing and computations, returning the previously cached response can save lots of CPU resources.

A number of commercial solutions provide cache implementations to developers. Different solutions often come with different topologies, concepts, and features. Fortunately, the Java Temporary Caching API allows Java developers to use a common approach for working with caches, without having to worry about implementation details. This article introduces the use of the Java Temporary Caching API in Java applications.

Java specifications are defined by the Java Community Process (JCP) organization. For each specification, a Java Specification Request (JSR) is submitted. The development of Java specifications follows a number of well-defined steps, starting with the formation of an Expert Group and ultimately resulting in the availability of a specification, a Reference Implementation, and a test suite.

Sometimes, the transition between the formation of the Expert Group and the final approval of the specification happens very quickly. In other cases, it takes a bit longer. The Java Temporary Caching API, JSR 107, was considered an important area for standardization in 2001. However, it wasn’t until March 18, 2014, that the specification was finally released.

The good thing about the long time between the start of the standardization effort and the final release is that all the experience gathered by the Expert Group members and the community was taken into account in the specification. Although the terminology varies slightly among the different producers of cache software, there are enough common concepts that are now captured in JSR 107.

While caching is often linked to database calls—that is, the result of a database call is often stored in a cache—it should be stressed that caching concepts are much broader than database applications. Indeed, a cache can also be used to store images, responses from web services, and so on. The most generic representation of an entry that can be cached, as defined by JSR 107, is a Java object.

In this article, we will see how to use the Java Temporary Caching API with the Amazon DynamoDB service. However, the same concepts apply to most relational and nonrelational databases, as well as to any services that result in a Java object being produced.

Note: The source code for the sample application developed in this article can be downloaded here.

Amazon DynamoDB

Amazon DynamoDB is a cloud-based NoSQL data store provided by Amazon. An overview of the DynamoDB features is outside the scope of this article, and the Java Temporary Caching API can be demonstrated with other NoSQL or SQL data storage systems as well.

One of the nice things about DynamoDB is the availability of a local version that can easily be used by developers during testing. We will use this local version in our examples. DynamoDB Local can be downloaded by following these instructions.

The downloaded Java archive (JAR) file can easily be started using the command shown in Listing 1.

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -inMemory

Listing 1

This command will start the local version of DynamoDB and cause it to run in memory instead of using a database file. As a consequence, all data supplied to DynamoDB Local will be kept in memory and will be lost when the server is stopped. This is by no means how the cloud version of DynamoDB works. However, the APIs required for accessing the local version and the cloud version are the same. The cloud offering requires more configuration and, unlike the local version, is not a free service. Therefore, DynamoDB Local is very popular with software engineers during the code development process.

The Amazon Web Services SDK (AWS SDK) contains a number of APIs that allow developers to communicate with DynamoDB. In our examples, we will use the high-level API, which provides a direct mapping between Java objects and entries in DynamoDB. In order to do so, we will use a few annotations on the Java objects we want to store.

The Sample Application

While most of the applications that will leverage the Java Temporary Caching API are probably enterprise applications, the API itself is designed to work in a Java SE environment as well. For simplicity, we will write our examples on top of the Java SE 8 platform.

Our sample application will create a number of Person instances. The code for the Person class is shown in Listings 2a and 2b. The Person class is a typical Java class containing fields for the first name, the last name, and the age of a person. Additionally, we have a myKey field that will be used for storing a primary key.

package org.lodgon.dynamocache;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
import java.io.Serializable;

@DynamoDBTable(tableName = "Person")
public class Person implements Serializable {

    private long myKey;
    private String firstName;
    private String lastName;
    private int age;
    
    public Person() {}
    
    public Person (Long key, String f, String l, int a) {
        this.firstName = f;
        this.lastName = l;
        this.age = a;
        myKey = key;
    }     
    
    public void setMyKey (long l) {
        this.myKey = l;
    }
    
    @DynamoDBHashKey
    public long getMyKey() {
        return myKey;
    }

Listing 2a

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
    
}

Listing 2b

The annotation @DynamoDBTable(tableName = "Person") is used to associate the Person class with the “Person” table that we will create in DynamoDB. Further, the @DynamoDBHashKey annotation indicates that the annotated method (getMyKey) returns the hash key for the object in DynamoDB.
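To make the idea of annotation-driven mapping more concrete, the following standalone sketch shows how a library could read such annotations through reflection. It is not how the AWS SDK is implemented. The annotations SketchTable and SketchHashKey and the classes SketchPerson and AnnotationSketch are hypothetical names invented for this illustration, and only the JDK is used.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-ins for @DynamoDBTable and @DynamoDBHashKey.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SketchTable {
    String tableName();
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchHashKey {
}

@SketchTable(tableName = "Person")
class SketchPerson {
    private final long myKey;

    SketchPerson(long myKey) {
        this.myKey = myKey;
    }

    @SketchHashKey
    public long getMyKey() {
        return myKey;
    }
}

final class AnnotationSketch {

    // Returns the table name declared on the class, or null if the class is not annotated.
    static String tableNameOf(Class<?> type) {
        SketchTable table = type.getAnnotation(SketchTable.class);
        return table == null ? null : table.tableName();
    }

    // Invokes the getter marked as hash key and returns its (boxed) value.
    static Object hashKeyOf(Object instance) {
        for (Method m : instance.getClass().getMethods()) {
            if (m.isAnnotationPresent(SketchHashKey.class)) {
                try {
                    return m.invoke(instance);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("cannot read hash key", e);
                }
            }
        }
        throw new IllegalArgumentException("no hash key getter found");
    }

    public static void main(String[] args) {
        SketchPerson p = new SketchPerson(42L);
        System.out.println(tableNameOf(SketchPerson.class) + " / " + hashKeyOf(p));
    }
}
```

The point of the sketch is only that the mapping information lives on the class itself, so the code that stores an object does not need a separate mapping file.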

The examples we will create will perform three steps:

  1. Create a datastore.
  2. Populate the datastore.
  3. Query the datastore.

The main method for our sample application is very simple (see Listing 3).

public static void main(String[] args) {
    createDatabase();
    populateDatabase();
    queryDatabase();
}

Listing 3

In our first example, we will not use any caching. We will write all data directly to DynamoDB, and when querying data, we will directly query DynamoDB as well.

The createDatabase function will create the DynamoDB datastore. The code in Listing 4 demonstrates how a DynamoDB datastore is created. We won’t go into DynamoDB-specific details, but on a high level, the createDatabase call does the following:

  • Creates credentials (key and secret) for communicating with Amazon DynamoDB. For the DynamoDB Local version, these credentials don’t matter, although they have to be supplied.
  • Creates an AmazonDynamoDBClient and a DynamoDBMapper instance. We will use the DynamoDBMapper instance later.
  • Defines the key that will be used as the primary key for indexing the data.
  • Specifies the requested throughput (for read and write operations). Again, these values are irrelevant for the DynamoDB Local version, but they have to be provided.
  • Deletes the table “Person,” if it exists.
  • Creates the table “Person.”

private static void createDatabase() {
    // DynamoDB Local does not check the credentials, but they have to be supplied.
    AWSCredentials credentials = new BasicAWSCredentials("", "");
    client = new AmazonDynamoDBClient(credentials);
    client.setEndpoint("http://localhost:8000");
    mapper = new DynamoDBMapper(client);

    // The numeric attribute myKey is the hash (primary) key.
    List<AttributeDefinition> attributeDefinitions = new ArrayList<>();
    attributeDefinitions.add(new AttributeDefinition()
            .withAttributeName("myKey").withAttributeType("N"));
    List<KeySchemaElement> list = new LinkedList<>();
    list.add(new KeySchemaElement("myKey", KeyType.HASH));

    // Throughput values are required but irrelevant for DynamoDB Local.
    ProvisionedThroughput tp = new ProvisionedThroughput(1L, 1L);
    CreateTableRequest ctr = new CreateTableRequest(
            attributeDefinitions, "Person", list, tp);

    // Delete the table if it exists, then create it again.
    ListTablesResult listTables = client.listTables();
    if (listTables.getTableNames().contains("Person")) {
        client.deleteTable("Person");
    }
    client.createTable(ctr);
}

Listing 4

Now that a table has been created, and a DynamoDBMapper object has been constructed, we can populate the table.

We will create 1000 instances of Person with random content. An index ranging from 0 to 999 will be used to set the myKey field for these instances. The code in Listing 5 populates the table with these instances. Using the DynamoDB high-level API, storing an object is as simple as calling the save method on a DynamoDBMapper instance and providing the object that we want to store.

private static void populateDatabase() {
    Random age = new Random();
    // Keys run from 0 to 999, one Person per key.
    for (long i = 0; i < 1000; i++) {
        String f1 = UUID.randomUUID().toString();
        String f2 = UUID.randomUUID().toString();
        Person person = new Person(i, f1, f2, age.nextInt(100));
        mapper.save(person);
    }
}

Listing 5

We will now query DynamoDB to ask for Person instances with a specific key, as shown in Listing 6. We will make 1000 requests to the DynamoDBMapper. Each request asks the DynamoDBMapper to find the Person instance corresponding to a specific key. The key is a random value between 0 and 1099, whereas the keys of the stored Person instances range between 0 and 999. As a consequence, we expect about 9 percent of the requests (100 of the 1100 possible keys) to result in an empty answer.

private static void queryDatabase() {
    Random random = new Random();
    int found = 0;
    int notfound = 0;
    for (int i = 0; i < 1000; i++) {
        // Keys 1000..1099 do not exist in the table.
        int rnd = random.nextInt(1100);
        Person hashKeyValues = new Person();
        hashKeyValues.setMyKey(rnd);

        DynamoDBQueryExpression<Person> queryExpression =
                new DynamoDBQueryExpression<Person>()
                        .withHashKeyValues(hashKeyValues);

        List<Person> itemList = mapper.query(Person.class, queryExpression);
        if (itemList.isEmpty()) {
            notfound++;
        } else {
            found++;
        }
    }
}

Listing 6
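The expected share of empty answers follows directly from the key ranges: 100 of the 1100 possible keys have no stored Person, which is 1/11 of the requests. The sketch below checks this arithmetic and simulates the query loop without a database. The class MissRateSketch, its constants, and the fixed seed are assumptions made for this illustration; they are not part of the sample application.

```java
import java.util.Random;

final class MissRateSketch {

    static final int STORED_KEYS = 1000;   // keys 0..999 exist in the table
    static final int KEY_RANGE = 1100;     // queries draw keys from 0..1099

    // Exact fraction of the key range that has no stored Person.
    static double expectedMissRate() {
        return (KEY_RANGE - STORED_KEYS) / (double) KEY_RANGE;
    }

    // Mimics the loop of Listing 6 without a database:
    // a key counts as "not found" when it lies outside 0..999.
    static int simulateMisses(int requests, long seed) {
        Random random = new Random(seed);
        int notFound = 0;
        for (int i = 0; i < requests; i++) {
            if (random.nextInt(KEY_RANGE) >= STORED_KEYS) {
                notFound++;
            }
        }
        return notFound;
    }

    public static void main(String[] args) {
        System.out.printf("expected miss rate: %.3f%n", expectedMissRate());
        System.out.println("misses in 1000 simulated requests: "
                + simulateMisses(1000, 1L));
    }
}
```

With only 1000 requests the observed count will fluctuate around the expected value, so a single run should be read as an approximation.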

Introducing the Java Temporary Caching API

Until now, all requests for storing and retrieving data have gone directly to the DynamoDB system. At this point, we will introduce the Java Temporary Caching API. Before we look at the code, here are the main interfaces defined by JSR 107:

  • CachingProvider, which contains an implementation of the Java Temporary Caching API
  • CacheManager
  • Cache
  • Cache.Entry
  • ExpiryPolicy

We will now modify the code used to populate the datastore and introduce the caching features. The modified code is shown in Listing 7. Before we populate the datastore, we create a Cache instance. First, we have to obtain a reference to the CachingProvider. This is done by the code shown in Listing 8.

private static void populateDatabaseCache() {

    CachingProvider cachingProvider = Caching.getCachingProvider();
    CacheManager cacheManager = cachingProvider.getCacheManager();

    MutableConfiguration<Long, Person> config
            = new MutableConfiguration<Long, Person>()
            .setTypes(Long.class, Person.class)
            .setExpiryPolicyFactory(AccessedExpiryPolicy.factoryOf(ONE_HOUR))
            .setStatisticsEnabled(true);

    cache = cacheManager.createCache("personCache", config);

    Random age = new Random();
    DynamoDBMapper mapper = new DynamoDBMapper(client);
    for (long i = 0; i < 1000; i++) {
        String f1 = UUID.randomUUID().toString();
        String f2 = UUID.randomUUID().toString();
        Person person = new Person(i, f1, f2, age.nextInt(100));
        mapper.save(person);
        cache.put(person.getMyKey(), person);
    }
}

Listing 7

CachingProvider cachingProvider = Caching.getCachingProvider();

Listing 8

The Java Temporary Caching API is defined by JSR 107, but the API itself does not contain a concrete implementation. At runtime, the API will use the Java ServiceLoader to check whether a CachingProvider implementation is available on the classpath.

In our example, we use the Reference Implementation of JSR 107, which is available in Maven Central. The JAR file containing the Reference Implementation contains a META-INF/services directory containing a reference to the CachingProvider implementation.

If another JSR 107–compliant implementation is available on the classpath, that implementation will be used. Our code doesn’t have to change, though, because we are using only the APIs provided by the JSR 107 specification. We are not using implementation-specific APIs or features.
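The discovery mechanism described above can be sketched with the JDK's ServiceLoader alone. The interface SketchProvider and the class ProviderLookupSketch are hypothetical names used only here; SketchProvider stands in for javax.cache.spi.CachingProvider, and the sketch does not reproduce what Caching.getCachingProvider() does internally.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

// Hypothetical service interface standing in for javax.cache.spi.CachingProvider.
interface SketchProvider {
    String name();
}

final class ProviderLookupSketch {

    // Returns the first implementation that a JAR on the classpath registers
    // in a META-INF/services file named after the interface, if there is one.
    static Optional<SketchProvider> findProvider() {
        Iterator<SketchProvider> it =
                ServiceLoader.load(SketchProvider.class).iterator();
        return it.hasNext() ? Optional.of(it.next()) : Optional.empty();
    }

    public static void main(String[] args) {
        String result = findProvider()
                .map(SketchProvider::name)
                .orElse("no provider on the classpath");
        System.out.println(result);
    }
}
```

Because no JAR registers an implementation of SketchProvider, running the sketch as is finds nothing. Adding a JAR with a matching services file would change the outcome without any change to the calling code, which is the property the article relies on.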

Once we have a CachingProvider, we need to get a CacheManager. This is done using the code shown in Listing 9.

CacheManager cacheManager = cachingProvider.getCacheManager();

Listing 9

Now that we have a CacheManager instance, we can create a Cache instance. Our cache needs to hold instances of Person. A Cache contains entries that have a key and a value. In our case, it makes sense to use the myKey field of the Person as the key and to use the Person itself as the value. The code in Listing 10 creates our cache and provides the required configuration.

The configuration of the cache is defined in a MutableConfiguration instance. The MutableConfiguration class allows you to set a number of configuration options by using a fluent API: each set method returns the instance itself, so calls can be chained.
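As a minimal illustration of the fluent style, the sketch below defines a small configuration class whose setters return the instance itself. SketchConfig and FluentSketch are hypothetical classes written for this example; they only mimic the shape of MutableConfiguration and are not part of JSR 107.

```java
final class SketchConfig {
    private boolean statisticsEnabled;
    private boolean readThrough;
    private long expiryMillis;

    // Each setter returns 'this', which is what makes chaining possible.
    SketchConfig setStatisticsEnabled(boolean statisticsEnabled) {
        this.statisticsEnabled = statisticsEnabled;
        return this;
    }

    SketchConfig setReadThrough(boolean readThrough) {
        this.readThrough = readThrough;
        return this;
    }

    SketchConfig setExpiryMillis(long expiryMillis) {
        this.expiryMillis = expiryMillis;
        return this;
    }

    boolean isStatisticsEnabled() {
        return statisticsEnabled;
    }

    boolean isReadThrough() {
        return readThrough;
    }

    long getExpiryMillis() {
        return expiryMillis;
    }
}

final class FluentSketch {
    public static void main(String[] args) {
        // One expression configures the whole object.
        SketchConfig config = new SketchConfig()
                .setStatisticsEnabled(true)
                .setReadThrough(false)
                .setExpiryMillis(3_600_000L);
        System.out.println(config.isStatisticsEnabled() + " " + config.getExpiryMillis());
    }
}
```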

In our example, we set the following:

  • The setTypes statement in Listing 10 specifies that the key for our cache is of type Long, and the value is of type Person.
  • The setExpiryPolicyFactory statement defines how long cache entries are valid.
  • The setStatisticsEnabled statement turns on the collection of cache statistics.

MutableConfiguration<Long, Person> config
        = new MutableConfiguration<Long, Person>()
        .setTypes(Long.class, Person.class)
        .setExpiryPolicyFactory(AccessedExpiryPolicy.factoryOf(ONE_HOUR))
        .setStatisticsEnabled(true);

cache = cacheManager.createCache("personCache", config);

Listing 10

In order to create a Cache instance, we call the createCache method on the CacheManager instance and supply the name of the cache (personCache) and the configuration object. 

Apart from creating the cache, we add only one line of code. In the for loop where the Person instances are created, we add the following line:

cache.put(person.getMyKey(), person);

This line actually stores the Person instance in the cache. However, a typical issue with caches is that you never know for sure whether the data you added to the cache is still in the cache. In our configuration, we explicitly stated that a cache entry expires one hour after it was created or last accessed. However, the cache implementation might decide to remove entries from the cache sooner than that, for example, when the cache reaches a critical size. Determining the optimal cache size can be very complex and depends on the implementation architecture. Explicitly setting the maximum cache size often doesn’t make sense, because many internal and external parameters define the optimal cache size. Hence, JSR 107 does not provide a method for setting the maximum cache size.
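To show why a stored entry can disappear, the following sketch implements a very small access-based expiry scheme with plain JDK classes. AccessExpirySketch is a hypothetical class written for this illustration and is not the Reference Implementation; the injectable clock is an assumption that lets the behavior be observed without waiting an hour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

final class AccessExpirySketch<K, V> {

    private static final class Slot<T> {
        final T value;
        long expiresAt;

        Slot(T value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Slot<V>> slots = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;   // supplies the current time in milliseconds

    AccessExpirySketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        slots.put(key, new Slot<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns the value, or null when the entry is absent or has expired.
    V get(K key) {
        Slot<V> slot = slots.get(key);
        if (slot == null) {
            return null;
        }
        long now = clock.getAsLong();
        if (now >= slot.expiresAt) {
            slots.remove(key);              // expired: behave as if it was never there
            return null;
        }
        slot.expiresAt = now + ttlMillis;   // a successful read extends the lifetime
        return slot.value;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        AccessExpirySketch<Long, String> cache =
                new AccessExpirySketch<>(3_600_000L, () -> now[0]);
        cache.put(1L, "person 1");
        now[0] = 1_800_000L;                // 30 minutes later
        System.out.println(cache.get(1L));
        now[0] = 9_000_000L;                // two hours after the last read
        System.out.println(cache.get(1L));
    }
}
```

The sketch covers expiry only. Eviction because of size pressure is left out, since that decision belongs to the cache implementation.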

When retrieving data from the cache, we have to take into account that the cache might not contain the data anymore. In Listing 6, we queried DynamoDB and asked whether it has a Person object corresponding to a specific key. That key is a Long value between 0 and 1099. If the key is 1000 or more, DynamoDB will tell us it can’t find a corresponding value, and then we know that there is no Person associated with this key. However, we can’t apply the same logic when using the cache. It is very possible that the cache doesn’t contain a Person instance belonging to a specific key, but DynamoDB might have this Person instance. This will be the case for Person instances that are evicted from the cache.

We ask the cache for a specific Person instance using the following method:

Person p = cache.get(KEY);

If this method returns a Person instance, we’re done. If this method returns null, however, the DynamoDB datastore might still have the Person instance. In that case, we apply the same code as in Listing 6.
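The lookup order described here, cache first and datastore only on a miss, can be sketched without DynamoDB or a JSR 107 implementation. In the sketch below, a plain Map plays the role of the cache and a Function plays the role of the datastore query from Listing 6; the class CacheAsideSketch and its find method are hypothetical names chosen for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class CacheAsideSketch {

    // Asks the cache first and falls back to the datastore only on a miss.
    // A null result means that neither the cache nor the datastore has the entry.
    static <K, V> V find(Map<K, V> cache, Function<K, V> datastoreQuery, K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;                    // cache hit
        }
        return datastoreQuery.apply(key);     // miss: the datastore has the final word
    }

    public static void main(String[] args) {
        Map<Long, String> cache = new HashMap<>();
        cache.put(1L, "cached person 1");
        // Stand-in for the DynamoDB query: keys 0..999 exist, others do not.
        Function<Long, String> datastore =
                key -> key < 1000 ? "stored person " + key : null;

        System.out.println(find(cache, datastore, 1L));
        System.out.println(find(cache, datastore, 2L));
        System.out.println(find(cache, datastore, 1050L));
    }
}
```

Note that the fallback logic sits in the application code here, so every caller that reads from the cache has to go through it.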

Using Read-Through and Write-Through Caches

So far, we have used the cache and DynamoDB independently of each other. We stored the Person instances in the cache and also stored them manually in DynamoDB. And when retrieving data, we first asked the cache, and only when that failed did we query DynamoDB.

Typically, caches also allow read-through and write-through operations. Simply stated, in a write-through operation, data that is written to a cache is also made persistent (for example, via DynamoDB), and in a read-through operation, data read from the cache is retrieved from persistent storage if it isn’t in the cache.

An advantage of this approach is that the application code does not need to query the persistent storage itself each time the cache fails to return the requested object.

JSR 107 allows developers to leverage this read-through and write-through functionality. We will modify the previous example and show how we can achieve this.

Enabling write-through and read-through functionality is part of the configuration process. The MutableConfiguration object that we used before to specify the configuration of the cache can also be used to specify the write-through and read-through behavior. Specifying write-through is done as shown in Listings 11a and 11b.

config.setWriteThrough(true);
config.setCacheWriterFactory(new Factory<CacheWriter<Long, Person>>() {
   @Override
   public CacheWriter<Long, Person> create() {
      CacheWriter<Long, Person> writer = new CacheWriter<Long, Person>() {

         @Override
         public void write(Cache.Entry<? extends Long, ? extends Person> entry)
            throws CacheWriterException {
            mapper.save(entry.getValue());
         }

         @Override
         public void writeAll(Collection<Cache.Entry<? extends Long, ? extends Person>>
            clctn) throws CacheWriterException {
            // Entries left in the collection count as not written, so remove each one after saving it.
            Iterator<Cache.Entry<? extends Long, ? extends Person>> it = clctn.iterator();
            while (it.hasNext()) {
               mapper.save(it.next().getValue());
               it.remove();
            }
         }

Listing 11a

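         @Override
         public void delete(Object key) throws CacheWriterException {
            // The argument is the cache key, so build a Person carrying that key for the mapper.
            Person toDelete = new Person();
            toDelete.setMyKey((Long) key);
            mapper.delete(toDelete);
         }

         @Override
         public void deleteAll(Collection<?> clctn) throws CacheWriterException {
            // Keys left in the collection count as not deleted, so remove each one after deleting it.
            Iterator<?> it = clctn.iterator();
            while (it.hasNext()) {
               delete(it.next());
               it.remove();
            }
         }

      };
      return writer;
   }
});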

Listing 11b

As you can see, we use the MutableConfiguration.setCacheWriterFactory() method to provide a Factory for a CacheWriter. This Factory provides a CacheWriter instance. We create our own implementation of the CacheWriter interface, and we map its methods to API calls to DynamoDB. Each time an entry is added to, updated in, or deleted from the cache, the CacheWriter methods are called. As a consequence, we can execute all storage tasks on the cache only. Thanks to the CacheWriterFactory, changes in the cache will be written to DynamoDB as well.
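The write-through idea itself can be sketched independently of JSR 107 and DynamoDB. In the following standalone example, WriteThroughSketch is a hypothetical class, and the two callbacks stand in for the write and delete methods of a CacheWriter. Persisting before updating the in-memory map is a choice made for this sketch, so that a failed write does not leave an entry that exists only in the cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

final class WriteThroughSketch<K, V> {

    private final Map<K, V> cache = new HashMap<>();
    private final BiConsumer<K, V> storeWriter;   // stand-in for CacheWriter.write
    private final Consumer<K> storeDeleter;       // stand-in for CacheWriter.delete

    WriteThroughSketch(BiConsumer<K, V> storeWriter, Consumer<K> storeDeleter) {
        this.storeWriter = storeWriter;
        this.storeDeleter = storeDeleter;
    }

    // Every put is forwarded to the persistent store as well.
    void put(K key, V value) {
        storeWriter.accept(key, value);
        cache.put(key, value);
    }

    // Every removal is forwarded to the persistent store as well.
    void remove(K key) {
        storeDeleter.accept(key);
        cache.remove(key);
    }

    V get(K key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        Map<Long, String> store = new HashMap<>();   // plays the role of DynamoDB
        WriteThroughSketch<Long, String> cache =
                new WriteThroughSketch<>(store::put, store::remove);
        cache.put(1L, "person 1");
        System.out.println("store after put: " + store);
        cache.remove(1L);
        System.out.println("store after remove: " + store);
    }
}
```

The application code talks to the cache only; the store is updated as a side effect of each cache operation.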

Similar to a CacheWriterFactory, we can also specify a CacheLoaderFactory. The CacheLoader that is returned by the CacheLoaderFactory is responsible for querying DynamoDB if an entry is requested from the cache but is not found there. In this case, the overridden implementations from the CacheLoader will send queries to DynamoDB.

The code in Listing 12 adds the read-through behavior on the MutableConfiguration instance we used before. As you can see, the load method in the CacheLoader does exactly what we did in Listing 6. It creates a query for DynamoDB and tries to retrieve the Person using the provided key.

config.setReadThrough(true);
config.setCacheLoaderFactory(new Factory<CacheLoader<Long, Person>>() {
  public CacheLoader<Long, Person> create() {
    return new CacheLoader<Long, Person>() {
      public Person load(Long k) throws CacheLoaderException {
        Person hashKeyValues = new Person();
        hashKeyValues.setMyKey(k);
        DynamoDBQueryExpression<Person> queryExpression = new DynamoDBQueryExpression<Person>()
        .withHashKeyValues(hashKeyValues);
        List<Person> itemList = mapper.query(Person.class, queryExpression);
        if (itemList.size() == 0) {
          return null;
        } else {
          return itemList.get(0);
        }
      }
      public Map<Long, Person> loadAll(Iterable<? extends Long> itrbl) throws CacheLoaderException {
        Map<Long, Person> answer = new HashMap<>();
        itrbl.forEach((Long k) -> answer.put(k, load(k)));
        return answer;
      }
    };
  }
});

Listing 12

Using the CacheLoader—and, hence, leveraging the read-through behavior—often makes it easier for developers to focus on the application logic rather than on the cache logic. Without read-through, a developer has to query the persistent storage every time a query on a cache returns no result. By using a CacheLoader, that behavior is delegated to the CacheLoader, and it needs to be coded only once.
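The benefit of coding the fallback once can be seen in a small standalone sketch. ReadThroughSketch is a hypothetical class that uses only the JDK; its loader function stands in for CacheLoader.load, and keeping the loaded value in the map is how this sketch models read-through, not a statement about any particular JSR 107 implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ReadThroughSketch<K, V> {

    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> loader;   // stand-in for CacheLoader.load
    private int loads;

    ReadThroughSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // The caller only ever asks the cache; the fallback lives in here.
    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            loads++;
            value = loader.apply(key);
            if (value != null) {
                cache.put(key, value);     // remember the loaded value for the next request
            }
        }
        return value;
    }

    // Number of times the loader was consulted.
    int loads() {
        return loads;
    }

    public static void main(String[] args) {
        // Stand-in for the DynamoDB query: keys 0..999 exist, others do not.
        ReadThroughSketch<Long, String> cache =
                new ReadThroughSketch<>(key -> key < 1000 ? "person " + key : null);
        System.out.println(cache.get(3L));
        System.out.println(cache.get(3L));
        System.out.println(cache.get(1050L));
        System.out.println("loader calls: " + cache.loads());
    }
}
```

Compared with querying the datastore by hand after each miss, the calling code shrinks to a single get call.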

Conclusion

In this article, we only scratched the surface of JSR 107. The specification provides a common approach to functionality offered by most caching providers. We covered only simple get and store operations, and we briefly touched on the concepts of write-through and read-through. There is much more to discover, and the interested reader is referred to the Javadoc for more information.

The most important achievement of JSR 107 is that application developers can now use a caching strategy in their applications, independent of a specific implementation. Some providers of cache software already have a JSR 107–compliant implementation, and it is expected that more will follow. During development, developers are encouraged to use only the functionality provided by JSR 107. At runtime, a concrete implementation (free or commercial) is added to the classpath, and it will automatically be used.


Johan Vos started working with Java in 1995. He is a cofounder of LodgON, where he is working on Java-based solutions for social networking software. An enthusiast of both embedded and enterprise development, Vos focuses on end-to-end Java using JavaFX and Java EE.