SPECjms2007: A Novel Benchmark and Performance Analysis Framework for Message-Oriented Middleware
We now briefly discuss the way the workload scenario we presented in the previous sections has been implemented as part of the SPECjms2007 benchmark.
Event Handlers and Agents
SPECjms2007 is implemented as a Java application comprising multiple JVMs and threads distributed across a set of client nodes. For every destination (queue or topic), there is a separate Java class called an Event Handler (EH) that encapsulates the application logic executed to process messages sent to that destination. Event handlers register as listeners for the queue or topic and receive callbacks from the messaging infrastructure as new messages arrive.
For maximal performance and scalability, multiple instances of each event handler can exist, each executed in a separate thread, and these instances can be distributed over multiple physical nodes. Event handlers can be grouped according to the physical location (e.g., HQ, SM, DC or SP) they pertain to in the business scenario.
In addition to the event handlers, for every physical location, a set of threads is launched to drive the benchmark interactions that are logically started at that location. These are called driver threads. The set of all event handlers and driver threads pertaining to a given physical location is referred to as an agent. For example, each DC agent comprises a set of event handlers for the various destinations inside the DC and a set of driver threads used to drive Interaction 2, which is the only interaction with a logical starting point at DCs.
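The relationship between destinations, event handlers, driver threads and agents can be illustrated with a minimal plain-Java sketch. It is not the benchmark's actual code: the names `EventHandler`, `Destination` and `Agent`, and the in-memory delivery that stands in for a real JMS provider, are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a message listener; not the benchmark's actual class.
interface EventHandler {
    void onMessage(String message);
}

// Hypothetical in-memory destination that calls back its registered handlers.
class Destination {
    final String name;
    private final List<EventHandler> listeners = new CopyOnWriteArrayList<>();

    Destination(String name) {
        this.name = name;
    }

    void register(EventHandler handler) {
        listeners.add(handler);
    }

    // Topic-like delivery: every registered handler receives the message.
    void send(String message) {
        for (EventHandler handler : listeners) {
            handler.onMessage(message);
        }
    }
}

// An agent groups the event handlers and driver threads of one physical location.
class Agent {
    final String location;
    final AtomicInteger processed = new AtomicInteger();
    private final List<Thread> drivers = new ArrayList<>();

    Agent(String location) {
        this.location = location;
    }

    // Register an event handler that simply counts the messages it receives.
    void addHandler(Destination destination) {
        destination.register(message -> processed.incrementAndGet());
    }

    // Add a driver thread that sends `count` messages to `target`.
    void addDriver(Destination target, int count) {
        drivers.add(new Thread(() -> {
            for (int i = 0; i < count; i++) {
                target.send(location + "-msg-" + i);
            }
        }));
    }

    // Start all driver threads and wait until they have finished.
    void runDrivers() {
        for (Thread t : drivers) {
            t.start();
        }
        try {
            for (Thread t : drivers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for drivers", e);
        }
    }
}
```

In this sketch a DC agent would register one handler per destination inside the DC and add driver threads for the interactions that start there. In the real benchmark the callbacks come from the messaging infrastructure rather than from a direct method call.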
An important goal of SPECjms2007 that we discussed earlier was to provide a flexible framework for performance analysis of MOM servers that allows users to configure and customize the workload according to their requirements. To achieve this goal, the interactions have been implemented in such a way that one could run them in different combinations depending on the desired transaction mix. Configurability is provided along the following dimensions:
SPECjms2007 offers three different ways of structuring the workload: horizontal, vertical and freeform. These are referred to as workload topologies, and they correspond to three different modes of running the benchmark, each offering a different level of configurability.
The horizontal and vertical topologies represent two strategies for scaling the supermarket supply chain scenario: the first by increasing the number of physical locations and the second by increasing the amount of traffic per physical location. The horizontal topology is meant to exercise the ability of the system to handle an increasing number of destinations. To this end, the workload is scaled by increasing the number of physical locations (SMs, DCs, etc.) while keeping the traffic per location constant.
The vertical topology, on the other hand, is meant to exercise the ability of the system to handle increasing message traffic through a fixed set of destinations. Therefore, a fixed set of physical locations is used and the workload is scaled by increasing the rate at which interactions are run. Finally, the freeform topology allows the user to use the seven SPECjms2007 interactions as building blocks to design his own workload scenario which can be scaled in an arbitrary manner by increasing the number of physical locations and/or the rates at which interactions are run. Table 1 below shows the workload parameters that can be configured in the three topologies:
While in the horizontal and vertical topologies there are some restrictions as to which of the above parameters can be set, no restrictions apply to the freeform topology. Most importantly, the user can selectively turn off interactions or change the rate at which they are run to shape the workload according to his requirements. At the same time, when running the horizontal or vertical topology, the benchmark behaves as if the interactions were interrelated according to their dependencies in the real-life application scenario.
The paper "Workload Characterization of the SPECjms2007 Benchmark", published in the Proceedings of the 4th European Performance Engineering Workshop (EPEW-2007), provides a comprehensive workload characterization of SPECjms2007. The benchmark workload is characterized in terms of the number and types of destinations (queues and topics), the interaction mix, the message types, the message sizes and the message delivery modes. The different types of messages and destinations used in the various interactions are detailed in Table 2.
The detailed message throughput analysis presented in the paper serves two main purposes. First, using the information provided, the user can assemble a workload configuration (in terms of number of locations and interaction rates) that stresses specific types of messaging under given scaling conditions. This allows the user to construct his own custom workload using the SPECjms2007 interactions as building blocks. As a very basic example, the user might be interested in evaluating the performance and scalability of non-persistent pub/sub messaging under an increasing number of subscribers. In this case, a mix of Interactions 6 and 7 can be used with an increasing number of SMs. Second, the characterization of the message traffic on a per-location basis can help users find an optimal deployment topology for the agents representing the different locations, such that the load is evenly distributed among client nodes and there are no client-side bottlenecks. This is especially important for a messaging benchmark, where the server acts as a mediator in interactions and a significant amount of processing is executed on the client side.
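To make the second purpose more concrete, the following sketch spreads agents over client nodes using estimated per-agent message rates. The greedy heuristic (heaviest agent first, onto the currently least-loaded node) is an assumption chosen for illustration and is not a procedure prescribed by the benchmark. The class name `AgentPlacement` and the example loads are hypothetical.

```java
import java.util.Arrays;

class AgentPlacement {
    // Greedy heuristic: place the heaviest agents first, each on the node
    // that currently carries the least load. Returns the load per node.
    static double[] nodeLoads(double[] agentLoads, int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        double[] sorted = agentLoads.clone();
        Arrays.sort(sorted); // ascending order; iterated from the end below
        double[] nodes = new double[nodeCount];
        for (int i = sorted.length - 1; i >= 0; i--) {
            int target = 0;
            for (int n = 1; n < nodeCount; n++) {
                if (nodes[n] < nodes[target]) {
                    target = n;
                }
            }
            nodes[target] += sorted[i];
        }
        return nodes;
    }
}
```

The per-agent loads would come from the per-location traffic characterization. A heuristic like this gives a reasonable starting point, not a guaranteed optimum.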
As mentioned earlier, the goal of the horizontal topology is to exercise the ability of the system to handle an increasing number of destinations. To achieve this, the workload is scaled by increasing the number of physical locations (SMs, DCs, etc.) while keeping the traffic per location constant. A scaling parameter BASE is introduced that has to be set by the user before running the benchmark. As BASE is increased, the overall message throughput rises until the system is saturated. For a run to be valid (passed), all queues must be stable and the 90th percentiles of delivery times must not exceed 5 seconds.
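The two validity conditions can be expressed as a small check. The sketch below makes several assumptions: the class name `RunValidator` is hypothetical, queue stability is passed in as a flag because its precise definition is not given here, and the 90th percentile is computed with the nearest-rank method, which may differ from the method the benchmark actually uses.

```java
import java.util.Arrays;

class RunValidator {
    // Limit taken from the text: 90th percentile of delivery times <= 5 seconds.
    static final double MAX_P90_SECONDS = 5.0;

    // Nearest-rank 90th percentile (assumed method) over one set of samples.
    static double percentile90(double[] deliveryTimesSeconds) {
        if (deliveryTimesSeconds.length == 0) {
            throw new IllegalArgumentException("no delivery times recorded");
        }
        double[] sorted = deliveryTimesSeconds.clone();
        Arrays.sort(sorted);
        // Integer form of ceil(0.9 * n), avoiding floating-point rounding.
        int rank = (9 * sorted.length + 9) / 10;
        return sorted[rank - 1];
    }

    // A set of samples passes if queues are stable and the percentile is within the limit.
    static boolean isValidRun(boolean allQueuesStable, double[] deliveryTimesSeconds) {
        return allQueuesStable && percentile90(deliveryTimesSeconds) <= MAX_P90_SECONDS;
    }
}
```

Since the text speaks of percentiles in the plural, a full check would presumably apply `isValidRun` to each measured set of delivery times and require all of them to pass.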
The reported benchmark metric for a valid horizontal run is called SPECjms2007@Horizontal and is equal to the value of the BASE parameter at which the benchmark was run. The user is expected to run the benchmark multiple times with increasing BASE values until he reaches the highest value at which the conditions for a valid run are still satisfied. The result at that BASE value is normally the one submitted as an official result for publication by SPEC.
Figure 4 shows how the number of locations of each type is scaled as the BASE parameter is increased. The rates at which interactions are initiated by participants are fixed so that the traffic per location (and therefore also per destination) remains constant. The relative weights of the interactions are set based on a detailed business model of the supermarket supply chain which captures the interaction interdependencies. This model has several input parameters (e.g., total number of product types, size of supermarkets, average number of items sold per week) whose values are chosen in such a way that the following overall target messaging mix is achieved as closely as possible:
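The procedure of stepping BASE upward until a run no longer passes can be sketched as a simple loop. The class `BaseSearch`, the step size of 1 and the `maxBase` cap are assumptions for illustration. The `runIsValid` predicate stands in for actually executing the benchmark at a given BASE and checking the validity conditions.

```java
import java.util.function.IntPredicate;

class BaseSearch {
    // Returns the highest BASE in [1, maxBase] reached before the first
    // invalid run, or 0 if even BASE = 1 fails. Assumes that once a run
    // is invalid, all higher BASE values are invalid as well.
    static int highestValidBase(IntPredicate runIsValid, int maxBase) {
        int best = 0;
        for (int base = 1; base <= maxBase; base++) {
            if (!runIsValid.test(base)) {
                break;
            }
            best = base;
        }
        return best;
    }
}
```

Because each real run is expensive, a user would in practice likely take larger steps first and refine near the saturation point. The linear scan only mirrors the description in the text.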
The goal is to put equal weight on P2P and pub/sub messaging. Within each group the target relative weights of persistent vs. non-persistent messaging have been set according to the relative usage of these messaging styles in real-life applications. Figure 5 shows the message mix in the horizontal topology. When scaling the workload the proportions of the different types of messages remain constant. The sizes of the messages used in the various interactions have been chosen to reflect typical message sizes in real-life MOM applications. Pub/sub messages are generally much smaller than P2P messages due to the decoupled nature of the delivery mechanism.
In the vertical topology, a fixed set of physical locations is used and the workload is scaled by increasing the rate at which interactions are executed. As in the horizontal case, a single parameter BASE is used as a scaling factor and the user is expected to scale the workload up to the highest BASE at which the conditions for a valid run are still satisfied. The metric for the vertical topology is called SPECjms2007@Vertical. Again, the relative weights of the interactions are set based on the business model of the supply chain scenario. Unlike the horizontal topology, however, the vertical topology places the emphasis on P2P messaging, which accounts for 80% of the total message traffic.
The aim is to exercise the ability of the system to handle increasing traffic through a destination by processing messages in parallel. This aspect of MOM server performance is more relevant for P2P messaging (queues) than for pub/sub messaging, where the message throughput is inherently limited by the speed at which subscribers can process incoming messages. Figure 6 shows the achieved message mix in the vertical topology. Again, when scaling the workload, the message mix remains constant, which is the expected behavior.
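The difference between the two scaling strategies can be summarized in a short sketch. The linear dependence on BASE and the parameter names (`locationsPerBase`, `ratePerBase`) are assumptions made purely for illustration; the benchmark's actual scaling functions are the ones shown in its figures and configuration, and they are not reproduced here.

```java
class TopologyScaling {
    // Number of locations and interaction rate per location for one run.
    static final class Load {
        final int locations;
        final double ratePerLocation;

        Load(int locations, double ratePerLocation) {
            this.locations = locations;
            this.ratePerLocation = ratePerLocation;
        }

        double totalRate() {
            return locations * ratePerLocation;
        }
    }

    // Horizontal: more locations as BASE grows, constant traffic per location.
    static Load horizontal(int base, int locationsPerBase, double fixedRate) {
        return new Load(base * locationsPerBase, fixedRate);
    }

    // Vertical: fixed set of locations, traffic per location grows with BASE.
    static Load vertical(int base, int fixedLocations, double ratePerBase) {
        return new Load(fixedLocations, base * ratePerBase);
    }
}
```

In both cases the total rate grows with BASE, but the horizontal topology spreads it over more destinations while the vertical topology pushes it through the same ones.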
SPECjms2007 provides a flexible and robust tool that can be used for in-depth performance evaluation of MOM servers. The benchmark allows users to customize the workload to their needs by configuring it to stress selected features of the MOM infrastructure in a way that resembles a given target customer workload. However, in order to take advantage of this, users need to understand the way the workload is decomposed into components and which performance aspects are exercised by these components. In this article, we first introduced the business scenario and workload modeled by SPECjms2007 and then looked at the benchmark design and internal architecture. We presented a characterization of the workload looking at the interaction and message mixes and the way they are scaled. The characterization, on the one hand, aims to help users gain an in-depth understanding of the SPECjms2007 workload, so that they can interpret the benchmark results correctly. On the other hand, it provides the information needed to enable users to tailor the workload to their own requirements.
SPECjms2007 provides a representative workload for measuring the performance and scalability of MOM servers. It can be used for the following purposes:
Samuel Kounev serves as the release manager of SPEC's Java subcommittee and was actively involved in the development and specification of the SPECjAppServer and SPECjms sets of industry-standard benchmarks. He is a BEA Technical Director and holds a Ph.D. in computer science from Technische Universitaet Darmstadt and an M.Sc. degree from the University of Sofia.
Kai Sachs was the lead developer of the SPECjms2007 benchmark. He works in the Databases & Distributed Systems Group at Technische Universitaet Darmstadt (Germany).