Better Profiling through Code Hotswapping: A Conversation with JFluid Project Lead, Misha Dmitriev

By Janice J. Heiss, March 2005  

Late in 2000, Misha Dmitriev was working at Sun Microsystems Laboratories on what is known as HotSwap functionality of the HotSpot Java Virtual Machine, with the goal of changing code on the fly in running programs. The initial idea was to use HotSwap functionality to "fix-and-continue" code during debugging, without having to stop, recompile, and restart programs. Thus, buggy classes could be replaced on the spot and programmers could immediately determine if a fix worked. Dmitriev also hoped that HotSwap functionality could be used to update running applications in, for example, banking, stock trading, or online ticket reservation systems where downtime costs are high. After some experimentation and deliberation, Sun Labs concluded that, for the time being, such a project was impractical. But what, Dmitriev asked himself, if the same approach could be used to add code to a program in such a way that did not alter its semantics or behavior, but enabled the program to emit useful information about what's happening internally? For example, how much time it takes to execute code, what objects it allocates, and so on. Out of such an observation, the JFluid profiling tool was born, and found its way into the NetBeans IDE, a free and open source IDE that enables programmers to develop cross-platform desktop, web, and mobile/wireless applications.

Dmitriev initially studied at Moscow State Technical University in Moscow, Russia. He then spent 4 years at the University of Glasgow, UK, doing research on Persistent Java with a group that collaborated with Sun Labs. He received his Ph.D. in May 2001, by which time he was already working at Sun Labs California with the HotSpot Java Virtual Machine (JVM) production group. He later initiated the Javamake project, and implemented the first version of the Javamake tool.

At the end of 2001, he completed the initial version of the dynamic bytecode instrumentation functionality in the HotSpot JVM. More recently, he has been hard at work on the JFluid project, which has produced advanced profiling techniques and tools based on dynamic bytecode instrumentation. He is currently the JFluid/NetBeans Profiler Project Lead, at Java Tools, at Sun. We met with him to discuss recent developments in JFluid and the NetBeans IDE.

question When you first experimented with HotSwap functionality, what got you thinking about profiling?

answer Profiling, that is, collecting information about performance, was the most obvious application. By 2000, most of the profilers for Java technology already used the so-called bytecode instrumentation to collect performance data. Tools that use bytecode instrumentation add, or "inject," special small code snippets into an application in order to gather data about it at run time. The problem with these tools was, and still is, that they can only do this instrumentation once -- at application startup time. As a result, your instrumented application would run slower, sometimes much slower -- say 10 or 20 times! It could take ages to gather interesting data, or just to get to the point where something interesting (for example, a memory leak) starts to happen. So, I had the idea of performing bytecode instrumentation using code hotswapping, rather than when the program starts. Code hotswapping is probably the most powerful way to address the performance problems of profiling, while still collecting useful data.

As for profiling in general: it is proven that it gets you better software. However, due to the complexity and overhead of the existing tools, most developers have engaged in profiling only when they had to. And some developers never try it at all. Now, we're going to make profiling a natural part of the software development cycle. That's because Java applications are growing in size and complexity. Take, for instance, web and Java 2 Platform, Enterprise Edition (J2EE) applications -- they run on top of application servers, which themselves are typically very large Java applications! Thus, I believe, the traditional edit-compile-debug cycle should be replaced with an edit-compile-debug-test-profile cycle. This will benefit developers a lot, and achieving it is mostly a matter of making profiling faster and easier.

Addressing the Performance Problems of Profiling

question How does hotswapping address the performance problems of profiling?

answer It can help in multiple ways.

First, you can instrument your application when, or if, you observe a problem with it. So you collect data only when you want, and avoid the overhead of collecting data at unnecessary times.

Second, after you've collected enough to know what's interesting and what's not, you can collect data selectively. For example, once you've discovered that only certain types of objects are leaking, you can stop collecting data for other types. That can save a lot of time.

Third, in the case of CPU profiling, where the collected data tells you how much time your application spent on given parts of its code, the tool itself can select what to instrument, based on a very simple and general condition that the user specifies. This condition is, in fact, just a single Java method that the user declares as a "call tree root" or "entrypoint". So, for example, you can point our tool at the doGet() method of your servlet and tell it: I want to measure how much time it takes to execute this method plus everything it calls, directly or recursively. The tool will automatically "discover," instrument, and measure time for all methods in the call graph of your doGet(). This may include methods of your web application, libraries, and application server as well. And that makes sense -- the source of the performance problem can be anywhere on this stack.

On the other hand, the tool will not instrument the code of other servlets, and, typically, a large part of the application server will not be touched either. Thus, all the untouched code will still run at full speed. This cuts overhead dramatically when compared with other profilers. In our work, the overhead goes down from 50 to 5 per cent, and you still collect all the information you really need.
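To make the "call tree root" idea concrete, here is a minimal sketch in Java. It is an illustration only, not JFluid's code: it assumes the call graph is already known as a simple map from each method to its direct callees, whereas the interview says the tool discovers methods automatically. The class name, method names, and sample graph are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CallTreeRootSketch {
    /**
     * Returns every method reachable from the root, that is, the set a
     * profiler would instrument. callGraph maps a method to its direct
     * callees (an assumption made for this sketch).
     */
    static Set<String> methodsToInstrument(Map<String, List<String>> callGraph, String root) {
        Set<String> reachable = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            String method = pending.remove();
            if (!reachable.add(method)) {
                continue; // already visited; this also stops recursion cycles
            }
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                pending.add(callee);
            }
        }
        return reachable;
    }

    public static void main(String[] args) {
        // Hypothetical web application with two servlets.
        Map<String, List<String>> callGraph = Map.of(
            "ShopServlet.doGet", List.of("Cart.total", "Db.query"),
            "Cart.total", List.of("Db.query"),
            "Db.query", List.of(),
            "AdminServlet.doGet", List.of("Audit.log"),
            "Audit.log", List.of());
        // Only methods reachable from ShopServlet.doGet are selected;
        // AdminServlet.doGet and Audit.log stay uninstrumented.
        System.out.println(methodsToInstrument(callGraph, "ShopServlet.doGet"));
    }
}
```

Everything outside the returned set is left alone, which is why the untouched code keeps running at full speed.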

This approach makes a difference. Customers report that some well-known profilers crash for some applications -- typically highly multithreaded, CPU-intensive ones. They claim that selective instrumentation implemented in JFluid is the only way for them to understand what is happening inside their code when it runs under realistic workloads.

question What if selective instrumentation is still not enough to reduce the performance overhead? For example, if the user needs to look at a large number of classes at once?

answer In that case, they can try other options available in JFluid that are not related to selective instrumentation. For example, for CPU profiling we have what is called a "sampled instrumentation" method. It's a combination of instrumentation and sampling-based profiling.

Instrumentation Versus Sampling

question What's the difference between profiling using instrumentation and profiling using sampling?

answer These are two different methods for CPU profiling. Instrumentation is when the tool injects code that calls a timer, say, at the beginning and end of each method of your application. You get the time that the method was running by essentially subtracting the two timestamps obtained by these calls.
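As a rough illustration of what instrumented code amounts to, the sketch below writes the timer calls by hand in Java source. A real profiler injects equivalent bytecode rather than source. The names InstrumentationSketch, record, and sumTo are hypothetical, and the shared maps are not thread-safe; they are kept simple for readability.

```java
import java.util.HashMap;
import java.util.Map;

final class InstrumentationSketch {
    // Accumulated wall-clock nanoseconds and call counts per method name.
    static final Map<String, Long> totalNanos = new HashMap<>();
    static final Map<String, Long> callCounts = new HashMap<>();

    // Stand-in for the bookkeeping a profiler would inject at method exit.
    static void record(String method, long startNanos, long endNanos) {
        totalNanos.merge(method, endNanos - startNanos, Long::sum);
        callCounts.merge(method, 1L, Long::sum);
    }

    // An application method as it might look after instrumentation.
    static long sumTo(int n) {
        long start = System.nanoTime(); // injected at method entry
        try {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            return sum;
        } finally {
            // injected at method exit: subtract the two timestamps
            record("sumTo", start, System.nanoTime());
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            sumTo(10_000);
        }
        System.out.println("calls: " + callCounts.get("sumTo")
            + ", total ns: " + totalNanos.get("sumTo"));
    }
}
```

Note that every call pays for two timer reads plus the bookkeeping, which is where the overhead of pure instrumentation comes from.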

Sampling is something less obvious. It's a statistical method. If you look at the stack of your running application periodically (say, 10 times a second) for long enough (say, for 10 seconds), and you observe that method "foo" was on top of the stack 50 times out of the total of 100 "samples" you have taken, you can conclude that the application spent roughly 50 per cent of the time executing this method. Of course, with only 100 samples, taken relatively infrequently, this may prove to be a bad estimate. But generally, with a higher sampling frequency and the application running for long enough, it's accurate, and the overhead is low.
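The arithmetic behind sampling can be sketched in a few lines of Java. This example assumes the samples have already been collected as a list of method names, one per observation of the top of the stack; how a real profiler captures them is outside the sketch. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SamplingSketch {
    /** Fraction of samples in which each method was on top of the stack. */
    static Map<String, Double> estimateTimeShare(List<String> topOfStackSamples) {
        Map<String, Integer> hits = new LinkedHashMap<>();
        for (String method : topOfStackSamples) {
            hits.merge(method, 1, Integer::sum);
        }
        Map<String, Double> share = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : hits.entrySet()) {
            share.put(e.getKey(), e.getValue() / (double) topOfStackSamples.size());
        }
        return share;
    }

    public static void main(String[] args) {
        // 10 samples a second for 10 seconds: 100 samples, 50 of them in "foo".
        List<String> samples = new ArrayList<>(Collections.nCopies(50, "foo"));
        samples.addAll(Collections.nCopies(50, "other"));
        System.out.println(estimateTimeShare(samples));
    }
}
```

The result carries no call counts at all: the input is only a series of snapshots, so nothing in it says how often "foo" was entered.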

So this may be very useful -- except that it doesn't tell you how many times your bottleneck methods were called. That is, it says that the application spent 5 seconds in "foo". Was it because "foo" was called a million times, or there were just 5 calls for 1 second each? This could be a very important difference. In the first case, you should definitely minimize the number of calls to "foo". In the second, you should look at the code of "foo" itself. But classical sampling doesn't give you this information.

To address this, we implemented a patent-pending technology in JFluid that combines the advantages of both methods. It shows you the number of method invocations for the profiled code, but avoids the overhead of calling a timer function (which, surprisingly, is expensive) every time your application calls one of its methods. Its overhead is somewhere in between pure instrumentation and pure sampling -- which may be a big improvement in certain situations. And for highly call-intensive applications this method may return more accurate results.
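The interview does not describe how the patent-pending technique works, so the following Java sketch is only an assumption about the general idea: keep an exact invocation count with a cheap injected counter, take the time estimate from sampling, and divide one by the other. The class, method, and parameter names are hypothetical.

```java
final class SampledInstrumentationSketch {
    /**
     * Average seconds per call, combining a sampled time share with an
     * exact invocation count kept by a cheap counter (no timer call).
     */
    static double averageSecondsPerCall(double sampledShare,
                                        double elapsedSeconds,
                                        long invocations) {
        if (invocations <= 0) {
            throw new IllegalArgumentException("invocations must be positive");
        }
        return sampledShare * elapsedSeconds / invocations;
    }

    public static void main(String[] args) {
        // "foo" took about 5 seconds (50 per cent of a 10-second run) in both cases.
        System.out.println(averageSecondsPerCall(0.5, 10.0, 1_000_000)); // many tiny calls
        System.out.println(averageSecondsPerCall(0.5, 10.0, 5));         // a few slow calls
    }
}
```

With the count available, the two cases from the "foo" example become distinguishable: a few microseconds per call points at the callers, while one second per call points at the body of the method.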

Performance Overhead in Memory Profiling

question And what about the issue of performance overhead of memory profiling?

answer That may be a concern as well. To address it, we came up with another "semi-statistical" optimization. The most expensive part of memory profiling is collecting stack traces that tell you where (in what place in the program and through what chain of calls) your objects have been allocated. A lot of this data is the same for many objects. Simplifying it a little, you will see that 500 out of the total of 1000 Strings have been allocated in method "foo", 400 in method "bar", and 100 elsewhere. So if you sample stacks for object allocations, not every time an object is created, but perhaps for only one object out of 10, your savings could be substantial. And you get essentially the same data as before -- in terms of relative values. This is particularly relevant for large server-side applications that tend to create and dispose of thousands of objects a second.

Of course, the real implementation is a bit more complex than what I've just described, but the basic idea is the same.
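A simplified Java sketch of this "one object out of 10" idea follows. It is not the real implementation, which the interview says is more complex. It assumes a fixed sampling interval and passes the expensive stack capture as a Supplier, so that the capture runs only for sampled allocations. All names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

final class AllocationSamplingSketch {
    private final int interval;
    private long allocationsSeen = 0;
    private final Map<String, Long> sampledSites = new LinkedHashMap<>();

    AllocationSamplingSketch(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be at least 1");
        }
        this.interval = interval;
    }

    /**
     * Called on every allocation. The costly stack capture (captureSite)
     * is invoked only for one allocation out of every 'interval'.
     */
    void onAllocation(Supplier<String> captureSite) {
        if (allocationsSeen++ % interval == 0) {
            sampledSites.merge(captureSite.get(), 1L, Long::sum);
        }
    }

    /** Scales the sampled counts back up to estimate totals per site. */
    Map<String, Long> estimatedAllocations() {
        Map<String, Long> estimate = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : sampledSites.entrySet()) {
            estimate.put(e.getKey(), e.getValue() * interval);
        }
        return estimate;
    }

    public static void main(String[] args) {
        AllocationSamplingSketch sampler = new AllocationSamplingSketch(10);
        // 1000 Strings: 500 allocated in "foo", 400 in "bar", 100 elsewhere.
        for (int i = 0; i < 1000; i++) {
            String site = i < 500 ? "foo" : (i < 900 ? "bar" : "elsewhere");
            sampler.onAllocation(() -> site);
        }
        System.out.println(sampler.estimatedAllocations());
    }
}
```

A fixed stride like this can line up with a periodic allocation pattern and skew the estimate, so randomizing which allocations are sampled would likely be more robust.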

Memory Leak Debugging

question People often cite memory leak debugging as one of the most important issues they face in this area. Does JFluid address this?

answer Yes. To the best of our knowledge, most of the other tools offer just one method of memory leak debugging. They essentially allow the user to take and compare two or more heap snapshots. These snapshots are lists where you see object types, the number of live objects of each type, and so on. If you see that the number of objects of some type went up significantly, or that they are present in the heap when you believe they all should have been garbage collected, you can assume that these objects are leaking. The problem is that there are usually many object types about which such an assumption can be made, object numbers may go up and down, and so on. Much time can be wasted trying to find a real memory leak in this way.

The JFluid approach is based on the observation that the most common and dangerous kind of memory leak, a slow-growing one, is characterized by a specific pattern of object creation and garbage collection. In essence, leaking objects are allocated fairly regularly, and some persist. After each round of garbage collection, some freshly-created objects remain alive forever. Pretty simple. But if the tool can keep track of this characteristic of objects, which we call "surviving generations numbers," then developers can instantly distinguish real leaking objects, whose numbers build up slowly but steadily, from two harmless kinds: objects that are allocated in large numbers at one point in time and disappear completely the next minute, and objects that were allocated in small numbers once and then stay forever.

So JFluid allows you to pinpoint leaking objects very precisely after your program has run for a while.
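One way to picture the "surviving generations" number is the Java sketch below. It is an interpretation for illustration, not JFluid's implementation: it assumes every live object is tagged with the garbage collection epoch in which it was allocated, and it counts the distinct epochs still represented among the live objects of each class. All class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class SurvivingGenerationsSketch {
    /** A live object: its class name and the GC epoch in which it was allocated. */
    static final class LiveObject {
        final String className;
        final int allocationEpoch;

        LiveObject(String className, int allocationEpoch) {
            this.className = className;
            this.allocationEpoch = allocationEpoch;
        }
    }

    /** For each class, counts the distinct allocation epochs among its live objects. */
    static Map<String, Integer> survivingGenerations(List<LiveObject> liveObjects) {
        Map<String, Set<Integer>> epochs = new TreeMap<>();
        for (LiveObject o : liveObjects) {
            epochs.computeIfAbsent(o.className, k -> new HashSet<>()).add(o.allocationEpoch);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : epochs.entrySet()) {
            result.put(e.getKey(), e.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        List<LiveObject> heap = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            heap.add(new LiveObject("Config", 0));     // allocated once, kept forever
        }
        for (int i = 0; i < 2000; i++) {
            heap.add(new LiveObject("TempBuffer", 9)); // burst that will soon be collected
        }
        for (int epoch = 0; epoch < 10; epoch++) {
            heap.add(new LiveObject("Listener", epoch)); // one survivor per epoch: a slow leak
        }
        System.out.println(survivingGenerations(heap));
    }
}
```

In this made-up heap, the leaking class has the fewest live objects but the highest number of surviving generations, which is the signal that plain object counts would miss.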

Making JFluid Easy to Use

question JFluid sounds great for expert users. But how about ordinary folks, who may not have time to fiddle with all these interesting checkboxes and controls?

answer Good question. In fact, the JFluid tool, when it was an experimental project at Sun Labs, had a non-intuitive, and not very rich, UI. But once it was transferred to NetBeans, it got into the hands of experienced UI designers, in particular the present co-lead for this project, Ian Formanek. These guys have tremendously improved the UI and the general concept of how you drive the tool.

So now, once you install it and click on the "Profile" button in NetBeans IDE, you see a set of predefined tasks for the most common types of profiling, with just one or two, or even no settings in each of them. Selection of these tasks and settings was not simple at all, but now, we believe, they are likely to suit your needs 50 per cent of the time or more. Once you become familiar with the tool, you can create your own profiling "configurations," adjusting and customizing where you wish.

We've also incorporated many UI-related improvements in the last several months. We've considerably improved data representation, added class name filters for both instrumentation (what the user wants to profile) and results (what the user wants to see), and added thread profiling.

The Advantages of the NetBeans IDE

question JFluid was, at first, a standalone tool, but now comes as a part of the NetBeans IDE. What's the advantage of this approach?

answer To many users of the NetBeans IDE, the advantage is substantial. For free, they acquire the equivalent of commercial tools that cost between $300 and $1000 per license. JFluid doesn't include all the features of such well-established tools as JProbe or Optimizeit, but it's evolving pretty fast. And it has a number of features that these tools lack.

Furthermore, tight integration of the profiler with the IDE provides real benefits. For instance, there's no need to launch an additional tool when you profile your application. Also, the profiler knows about standard types of projects in NetBeans IDE, such as freeform or web applications, and can tailor its operations accordingly. In most cases, you can simply click "Profile" in the IDE, and it works; no setup is needed. This, along with the free availability of the JFluid profiler, should make profiling accessible to many more developers, especially those who might have previously been intimidated by the complexity of performance tools.

Finally, we anticipate benefits from future integration. For example, we're considering collecting more kinds of data, such as code paths, and adding more forms of presentation, such as graphical data alongside the program text. We are considering specifying special kinds of profiling input in the same way.

The Future of JFluid

question What are your plans for JFluid?

answer We have to provide features that other tools have and that are essential for some users. Examples are remote profiling, in which the profiled application runs on one machine and the tool on another, and saving and retrieving the collected data. These aren't big challenges, but we need time to implement them. We also want to make our tool work on JDK 1.5, in which some of the internal APIs that support profiling have changed and new functionality has been added.

We also plan to simplify the profiler usage, making it smarter, more adaptive, and self-tuning. Currently, profiling and interpreting the results still requires expertise and some knowledge of profiling internals. I'm not sure we can automate everything, but we want, whenever possible, to simplify things through various "wizards" and the like.

In the more distant future, I expect profiling in general, and JFluid in particular, to merge more and more with debugging. For example, memory leak debugging is provided in many modern profilers for the Java language, including our tool, but even the name of this feature suggests that it is more about finding bugs in programs than about measuring performance.

Users are not only interested in performance numbers, but they want to find out what causes their performance problems, and what can be done to address them. We want to respond to this.

Send Us Feedback

question Any final comment for users?

answer Please try our tool and send us your feedback. Approximately once a month, we provide milestone releases of the profiler in order to get feedback. We can usually handle bug reports and provide patches within days -- to get proof, just check out our user mail list archives.

And, of course, we welcome users' expert advice. Many features in this tool exist because people outside the project suggested them. If you have a good idea, please don't hesitate to share it with us.

Special thanks to Tim Boudreau, Ian Formanek, and Charlie Hunt of Sun Microsystems, and Brian Leonard of NetBeans for their contributions to this article.
