PersonalJava 1.1 Application Environment Memory Usage Technical Note

 


July 1998

Abstract


This paper describes in detail the optimizations made to achieve an approximately 28% decrease in class memory footprint from the PersonalJava 1.0 application environment to the PersonalJava 1.1 application environment. The reader is assumed to be familiar with Java technology and with the workings of the Java virtual machine* as specified in "The Java Virtual Machine Specification" by Lindholm and Yellin. Familiarity with one of the Java virtual machines implemented by Sun is also helpful.

Note that the optimizations discussed in this paper affect the implementation of the Java virtual machine and tools such as JavaCodeCompact only. No change has been made to either the Java Virtual Machine specification or the PersonalJava API.

Introduction

  1. The memory footprint of classes and their various components
    • Preloaded (mostly ROM use)
    • Dynamically loaded
  2. Native stack usage
  3. Java stacks
  4. Java heap usage
  5. C code size

Class Footprint Reductions


For small and embedded systems which lack fast dynamic class loading facilities such as a network connection, a local file system, or other permanent storage, it makes sense to have a "class preloader" that loads classes off-line. Source licensees of the PersonalJava application environment receive a class preloader called JavaCodeCompact which performs this task. JavaCodeCompact creates the internal virtual machine data structures representing a class off-line, and lays them out in mostly read-only memory.

Changes were made that reduce the memory allocations required per class. Some of these changes occur only in JavaCodeCompact, thus affecting only the ROM footprint. Some of the other techniques apply both to preloaded and dynamically loaded classes, affecting both the ROM and RAM footprints.

 

Memory reductions that apply to preloaded classes only

  • Reduced constant pools
    During the preloading process, all symbolic references in constant pools are resolved, and all bytecodes that refer to these entries are quickened. NameAndType and Utf8 constants therefore no longer need to be kept in class constant pools, since Java bytecodes never refer to these constants directly. After these entries are deleted, method bytecodes are modified to refer to the updated constant pool indices.

     

  • No type-tables for completely resolved constant pools
    When all references in a constant pool have been resolved, the bytecode instructions referring to them have also been quickened. No type information is needed for such constant pools, since the virtual machine will never resolve their entries.

     

  • Shared string tables
    Each Java class file has occurrences of Utf8 strings in its constant pool. When JavaCodeCompact preloads multiple classes, it keeps track of all occurrences of Utf8 strings in its input classes and outputs a shared Utf8 table to which individual preloaded classes point. Moreover, for any two strings A and B where B is a suffix of A, only string A is generated in the global string table; B simply points into the middle of A. In this manner, common string suffixes are shared in order to save space.

     

  • Sharing Java String bodies
    JavaCodeCompact resolves all class constant pool entries, including those of type String. It therefore creates Java Strings with handles, as if the runtime memory allocator had created them. Each Java String consists of a char array, an offset field, and a count field. When creating each String and its associated handle, JavaCodeCompact has to generate the correct String object fields. In the PersonalJava 1.0 application environment, a separate handle was created for each embedded char array. In the PersonalJava 1.1 application environment, a single large char array is created instead, containing the bodies of all Java Strings. Each Java String refers to a portion of that array through its offset and count fields. The separate handles for the separate char arrays are therefore eliminated.

     

  • Table element sharing/merging
    Many of the internal virtual machine data structures are short tables formed of small integers, like constant pool indices. Examples are tables listing the exceptions that a method throws, or the interfaces that a class implements. JavaCodeCompact tracks the contents of each of these tables, and shares them where appropriate.

     

  • Eliminating duplication by moving more data structures into read-only segments
    JavaCodeCompact generates most of its preloaded class data structures in read-only segments. Certain kinds of data, however, such as Java static fields, need to be stored in read-write segments so that the virtual machine can write to them. In the PersonalJava 1.0 application environment, this meant that a system employing preloaded classes had to copy the data writeable by the virtual machine from persistent storage to RAM, essentially duplicating that segment. Changes were made in the virtual machine and JavaCodeCompact to place as much data as possible in read-only memory or in BSS (the uninitialized global data segment of the executable program) to reduce the memory size impact. The BSS section does not suffer from the extra copy problem, since it is only allocated at load time and does not occupy any space in permanent storage.

     

    • Making fieldblocks read-only: static storage space is now emitted in the BSS section, instead of emitting fieldblocks representing statics in RAM. That way the fieldblocks are all read-only, at the price of an extra indirection to an extra word of storage per static field in the BSS section.

    • Making all method code read-only: changes were made in the virtual machine which obviate the need for writes into the instruction stream on interface invocations. That way, all method code can be placed in ROM.

    • Making Str2ID tables read-only: The Str2ID hash tables for class member signature data and interned Java Strings have been made read-only. This was made possible through virtual machine changes which disallow writes into the hash tables.
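
As an illustration of the suffix sharing described under "Shared string tables" above, the following is a minimal sketch in Java. It is not JavaCodeCompact's actual code: the class name SharedSuffixTable and its methods are hypothetical, and strings are modeled as NUL-terminated runs in one character pool so that a suffix can be addressed by an offset into the middle of a longer string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch of a shared string table with suffix sharing. */
final class SharedSuffixTable {
    private final StringBuilder pool = new StringBuilder();
    private final Map<String, Integer> offsets = new LinkedHashMap<>();

    SharedSuffixTable(Iterable<String> strings) {
        // Remove duplicates, then process the longest strings first so that
        // a string is always emitted before any of its suffixes.
        TreeSet<String> unique = new TreeSet<>();
        for (String s : strings) {
            unique.add(s);
        }
        List<String> byLength = new ArrayList<>(unique);
        byLength.sort((a, b) -> Integer.compare(b.length(), a.length()));

        List<String> emitted = new ArrayList<>();
        for (String s : byLength) {
            Integer offset = null;
            for (String e : emitted) {
                if (e.endsWith(s)) {
                    // s is a suffix of e: point into the middle of e.
                    offset = offsets.get(e) + e.length() - s.length();
                    break;
                }
            }
            if (offset == null) {
                // No sharing possible: emit the string with a terminator.
                offset = pool.length();
                pool.append(s).append('\0');
                emitted.add(s);
            }
            offsets.put(s, offset);
        }
    }

    /** Offset of a string that was passed to the constructor. */
    int offsetOf(String s) {
        return offsets.get(s);
    }

    /** Reads the NUL-terminated string that starts at the given offset. */
    String read(int offset) {
        return pool.substring(offset, pool.indexOf("\0", offset));
    }

    /** Total number of characters in the pool, terminators included. */
    int size() {
        return pool.length();
    }
}
```

In this sketch, a signature such as "Ljava/lang/String;" occupies no space of its own once a longer string ending in it, such as "(I)Ljava/lang/String;", is in the pool. The quadratic search over emitted strings is kept for brevity; a real tool would more likely use a sorted or hashed lookup.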

Memory reductions that apply to both preloaded and dynamically loaded classes

  • Large static initializer removals:
    The class file format provides no way to express initialized data. For example, to initialize a large static constant array, the Java compiler needs to emit class initializer code that allocates an array and fills it in element by element. For the large arrays typically encountered in code related to internationalization, these static initializers can get quite large. An example is java.lang.Character, where the bulk of the class file size consists of such static initializer code. Classes with large static initializers were identified in the PersonalJava core class libraries and changed to make the initializers unnecessary, thus reducing their size drastically.

     

  • Removal of debug information:
    By default, the virtual machine and JavaCodeCompact now ignore all debugging information in classes, such as the source file name attribute, line number tables, and local variable tables. This information is processed only when a debugging option is passed to the virtual machine and JavaCodeCompact at build time; otherwise it is ignored in order to save memory.

     

  • Data structure reductions:
    An examination of the class memory footprint showed that the class metadata supporting loaded classes took up excessive space. For example, in many cases a methodblock describing a class method takes up much more space than the bytecodes for the method. To rectify this, aggressive changes were made to the virtual machine data structures representing class components. The changes were made in concert with changes in JavaCodeCompact, reducing both the ROM footprint of preloaded classes and the RAM footprint of dynamically loaded classes.

     

    • Make methodblocks smaller: A total of 48 bytes was removed from each methodblock, reducing its size from 92 bytes to 44 bytes. Among the fields removed are direct references to a method's name and signature, and compiler and debugging related information. In the PersonalJava application environment, which has nearly 7000 methods, this reduction alone translates to around 350 kbytes of ROM savings.

    • Remove alignment requirement for method tables: The PersonalJava 1.0 virtual machine required all method tables to be 32-byte aligned, so that the rightmost 5 bits of a methods pointer could be used to indicate an object's type. The virtual machine was modified to eliminate this requirement. This saved an average of 16 bytes per class that were wasted on alignment, and another 4 bytes in the class block that pointed to the non-aligned memory block allocated for the method table.

    • Remove cbSupername, cbFinalizer: The implementation of runtime class linking was changed to obviate the need for the classblock fields indicating the class superclass name and the class finalizer method. This saved 8 bytes per class.

    • Remove wasted first word of the methodtable: A word of space per class was wasted in the PersonalJava 1.0 application environment when storing its methodtable. The virtual machine implementation was changed to be able to pack the methodtable tighter and to eliminate the extra word of storage. This extra word was a vestige of the original implementation and was not used to store any data.

  • Enforce class-file size limits in runtime data structures:
    The PersonalJava 1.0 virtual machine had wasteful internal representations for some class file components which are constrained in size by the class file specification. For example, for certain types of data represented in 16 bits in the class file format, the PersonalJava 1.0 virtual machine allocated a full 32-bit machine word. The types of data which were changed to conform to the class file constraints include:
    • Exception table lengths
    • PC values in line number tables and exception tables
    • Method code lengths
    As a result, the widths of these fields in the internal representations were halved, yielding further memory savings.
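
To put the per-method and per-class figures from the list above in perspective, the arithmetic can be sketched as follows. The class and method names are hypothetical, and the 4-byte word size passed in the example is an assumption (a 32-bit target); the byte counts themselves are the ones quoted in the list.

```java
/** Hypothetical tally of the metadata savings quoted in the list above. */
final class MetadataSavings {
    static final int OLD_METHODBLOCK_BYTES = 92;
    static final int NEW_METHODBLOCK_BYTES = 44;

    /** Bytes saved across all methodblocks for the given number of methods. */
    static int methodblockSavings(int methodCount) {
        return (OLD_METHODBLOCK_BYTES - NEW_METHODBLOCK_BYTES) * methodCount;
    }

    /** Average bytes saved per class; wordBytes is the machine word size. */
    static int perClassSavings(int wordBytes) {
        int alignmentPadding = 16;       // average waste from 32-byte alignment
        int unalignedBlockPointer = 4;   // class block pointer to the raw block
        int supernameAndFinalizer = 8;   // cbSupername and cbFinalizer
        int unusedFirstWord = wordBytes; // wasted first word of the methodtable
        return alignmentPadding + unalignedBlockPointer
                + supernameAndFinalizer + unusedFirstWord;
    }

    public static void main(String[] args) {
        System.out.println(methodblockSavings(7000) + " bytes saved in methodblocks");
        System.out.println(perClassSavings(4) + " bytes saved per class on average");
    }
}
```

For 7000 methods the first figure is 48 × 7000 = 336,000 bytes, which is of the same order as the ROM savings quoted for the methodblock change.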

Numbers

The first comparison is of the PersonalJava 1.1 application environment with all optional packages included (rmi, sql, math, zip, and code signing), with a similarly configured Java 1.1.6 application environment, adjusted to exclude components unsupported by the PersonalJava application environment (e.g. java.security.acl).

 

                                PersonalJava 1.1          Java 1.1.6
                                Application Environment   Application Environment
No. of classes [1]              913                       911
Total size of class files       1.997M                    2.022M
Total class memory footprint    1.593M                    2.222M
% of RAM in total footprint     0.2%                      6.5%

[1] For the PersonalJava 1.1 application environment, this number includes core as well as all optional packages. For the Java application environment, this number includes the packages which would be necessary to implement as close to the same functionality as for a fully configured PersonalJava application environment.

The class memory footprint of the Java 1.1.6 application environment comes out to be around 39.5% larger. The share of that footprint which must reside in RAM is also about 32.5 times that of the PersonalJava 1.1 application environment (6.5% versus 0.2%).
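
The two ratios quoted above follow directly from the table entries. A small sketch of the arithmetic, with class and method names that are illustrative only:

```java
/** Illustrative arithmetic for the first comparison table. */
final class FootprintComparison {
    /** How much larger a is than b, in percent. */
    static double percentLarger(double a, double b) {
        return (a / b - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double personalJavaFootprintM = 1.593; // total class memory footprint
        double javaFootprintM = 2.222;
        double personalJavaRamPercent = 0.2;   // % of RAM in total footprint
        double javaRamPercent = 6.5;

        System.out.printf("footprint: %.1f%% larger%n",
                percentLarger(javaFootprintM, personalJavaFootprintM));
        System.out.printf("RAM share: %.1f times%n",
                javaRamPercent / personalJavaRamPercent);
    }
}
```

Because the RAM percentages in the table are rounded to one decimal place, the second ratio is only a rough indication.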

The second comparison is of the PersonalJava 1.1 application environment with no optional packages included, with the PersonalJava 1.0 application environment.

 

                                PersonalJava 1.0          PersonalJava 1.1
                                Application Environment   Application Environment
No. of classes [2]              641                       689
Total size of class files       1.284M                    1.506M
Total class memory footprint    1.402M                    1.227M
% of RAM in total footprint     7.6%                      0.3%

[2] For both the PersonalJava 1.0 and 1.1 application environments, this number includes core packages only. Since the PersonalJava 1.1 application environment has a richer set of features than the PersonalJava 1.0 application environment, there are more classes in the PersonalJava 1.1 application environment.

The PersonalJava 1.0 application environment comes out to be 14% bigger, even though it has only 83% of the .class data of the PersonalJava 1.1 application environment. Normalized, this translates to the class footprint of the PersonalJava 1.0 application environment being 38% bigger than that of the PersonalJava 1.1 application environment. Similarly, the RAM requirement for the classes has gone down as much as 25-fold when measured relative to total class memory footprint. The normalized value is around 30-fold.

Note that the results shown above are for the core classes of the PersonalJava application environment which are stored primarily in ROM. More important, perhaps, is the memory usage savings which can be achieved in RAM. By recognizing that the core classes contain a representative cross section of the classes which will appear in PersonalJava applications, the above analysis can be extended to dynamically loaded applications which are the primary consumers of RAM. The result is that an equivalent savings is expected for RAM usage.

Native Stack Size Reductions

  • Figuring out what the typical native stack requirements are
  • Ensuring system robustness against stack overflow situations caused by tight stacks
  • Recursion removal and introduction of dynamic stack checks:
    First, all self-recursions and call cycles in the native code which could result in uncontrolled stack overflow were identified. Most of the cycles could be eliminated by iterative rewrites. There were, however, some call cycles that could not be removed. An example is native code to Java bytecode transitions and back: there can essentially be several such transitions, with no real bound on the number. On each such cycle, a routine was chosen which employs a dynamic stack check at its beginning, testing for remaining stack space before executing the current routine. These test points are called safe points, since in the case of insufficient stack space, these are the places to throw a StackOverflowError. If execution were to continue, there would be an uncontrolled stack overflow with possible memory corruption. If the remaining stack space at a safe point is under a certain limit (the stack red zone), a StackOverflowError is thrown.

     

  • Making the stack red zone smaller:
    After having dealt with recursion, all call paths requiring excessive stack usage were identified. The frames that "contributed" the most to these paths were chosen and analyzed on a case by case basis. It was possible to significantly reduce the stack usage of most of these by moving large local variable storage to global static storage for non-reentrant routines and by dynamically allocating storage for reentrant routines. In addition, the worst case stack requirement was reduced on any path between two stack checks by introducing extra stack checks. This allowed the stack red zone to be made smaller.

    These efforts resulted in well-defined, exact stack requirements. Static analysis of stack consumption performed on Solaris™/SPARC™ revealed that the appropriate red zone in the virtual machine implementation is 3.3 kbytes plus the maximum stack consumption of the underlying library functions. As for native method implementations including AWT and socket features, another red zone called the native red zone was introduced. Unfortunately, since some Motif routines require large stacks, the native red zone had to be set to 80 kbytes for the reference implementation on Solaris/SPARC.

  • Checking stack consumption of an actual application:
    The virtual machine's stack usage was checked by running actual applications such as the Personal Applications Browser (a Web browser). The worst case stack usage at any stack check point in the virtual machine code was found to be less than 12 kbytes. At the check point for the native red zone, the maximum usage was less than 10 kbytes. Thus, a typical upper bound for the thread stack size of the SPARC implementation would be:

    upperBound = max {12k + 3.3k + libfuncStackMax, 10k + nativeImplStackMax}

    where libfuncStackMax is the maximum stack consumption of the library functions used by the virtual machine like vfprintf, abort, free, etc., and nativeImplStackMax is the maximum stack consumption of the native method implementations including underlying libraries.
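
The safe-point check and the upper bound above can be modeled in a few lines. The following is a simplified sketch in Java rather than the virtual machine's native code: the class NativeStackModel, its methods, and the stack and frame sizes used with it are all hypothetical, and stack usage is tracked as a plain counter instead of being read from the real stack pointer.

```java
/** Hypothetical model of a native stack guarded by red-zone checks. */
final class NativeStackModel {
    private final int stackBytes;    // total thread stack size
    private final int redZoneBytes;  // space that must remain at a safe point
    private int usedBytes;

    NativeStackModel(int stackBytes, int redZoneBytes) {
        this.stackBytes = stackBytes;
        this.redZoneBytes = redZoneBytes;
    }

    /** Safe point at the top of a routine that lies on a call cycle. */
    void enter(int frameBytes) {
        if (stackBytes - usedBytes < redZoneBytes) {
            // Controlled failure instead of an uncontrolled overflow.
            throw new StackOverflowError(
                    "red zone reached with " + usedBytes + " bytes in use");
        }
        usedBytes += frameBytes;
    }

    /** Returns from a routine previously entered with the same frame size. */
    void leave(int frameBytes) {
        usedBytes -= frameBytes;
    }

    int usedBytes() {
        return usedBytes;
    }

    /** Nests calls until the safe point refuses; frameBytes must be positive. */
    int maxNesting(int frameBytes) {
        int depth = 0;
        try {
            while (true) {
                enter(frameBytes);
                depth++;
            }
        } catch (StackOverflowError e) {
            return depth;
        }
    }

    /** The upper bound formula from the text, with all values in kbytes. */
    static double upperBoundKbytes(double libfuncStackMax,
                                   double nativeImplStackMax) {
        return Math.max(12.0 + 3.3 + libfuncStackMax,
                        10.0 + nativeImplStackMax);
    }
}
```

As long as no routine between two safe points uses more stack than the red zone, the counter in this model can never exceed the stack size, which is the property the red zone is meant to guarantee.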


Note that the SPARC architecture consumes more stack because of its register windows, and that a lower default stack size can be expected for embedded CPUs.

Java Stack Size Reductions


Conclusion

Copyright © 1998 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, CA 94303 USA. All rights reserved.

*As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.
