The Java Language Environment

Java--Simple and Familiar

CHAPTER 2

You know you've achieved perfection in design,

Not when you have nothing more to add,

But when you have nothing more to take away.

Antoine de Saint-Exupéry.

In his science-fiction novel, The Rolling Stones, Robert A. Heinlein comments:

Every technology goes through three stages: first a crudely simple and quite unsatisfactory gadget; second, an enormously complicated group of gadgets designed to overcome the shortcomings of the original and achieving thereby somewhat satisfactory performance through extremely complex compromise; third, a final proper design therefrom.

Heinlein's comment could well describe the evolution of many programming languages. Java presents a new viewpoint in the evolution of programming languages--creation of a small and simple language that's still sufficiently comprehensive to address a wide variety of software application development. Although Java is superficially similar to C and C++, Java gained its simplicity from the systematic removal of features from its predecessors. This chapter discusses two of the primary design features of Java, namely, it's simple (from removing features) and familiar (because it looks like C and C++). The next chapter discusses Java's object-oriented features in more detail. At the end of this chapter you'll find a discussion on features eliminated from C and C++ in the evolution of Java.

Design Goals

Simplicity is one of Java's overriding design goals. Simplicity and removal of many "features" of dubious worth from its C and C++ ancestors keep Java relatively small and reduce the programmer's burden in producing reliable applications. To this end, the Java design team examined many aspects of the "modern" C and C++ languages to determine features that could be eliminated in the context of modern object-oriented programming.

Another major design goal is that Java look familiar to a majority of programmers in the personal computer and workstation arenas, where a large fraction of system programmers and application programmers are familiar with C and C++. Thus, Java "looks like" C++. Programmers familiar with C, Objective C, C++, Eiffel, Ada, and related languages should find their Java language learning curve quite short--on the order of a couple of weeks.

To illustrate the simple and familiar aspects of Java, we follow the tradition of a long line of illustrious programming books by showing you the HelloWorld program. It's about the simplest program you can write that actually does something. Here's HelloWorld implemented in Java.



    class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello world!");
        }
    }
            
        

This example declares a class named HelloWorld. Classes are discussed in the next chapter on object-oriented programming, but in general we assume the reader is familiar with object technology and understands the basics of classes, objects, instance variables, and methods.

Within the HelloWorld class, we declare a single method called main() which in turn contains a single method invocation to display the string "Hello world!" on the standard output. The statement that prints "Hello world!" does so by invoking the println method of the out object. The out object is a class variable in the System class that performs output operations on files. That's all there is to HelloWorld.

2.1 Main Features of the Java™ Programming Language

The Java™ programming language follows C++ to some degree, which carries the benefit of it being familiar to many programmers. This section describes the essential features of the Java programming language and points out where the language diverges from its ancestors C and C++.

2.1.1 Primitive Data Types

Other than the primitive data types discussed here, everything in the Java programming language is an object. Even the primitive data types can be encapsulated inside library-supplied objects if required. The Java programming language follows C and C++ fairly closely in its set of basic data types, with a couple of minor exceptions. There are only three groups of primitive data types, namely, numeric types, character types, and Boolean types.

Numeric Data Types

Integer numeric types are 8-bit byte, 16-bit short, 32-bit int, and 64-bit long. The 8-bit byte data type in Java has replaced the old C and C++ char data type. Java places a different interpretation on the char data type, as discussed below.

There is no unsigned type specifier for integer data types in Java.

Real numeric types are 32-bit float and 64-bit double. Real numeric types and their arithmetic operations are as defined by the IEEE 754 specification. A floating point literal value, like 23.79, is considered double by default; you must explicitly cast it to float if you wish to assign it to a float variable.
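
As a minimal sketch of the numeric rules above, the following fragment shows the signed byte range and two ways of obtaining a float from a floating-point literal. The class name NumericTypesDemo and its method names are illustrative assumptions, not part of any library; the f literal suffix is standard Java syntax that the text above does not mention.

```java
// Illustrative sketch; NumericTypesDemo and its methods are hypothetical names.
class NumericTypesDemo {
    // A double literal must be narrowed explicitly before it can become a float.
    static float castLiteral() {
        return (float) 23.79;
    }

    // The f suffix makes the literal a float from the start.
    static float suffixedLiteral() {
        return 23.79f;
    }

    public static void main(String[] args) {
        // float wrong = 23.79;   // rejected by the compiler: possible loss of precision
        byte smallest = Byte.MIN_VALUE;     // integer types are always signed
        byte largest = Byte.MAX_VALUE;
        System.out.println("byte range: " + smallest + " to " + largest);
        System.out.println("bits in int and long: " + Integer.SIZE + ", " + Long.SIZE);
        System.out.println("same float either way: " + (castLiteral() == suffixedLiteral()));
    }
}
```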

Character Data Types

Java language character data is a departure from traditional C. Java's char data type defines a sixteen-bit Unicode character. Unicode characters are unsigned 16-bit values that define character codes in the range 0 through 65,535. If you write a declaration such as

char myChar = 'Q';

you get a Unicode (16-bit unsigned value) type initialized to the Unicode value of the character Q. By adopting the Unicode character set standard for its character data type, Java language applications are amenable to internationalization and localization, greatly expanding the market for world-wide applications.
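
To make the 16-bit unsigned interpretation concrete, here is a small sketch. CharDemo and codeOf are hypothetical names chosen for illustration; Character.MAX_VALUE is the standard library constant for the largest char value.

```java
// Illustrative sketch; CharDemo and codeOf are hypothetical names.
class CharDemo {
    // Widening a char to int exposes its unsigned 16-bit code value.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char myChar = 'Q';
        char omega = '\u03A9';      // Greek capital omega, written with a Unicode escape
        System.out.println("Q     -> " + codeOf(myChar));
        System.out.println("omega -> " + codeOf(omega));
        System.out.println("max   -> " + codeOf(Character.MAX_VALUE));
    }
}
```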

Boolean Data Types

Java added a Boolean data type as a primitive type, tacitly ratifying existing C and C++ programming practice, where developers define keywords for TRUE and FALSE or YES and NO or similar constructs. A Java boolean variable assumes the value true or false. A Java programming language boolean is a distinct data type; unlike common C practice, a Java programming language boolean type can't be converted to any numeric type.

2.1.2 Arithmetic and Relational Operators

All the familiar C and C++ operators apply. The Java programming language has no unsigned data types, so the >>> operator has been added to the language to indicate an unsigned (logical) right shift. Java also uses the + operator for string concatenation; concatenation is covered below in the discussion on strings.
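
The difference between the two right-shift operators shows up only for negative values. The following sketch (ShiftDemo and its method names are assumptions made for this example) contrasts the sign-propagating >> with the zero-filling >>>:

```java
// Illustrative sketch; ShiftDemo and its methods are hypothetical names.
class ShiftDemo {
    // Arithmetic shift: the sign bit is copied into the vacated positions.
    static int arithmeticShift(int value, int bits) {
        return value >> bits;
    }

    // Logical shift: zeros are shifted in, as for an unsigned value in C.
    static int logicalShift(int value, int bits) {
        return value >>> bits;
    }

    public static void main(String[] args) {
        int value = -8;
        System.out.println(value + " >>  1 = " + arithmeticShift(value, 1));
        System.out.println(value + " >>> 1 = " + logicalShift(value, 1));
    }
}
```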

2.1.3 Arrays

In contrast to C and C++, the Java programming language arrays are first-class language objects. An array in the Java programming language is a real object with a run-time representation. You can declare and allocate arrays of any type, and you can allocate arrays of arrays to obtain multi-dimensional arrays.

You declare an array of, say, Points (a class you've declared elsewhere) with a declaration like this:

Point myPoints[];

This code states that myPoints is an uninitialized array of Points. At this time, the only storage allocated for myPoints is a reference handle. At some future time you must allocate the amount of storage you need, as in:

myPoints = new Point[10];

to allocate an array of ten references to Points that are initialized to the null reference. Notice that this allocation of an array doesn't actually allocate any objects of the Point class for you; you will have to also allocate the Point objects, something like this:


    
    int i;
    for (i = 0; i < 10; i++) {
        myPoints[i] = new Point();
    }
    
    

Access to elements of myPoints can be performed via normal C-style indexing, but all array accesses are checked to ensure that their indices are within the range of the array. An exception is generated if the index is outside the bounds of the array.

The length of an array is stored in the length instance variable of the specific array: myPoints.length contains the number of elements in myPoints. For instance, the code fragment:

howMany = myPoints.length;

would assign the value 10 to the howMany variable.

The C notion of a pointer to an array of memory elements is gone, and with it, the arbitrary pointer arithmetic that leads to unreliable code in C. No longer can you walk off the end of an array, possibly trashing memory and leading to the famous "delayed-crash" syndrome, where a memory-access violation today manifests itself hours or days later. Programmers can be confident that array checking in Java will lead to more robust and reliable code.
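
The array behavior described in this section can be sketched in one self-contained fragment. ArrayDemo, makePoints, and isOutOfBounds are hypothetical names, and the nested Point class is only a stand-in for the Point class the text assumes you have declared elsewhere. ArrayIndexOutOfBoundsException is the standard exception raised by a failed bounds check.

```java
// Illustrative sketch; ArrayDemo and its methods are hypothetical names.
class ArrayDemo {
    // Minimal stand-in for the Point class assumed by the text.
    static class Point {
        double x;
        double y;
    }

    // Allocates the array of references, then the Point objects themselves.
    static Point[] makePoints(int n) {
        Point[] points = new Point[n];      // n null references, no Point objects yet
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point();
        }
        return points;
    }

    // Returns true if reading index i fails the run-time bounds check.
    static boolean isOutOfBounds(Point[] points, int i) {
        try {
            Point ignored = points[i];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Point[] myPoints = makePoints(10);
        int[][] grid = new int[3][4];       // an array of arrays
        System.out.println("elements: " + myPoints.length);
        System.out.println("grid: " + grid.length + " by " + grid[0].length);
        System.out.println("index 10 rejected: " + isOutOfBounds(myPoints, 10));
    }
}
```

Catching the exception here is purely for demonstration; in ordinary code an out-of-range index usually signals a bug to be fixed rather than a condition to be handled.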

2.1.4 Strings

Strings are Java programming language objects, not pseudo-arrays of characters as in C. There are actually two kinds of string objects: the String class is for read-only (immutable) objects, and the StringBuffer class is for string objects you wish to modify (mutable string objects).

Although strings are Java programming language objects, the Java compiler follows the C tradition of providing a syntactic convenience that C programmers have enjoyed with C-style strings: the compiler understands that a string of characters enclosed in double quote signs is to be instantiated as a String object. Thus, the declaration:

String hello = "Hello world!";

instantiates an object of the String class behind the scenes and initializes it with a character string containing the Unicode character representation of "Hello world!".

Java technology has extended the meaning of the + operator to indicate string concatenation. Thus you can write statements like:

System.out.println("There are " + num + " characters in the file.");

This code fragment concatenates the string "There are " with the result of converting the numeric value num to a string, and concatenates that with the string " characters in the file.". Then it prints the result of those concatenations on the standard output.

String objects provide a length() accessor method to obtain the number of characters in the string.
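
The following sketch pulls the string features of this section together. StringDemo and its method names are hypothetical; String, StringBuffer, append, toString, and length are the standard library names.

```java
// Illustrative sketch; StringDemo and its methods are hypothetical names.
class StringDemo {
    // The + operator converts num to a string and concatenates the pieces.
    static String describe(int num) {
        return "There are " + num + " characters in the file.";
    }

    // A StringBuffer is modified in place, then converted to an immutable String.
    static String buildGreeting() {
        StringBuffer buffer = new StringBuffer("Hello");
        buffer.append(' ').append("world").append('!');
        return buffer.toString();
    }

    public static void main(String[] args) {
        String hello = buildGreeting();
        System.out.println(hello + " has " + hello.length() + " characters");
        System.out.println(describe(hello.length()));
    }
}
```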

2.1.5 Multi-Level Break

The Java programming language has no goto statement. To break or continue multiple-nested loop or switch constructs, you can place labels on loop and switch constructs, and then break out of or continue to the block named by the label. Here's a small fragment of code from the Java programming language built-in String class:




    test:
    for (int i = fromIndex; i + max1 <= max2; i++) {
        if (charAt(i) == c0) {
            for (int k = 1; k < max1; k++) {
                if (charAt(i + k) != str.charAt(k)) {
                    continue test;
                }
            }   /* end of inner for loop */
        }
    }           /* end of outer for loop */

    
  

The continue test statement is inside a for loop nested inside another for loop. By referencing the label test, the continue statement passes control to the outer for statement. In traditional C, a continue statement can only continue the immediately enclosing loop; to continue or exit outer blocks, programmers have traditionally either used auxiliary Boolean variables, whose only purpose is to determine whether the outer block is to be continued or exited, or (mis)used the goto statement to exit out of nested blocks. Use of labelled blocks in the Java programming language leads to considerable simplification in programming effort and a major reduction in maintenance.
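
Because the fragment above depends on fields of the String class, here is a self-contained sketch of the same idea. LabelDemo, indexOf, and contains are hypothetical names, and the search is a deliberately naive one written for this example, not the library's implementation.

```java
// Illustrative sketch; LabelDemo and its methods are hypothetical names.
class LabelDemo {
    // Labelled continue: on a mismatch, move straight to the next start position.
    static int indexOf(String text, String pattern) {
        int lastStart = text.length() - pattern.length();
        search:
        for (int i = 0; i <= lastStart; i++) {
            for (int k = 0; k < pattern.length(); k++) {
                if (text.charAt(i + k) != pattern.charAt(k)) {
                    continue search;
                }
            }
            return i;       // every character of the pattern matched at position i
        }
        return -1;          // no match anywhere
    }

    // Labelled break: leave both loops as soon as the target is found.
    static boolean contains(int[][] grid, int target) {
        boolean found = false;
        scan:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    found = true;
                    break scan;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("hello world", "world"));
        System.out.println(contains(new int[][] { { 1, 2 }, { 3, 4 } }, 3));
    }
}
```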

The notion of labelled blocks dates back to the mid-1970s, but it hasn't caught on to any large extent in modern programming languages. Perl is another modern programming language that implements the concept of labelled blocks. Perl's next label and last label are equivalent to continue label and break label statements in Java.

2.1.6 Memory Management and Garbage Collection

C and C++ programmers are by now accustomed to the problems of explicitly managing memory: allocating memory, freeing memory, and keeping track of what memory can be freed when. Explicit memory management has proved to be a fruitful source of bugs, crashes, memory leaks, and poor performance.

Java technology completely removes the memory management load from the programmer. C-style pointers, pointer arithmetic, malloc, and free do not exist. Automatic garbage collection is an integral part of Java and its run-time system. While Java technology has a new operator to allocate memory for objects, there is no explicit free function. Once you have allocated an object, the run-time system keeps track of the object's status and automatically reclaims memory when objects are no longer in use, freeing memory for future use.

Java technology's memory management model is based on objects and references to objects. Java technology has no pointers. Instead, all references to allocated storage, which in practice means all references to an object, are through symbolic "handles". The Java technology memory manager keeps track of references to objects. When an object has no more references, the object is a candidate for garbage collection.

Java technology's memory allocation model and automatic garbage collection make your programming task easier, eliminate entire classes of bugs, and in general provide better performance than you'd obtain through explicit memory management. Here's a code fragment that illustrates when garbage collection happens:


 
    class ReverseString {
        public static String reverseIt(String source) {
            int len = source.length();
            // Temporary buffer; dest is the only reference to it.
            StringBuffer dest = new StringBuffer(len);
            for (int i = len - 1; i >= 0; i--) {
                dest.append(source.charAt(i));
            }
            return dest.toString();
        }
    }
    
    

The variable dest is used as a temporary object reference during execution of the reverseIt method. When dest goes out of scope (the reverseIt method returns), the reference to that object has gone away and it's then a candidate for garbage collection.

2.1.7 The Background Garbage Collector

The Java technology garbage collector achieves high performance by taking advantage of the nature of a user's behavior when interacting with software applications such as the HotJava™ browser. The typical user of the typical interactive application has many natural pauses where they're contemplating the scene in front of them or thinking of what to do next. The Java run-time system takes advantage of these idle periods and runs the garbage collector in a low priority thread when no other threads are competing for CPU cycles. The garbage collector gathers and compacts unused memory, increasing the probability that adequate memory resources are available when needed during periods of heavy interactive use.

This use of a thread to run the garbage collector is just one of many examples of the synergy one obtains from Java technology's integrated multithreading capabilities--an otherwise intractable problem is solved in a simple and elegant fashion.

2.1.8 Integrated Thread Synchronization

Java technology supports multithreading, both at the language (syntactic) level and via support from its run-time system and thread objects. While other systems have provided facilities for multithreading (usually via "lightweight process" libraries), building multithreading support into the language itself gives the programmer a simpler and more powerful tool for creating thread-safe multithreaded classes. Multithreading is discussed in more detail in Chapter 7.
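
As a brief preview of what language-level support looks like, the sketch below uses the synchronized keyword to make a small class safe to share between two threads. SyncDemo, Counter, and runTwoThreads are hypothetical names invented for this example; Thread, Runnable, and synchronized are the standard facilities.

```java
// Illustrative sketch; SyncDemo, Counter, and runTwoThreads are hypothetical names.
class SyncDemo {
    static class Counter {
        private int count = 0;

        // synchronized: only one thread at a time may run this method on a given Counter.
        synchronized void increment() {
            count++;
        }

        synchronized int get() {
            return count;
        }
    }

    // Runs two threads that each call increment() perThread times; returns the final count.
    static int runTwoThreads(final int perThread) {
        final Counter counter = new Counter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10000));
    }
}
```

Without the synchronized keyword on increment(), the two threads could interleave their updates and some increments could be lost.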

2.2 Features Removed from C and C++

The earlier part of this chapter concentrated on the principal features of Java. This section discusses features removed from C and C++ in the evolution of Java.

The first step was to eliminate redundancy from C and C++. In many ways, the C language evolved into a collection of overlapping features, providing too many ways to say the same thing, while in many cases not providing needed features. C++, in an attempt to add "classes in C", merely added more redundancy while retaining many of the inherent problems of C.

2.2.1 No More Typedefs, Defines, or Preprocessor

Source code written in Java is simple. There is no preprocessor, no #define and related capabilities, no typedef, and absent those features, no longer any need for header files. Instead of header files, Java language source files provide the declarations of other classes and their methods.

A major problem with C and C++ is the amount of context you need to understand another programmer's code: you have to read all related header files, all related #defines, and all related typedefs before you can even begin to analyze a program. In essence, programming with #defines and typedefs results in every programmer inventing a new programming language that's incomprehensible to anybody other than its creator, thus defeating the goals of good programming practices.

In Java, you obtain the effects of #define by using constants. You obtain the effects of typedef by declaring classes--after all, a class effectively declares a new type. You don't need header files because the Java compiler compiles class definitions into a binary form that retains all the type information through to link time.
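
A minimal sketch of both substitutions follows. ConstantsDemo, BUFFER_SIZE, and Celsius are hypothetical names, and the C equivalents in the comments are only one way the same intent might have been written.

```java
// Illustrative sketch; ConstantsDemo, BUFFER_SIZE, and Celsius are hypothetical names.
class ConstantsDemo {
    // Where C might say: #define BUFFER_SIZE 1024
    static final int BUFFER_SIZE = 1024;

    // Where C might say: typedef double Celsius;
    // Here the new type is a class, so the compiler keeps it distinct from a plain double.
    static class Celsius {
        final double degrees;

        Celsius(double degrees) {
            this.degrees = degrees;
        }

        double toFahrenheit() {
            return degrees * 9.0 / 5.0 + 32.0;
        }
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[BUFFER_SIZE];
        Celsius boiling = new Celsius(100.0);
        System.out.println(buffer.length + " bytes; boiling point in Fahrenheit: " + boiling.toFahrenheit());
    }
}
```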

By removing all this baggage, Java becomes remarkably context-free. Programmers can read and understand code and, more importantly, modify and reuse code much faster and easier.

2.2.2 No More Structures or Unions

Java has no structures or unions as complex data types. You don't need structures and unions when you have classes; you can achieve the same effect simply by declaring a class with the appropriate instance variables.

The code fragment below declares a class called Point.



    class Point extends Object {
        double x;
        double y;

        // methods to access the instance variables
    }
 

The following code fragment declares a class called Rectangle that uses objects of the Point class as instance variables.


  
    class Rectangle extends Object {
        Point lowerLeft;
        Point upperRight;

        // methods to access the instance variables
    }

In C you'd define these classes as structures. In Java, you simply declare classes. You can make the instance variables as private or as public as you wish, depending on how much you wish to hide the details of the implementation from other objects.

2.2.3 No Enums

Java has no enum types. You can obtain something similar to enum by declaring a class whose only raison d'etre is to hold constants. You could use this feature something like this:


  
    class Direction extends Object {
        public static final int North = 1;
        public static final int South = 2;
        public static final int East  = 3;
        public static final int West  = 4;
    }
    
    

You can now refer to, say, the South constant using the notation Direction.South.

Using classes to contain constants in this way provides a major advantage over C's enum types. In C (and C++), names defined in enums must be unique: if you have an enum called HotColors containing names Red and Yellow, you can't use those names in any other enum. You couldn't, for instance, define another enum called TrafficLightColors also containing Red and Yellow.

Using the class-to-contain-constants technique in Java, you can use the same names in different classes, because those names are qualified by the name of the containing class. From our example just above, you might wish to create another class called CompassRose:


  
    class CompassRose extends Object {
        public static final int North     = 1;
        public static final int NorthEast = 2;
        public static final int East      = 3;
        public static final int SouthEast = 4;
        public static final int South     = 5;
        public static final int SouthWest = 6;
        public static final int West      = 7;
        public static final int NorthWest = 8;
    }

There is no ambiguity because the name of the containing class acts as a qualifier for the constants. In the second example, you would use the notation CompassRose.NorthWest to access the corresponding value. Java effectively provides you the concept of qualified enums, all within the existing class mechanisms.
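
To show the two sets of constants coexisting, the sketch below nests abbreviated copies of both classes inside a hypothetical QualifiedConstantsDemo class and uses the Direction constants as switch labels. The nameOf method is an assumption made for this example.

```java
// Illustrative sketch; QualifiedConstantsDemo and nameOf are hypothetical names.
class QualifiedConstantsDemo {
    static class Direction {
        public static final int North = 1;
        public static final int South = 2;
        public static final int East  = 3;
        public static final int West  = 4;
    }

    // Abbreviated copy of CompassRose: only the values used below.
    static class CompassRose {
        public static final int North     = 1;
        public static final int South     = 5;
        public static final int NorthWest = 8;
    }

    // static final int constants can serve as case labels.
    static String nameOf(int direction) {
        switch (direction) {
            case Direction.North: return "North";
            case Direction.South: return "South";
            case Direction.East:  return "East";
            case Direction.West:  return "West";
            default:              return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(nameOf(Direction.South));
        System.out.println("Direction.South = " + Direction.South
                + ", CompassRose.South = " + CompassRose.South);
    }
}
```

Note that the constants are plain int values, so the qualification is a naming convenience only: the compiler will not object if a CompassRose value is passed where a Direction value is expected.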

2.2.4 No More Functions

Java has no functions. Object-oriented programming supersedes functional and procedural styles. Mixing the two styles just leads to confusion and dilutes the purity of an object-oriented language. Anything you can do with a function you can do just as well by defining a class and creating methods for that class. Consider the Point class from above. We've added public methods to set and access the instance variables:



    class Point extends Object {
        double x;
        double y;

        public void setX(double x) {
            this.x = x;
        }

        public void setY(double y) {
            this.y = y;
        }

        public double x() {
            return x;
        }

        public double y() {
            return y;
        }
    }
   
   

If the x and y instance variables are private to this class, the only means to access them is via the public methods of the class. Here's how you'd use objects of the Point class from within, say, an object of the Rectangle class:



    class Rectangle extends Object {
        Point lowerLeft;
        Point upperRight;

        public void setEmptyRect() {
            lowerLeft.setX(0.0);
            lowerLeft.setY(0.0);
            upperRight.setX(0.0);
            upperRight.setY(0.0);
        }
    }

   
 

This is not to say that functions and procedures are inherently wrong. But given classes and methods, we're now down to only one way to express a given task. By eliminating functions, your job as a programmer is immensely simplified: you work only with classes and their methods.
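
The fragments above can be combined into a runnable sketch. ShapesDemo, isEmpty, and distance are hypothetical additions, and two details are assumptions beyond the original fragments: the instance variables are declared private, and Rectangle allocates its own Point objects so that the setter calls have objects to act on. The distance method shows how what would be a free-standing function in C becomes a static method of a class.

```java
// Illustrative sketch; ShapesDemo, isEmpty, and distance are hypothetical names.
class ShapesDemo {
    static class Point {
        private double x;
        private double y;

        public void setX(double x) { this.x = x; }
        public void setY(double y) { this.y = y; }
        public double x() { return x; }
        public double y() { return y; }
    }

    static class Rectangle {
        // Assumption: the corner Points are allocated when the Rectangle is created.
        private final Point lowerLeft = new Point();
        private final Point upperRight = new Point();

        public void setEmptyRect() {
            lowerLeft.setX(0.0);
            lowerLeft.setY(0.0);
            upperRight.setX(0.0);
            upperRight.setY(0.0);
        }

        public boolean isEmpty() {
            return lowerLeft.x() == upperRight.x() && lowerLeft.y() == upperRight.y();
        }
    }

    // A C-style function recast as a static method.
    static double distance(Point a, Point b) {
        double dx = a.x() - b.x();
        double dy = a.y() - b.y();
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        Point origin = new Point();
        Point corner = new Point();
        corner.setX(3.0);
        corner.setY(4.0);
        Rectangle rect = new Rectangle();
        rect.setEmptyRect();
        System.out.println("distance: " + distance(origin, corner));
        System.out.println("empty: " + rect.isEmpty());
    }
}
```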

2.2.5 No More Multiple Inheritance

Multiple inheritance--and all the problems it generates--was discarded from Java. The desirable features of multiple inheritance are provided by interfaces--conceptually similar to Objective C protocols.

An interface is not a definition of a class. Rather, it's a definition of a set of methods that one or more classes will implement. An important aspect of interfaces is that they declare only methods and constants. Variables may not be defined in interfaces.
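
Here is a minimal sketch of interfaces standing in for multiple inheritance. All of the names (InterfaceDemo, Shape, Named, Square, Rect, totalArea) are hypothetical and chosen only for illustration.

```java
// Illustrative sketch; every name in this block is hypothetical.
class InterfaceDemo {
    interface Shape {
        int DIMENSIONS = 2;     // implicitly public static final, so a constant
        double area();          // implicitly public and abstract
    }

    interface Named {
        String name();
    }

    // One class may implement several interfaces.
    static class Square implements Shape, Named {
        private final double side;

        Square(double side) { this.side = side; }

        public double area() { return side * side; }
        public String name() { return "square"; }
    }

    static class Rect implements Shape {
        private final double width;
        private final double height;

        Rect(double width, double height) {
            this.width = width;
            this.height = height;
        }

        public double area() { return width * height; }
    }

    // Works with any mix of classes that implement Shape.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (int i = 0; i < shapes.length; i++) {
            total += shapes[i].area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2.0), new Rect(2.0, 3.0) };
        System.out.println("total area: " + totalArea(shapes));
    }
}
```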

2.2.6 No More Goto Statements

Java has no goto statement. Studies illustrated that goto is (mis)used more often than not simply "because it's there". Eliminating goto led to a simplification of the language--there are no rules about the effects of a goto into the middle of a for statement, for example. Studies on approximately 100,000 lines of C code determined that roughly 90 percent of the goto statements were used purely to obtain the effect of breaking out of nested loops. As mentioned above, multi-level break and continue remove most of the need for goto statements.

2.2.7 No More Operator Overloading

There are no means provided by which programmers can overload the standard arithmetic operators. Once again, the effects of operator overloading can be just as easily achieved by declaring a class, appropriate instance variables, and appropriate methods to manipulate those variables. Eliminating operator overloading leads to great simplification of code.
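
As a sketch of the method-based alternative, the hypothetical Complex class below offers plus and times methods where a C++ class might overload + and *. None of these names come from a library.

```java
// Illustrative sketch; ComplexDemo, Complex, plus, and times are hypothetical names.
class ComplexDemo {
    static class Complex {
        final double re;
        final double im;

        Complex(double re, double im) {
            this.re = re;
            this.im = im;
        }

        // Where C++ might overload operator+.
        Complex plus(Complex other) {
            return new Complex(re + other.re, im + other.im);
        }

        // Where C++ might overload operator*.
        Complex times(Complex other) {
            return new Complex(re * other.re - im * other.im,
                               re * other.im + im * other.re);
        }
    }

    public static void main(String[] args) {
        Complex a = new Complex(1.0, 2.0);
        Complex b = new Complex(3.0, 4.0);
        Complex product = a.times(b);
        System.out.println(product.re + " + " + product.im + "i");
    }
}
```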

2.2.8 No More Automatic Coercions

Java prohibits C and C++ style automatic coercions. If you wish to coerce a data element of one type to a data type that would result in loss of precision, you must do so explicitly by using a cast. Consider this code fragment:



    int myInt;
    double myFloat = 3.14159;
    myInt = myFloat;

   
 

The assignment of myFloat to myInt would result in a compiler error indicating a possible loss of precision and that you must use an explicit cast. Thus, you should rewrite the code fragment as:



    int myInt;
    double myFloat = 3.14159;
    myInt = (int)myFloat;
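
The rule applies only to conversions that can lose information. The sketch below (CastDemo, narrow, and widen are hypothetical names) shows that a narrowing conversion needs a cast and discards the fractional part, while a widening conversion from int to double still happens without one.

```java
// Illustrative sketch; CastDemo, narrow, and widen are hypothetical names.
class CastDemo {
    // Narrowing: the explicit cast is required, and the fraction is discarded.
    static int narrow(double value) {
        return (int) value;
    }

    // Widening: every int fits in a double, so no cast is needed.
    static double widen(int value) {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(narrow(3.14159));
        System.out.println(narrow(-3.99));
        System.out.println(widen(7));
    }
}
```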

   
   

2.2.9 No More Pointers

Most studies agree that pointers are one of the primary features that enable programmers to inject bugs into their code. Given that structures are gone, and arrays and strings are objects, the need for pointers to these constructs goes away. Thus, Java has no pointer data types. Any task that would require arrays, structures, and pointers in C can be more easily and reliably performed by declaring objects and arrays of objects. Instead of complex pointer manipulation on array pointers, you access arrays by their arithmetic indices. The Java run-time system checks all array indexing to ensure indices are within the bounds of the array.

You no longer have dangling pointers and trashing of memory because of incorrect pointers, because there are no pointers in Java.
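
A linked list is the classic pointer exercise in C. The sketch below builds and walks one using only object references; ReferenceDemo, Node, fromArray, and sum are hypothetical names made up for this example.

```java
// Illustrative sketch; ReferenceDemo, Node, fromArray, and sum are hypothetical names.
class ReferenceDemo {
    static class Node {
        final int value;
        Node next;      // reference to the following node, or null at the end of the list

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Builds the list back to front so that the head holds values[0].
    static Node fromArray(int[] values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            head = new Node(values[i], head);
        }
        return head;
    }

    // Follows the references from node to node; no address arithmetic is involved.
    static int sum(Node head) {
        int total = 0;
        for (Node node = head; node != null; node = node.next) {
            total += node.value;
        }
        return total;
    }

    public static void main(String[] args) {
        Node list = fromArray(new int[] { 1, 2, 3 });
        System.out.println(sum(list));
    }
}
```

Nodes that become unreachable are reclaimed by the garbage collector, so there is no counterpart to the C free call anywhere in this code.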

2.3 Summary

To sum up this chapter, Java is:

  • Simple--the number of language constructs you need to understand to get your job done is minimal.
  • Familiar--Java looks like C and C++ while discarding the overwhelming complexities of those languages.

Now that you've seen how Java was simplified by removal of features from its predecessors, read the next chapter for a discussion on the object-oriented features of Java.