You know you've achieved perfection in design,
Not when you have nothing more to add,
But when you have nothing more to take away.
In his science-fiction novel, The Rolling Stones, Robert A. Heinlein comments:
Every technology goes through three stages: first a crudely simple and quite unsatisfactory gadget; second, an enormously complicated group of gadgets designed to overcome the shortcomings of the original and achieving thereby somewhat satisfactory performance through extremely complex compromise; third, a final proper design therefrom.
Heinlein's comment could well describe the evolution of many programming languages. Java presents a new viewpoint in the evolution of programming languages--creation of a small and simple language that's still sufficiently comprehensive to address a wide variety of software application development. Although Java is superficially similar to C and C++, Java gained its simplicity from the systematic removal of features from its predecessors. This chapter discusses two of the primary design features of Java, namely, it's simple (from removing features) and familiar (because it looks like C and C++). The next chapter discusses Java's object-oriented features in more detail. At the end of this chapter you'll find a discussion on features eliminated from C and C++ in the evolution of Java.
Simplicity is one of Java's overriding design goals. Simplicity and removal of many "features" of dubious worth from its C and C++ ancestors keep Java relatively small and reduce the programmer's burden in producing reliable applications. To this end, the Java design team examined many aspects of the "modern" C and C++ languages to determine features that could be eliminated in the context of modern object-oriented programming.
Another major design goal is that Java look familiar to a majority of programmers in the personal computer and workstation arenas, where a large fraction of system programmers and application programmers are familiar with C and C++. Thus, Java "looks like" C++. Programmers familiar with C, Objective C, C++, Eiffel, Ada, and related languages should find their Java language learning curve quite short--on the order of a couple of weeks.
To illustrate the simple and familiar aspects of Java, we follow the tradition of a long line of illustrious programming books by showing you the HelloWorld program. It's about the simplest program you can write that actually does something. Here's HelloWorld implemented in Java.
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
This example declares a class named HelloWorld. Classes are discussed in the next chapter on object-oriented programming, but in general we assume the reader is familiar with object technology and understands the basics of classes, objects, instance variables, and methods.
Within the HelloWorld class, we declare a single method called main() which in turn contains a single method invocation to display the string "Hello world!" on the standard output. The statement that prints "Hello world!" does so by invoking the println method of the out object. The out object is a class variable in the System class that performs output operations on files. That's all there is to HelloWorld.
The Java programming language follows C++ to some degree, which carries the benefit of it being familiar to many programmers. This section describes the essential features of the Java programming language and points out where the language diverges from its ancestors C and C++.
Other than the primitive data types discussed here, everything in the Java programming language is an object. Even the primitive data types can be encapsulated inside library-supplied objects if required. The Java programming language follows C and C++ fairly closely in its set of basic data types, with a couple of minor exceptions. There are only three groups of primitive data types, namely, numeric types, character types, and Boolean types.
Numeric Data Types
Integer numeric types are 8-bit byte, 16-bit short, 32-bit int, and 64-bit long. The 8-bit byte data type in Java has replaced the old C and C++ char data type. Java places a different interpretation on the char data type, as discussed below.
There is no unsigned type specifier for integer data types in Java.
Real numeric types are 32-bit float and 64-bit double. Real numeric types and their arithmetic operations are as defined by the IEEE 754 specification. A floating point literal value, like 23.79, is considered double by default; you must explicitly cast it to float if you wish to assign it to a float variable.
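As a minimal sketch of the numeric types described above, the following class prints the fixed bit widths and shows the cast needed to store a double literal in a float. The class and method names are hypothetical, chosen only for illustration; the f suffix shown in suffixLiteral is an alternative to the cast that the text does not mention.

```java
// Illustrative sketch; NumericTypesDemo and its methods are hypothetical names.
class NumericTypesDemo {
    // 23.79 is a double literal, so an explicit cast is needed to narrow it to float.
    static float castLiteral() {
        return (float) 23.79;
    }

    // The f suffix writes a float literal directly, avoiding the cast.
    static float suffixLiteral() {
        return 23.79f;
    }

    public static void main(String[] args) {
        // Bit widths are fixed by the language, not by the platform.
        System.out.println(Byte.SIZE + " " + Short.SIZE + " "
                + Integer.SIZE + " " + Long.SIZE);       // 8 16 32 64
        System.out.println(Float.SIZE + " " + Double.SIZE); // 32 64

        float f = castLiteral();
        double d = 23.79;
        System.out.println(f + " " + d);
    }
}
```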
Character Data Types
Java language character data is a departure from traditional C. Java's char data type defines a 16-bit Unicode character. Unicode characters are unsigned 16-bit values that define character codes in the range 0 through 65,535. If you write a declaration such as
char myChar = 'Q';
you get a Unicode (16-bit unsigned value) type initialized to the Unicode value of the character Q. By adopting the Unicode character set standard for its character data type, Java language applications are amenable to internationalization and localization, greatly expanding the market for world-wide applications.
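The unsigned 16-bit nature of char can be checked directly. The sketch below (CharDemo and codeOf are hypothetical names) widens a char to int to expose its character code and prints the limits of the type.

```java
// Illustrative sketch; CharDemo and codeOf are hypothetical names.
class CharDemo {
    // A char widens to int without a cast, yielding its 16-bit character code.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char myChar = 'Q';
        System.out.println(codeOf(myChar));             // 81
        System.out.println((int) Character.MIN_VALUE);  // 0
        System.out.println((int) Character.MAX_VALUE);  // 65535
    }
}
```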
Boolean Data Types
Java added a Boolean data type as a primitive type, tacitly ratifying existing C and C++ programming practice, where developers define keywords for TRUE and FALSE or YES and NO or similar constructs. A Java boolean variable assumes the value true or false. A Java programming language boolean is a distinct data type; unlike common C practice, a Java programming language boolean type can't be converted to any numeric type.
2.1.2 Arithmetic and Relational Operators
All the familiar C and C++ operators apply. The Java programming language has no unsigned data types, so the >>> operator has been added to the language to indicate an unsigned (logical) right shift. Java also uses the + operator for string concatenation; concatenation is covered below in the discussion on strings.
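To make the difference between the two right-shift operators concrete, here is a small sketch (ShiftDemo and its methods are hypothetical names). The arithmetic shift >> copies the sign bit into the vacated position, while the logical shift >>> fills it with zero.

```java
// Illustrative sketch; ShiftDemo and its methods are hypothetical names.
class ShiftDemo {
    // Arithmetic right shift: the sign bit is replicated.
    static int arithmeticShift(int value, int bits) {
        return value >> bits;
    }

    // Logical (unsigned) right shift: vacated high bits are filled with zeros.
    static int logicalShift(int value, int bits) {
        return value >>> bits;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticShift(-8, 1)); // -4
        System.out.println(logicalShift(-8, 1));    // 2147483644
        // For non-negative values the two operators agree.
        System.out.println(arithmeticShift(8, 1) == logicalShift(8, 1)); // true
    }
}
```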
In contrast to C and C++, arrays in the Java programming language are first-class language objects. An array in the Java programming language is a real object with a run-time representation. You can declare and allocate arrays of any type, and you can allocate arrays of arrays to obtain multi-dimensional arrays.
You declare an array of, say, Points (a class you've declared elsewhere) with a declaration like this:
Point myPoints[];
This code states that myPoints is an uninitialized array of Points. At this time, the only storage allocated for myPoints is a reference handle. At some future time you must allocate the amount of storage you need, as in:
myPoints = new Point[10];
to allocate an array of ten references to Points that are initialized to the null reference. Notice that this allocation of an array doesn't actually allocate any objects of the Point class for you; you will have to also allocate the Point objects, something like this:
int i;
for (i = 0; i < 10; i++) {
    myPoints[i] = new Point();
}
Access to elements of myPoints can be performed via normal C-style indexing, but all array accesses are checked to ensure that their indices are within the range of the array. An exception is generated if the index is outside the bounds of the array.
The length of an array is stored in the length instance variable of the specific array: myPoints.length contains the number of elements in myPoints. For instance, the code fragment:
howMany = myPoints.length;
would assign the value 10 to the howMany variable.
The C notion of a pointer to an array of memory elements is gone, and with it, the arbitrary pointer arithmetic that leads to unreliable code in C. No longer can you walk off the end of an array, possibly trashing memory and leading to the famous "delayed-crash" syndrome, where a memory-access violation today manifests itself hours or days later. Programmers can be confident that array checking in Java will lead to more robust and reliable code.
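The run-time bounds check can be observed with a short sketch. It uses an int array rather than the Point class so that it stands alone; ArrayBoundsDemo and isOutOfBounds are hypothetical names. The exception caught here, ArrayIndexOutOfBoundsException, is the one the run-time system raises for an invalid index.

```java
// Illustrative sketch; ArrayBoundsDemo and isOutOfBounds are hypothetical names.
class ArrayBoundsDemo {
    // Returns true if reading values[index] is rejected by the run-time bounds check.
    static boolean isOutOfBounds(int[] values, int index) {
        try {
            int unused = values[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] values = new int[10];          // elements are initialized to 0
        int howMany = values.length;
        System.out.println(howMany);                    // 10
        System.out.println(isOutOfBounds(values, 9));   // false
        System.out.println(isOutOfBounds(values, 10));  // true
        System.out.println(isOutOfBounds(values, -1));  // true
    }
}
```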
Strings are Java programming language objects, not pseudo-arrays of characters as in C. There are actually two kinds of string objects: the String class is for read-only (immutable) objects. The StringBuffer class is for string objects you wish to modify (mutable string objects).
Although strings are Java programming language objects, the Java compiler follows the C tradition of providing a syntactic convenience that C programmers have enjoyed with C-style strings, namely, the Java compiler understands that a string of characters enclosed in double quote signs is to be instantiated as a String object. Thus, the declaration:
String hello = "Hello world!";
instantiates an object of the String class behind the scenes and initializes it with a character string containing the Unicode character representation of "Hello world!".
Java technology has extended the meaning of the + operator to indicate string concatenation. Thus you can write statements like:
System.out.println("There are " + num + " characters in the file.");
This code fragment concatenates the string "There are " with the result of converting the numeric value num to a string, and concatenates that with the string " characters in the file."; it then prints the result of those concatenations on the standard output.
String objects provide a length() accessor method to obtain the number of characters in the string.
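The following sketch pulls the string features above together: concatenation with +, the length() accessor, and the split between immutable String and mutable StringBuffer. The names StringDemo, describe, and withSuffix are hypothetical.

```java
// Illustrative sketch; StringDemo, describe and withSuffix are hypothetical names.
class StringDemo {
    // The + operator converts num to a string and concatenates the pieces.
    static String describe(int num) {
        return "There are " + num + " characters in the file.";
    }

    // A StringBuffer is modified in place; the original String is left unchanged.
    static String withSuffix(String base, String suffix) {
        StringBuffer buffer = new StringBuffer(base);
        buffer.append(suffix);
        return buffer.toString();
    }

    public static void main(String[] args) {
        String hello = "Hello world!";
        System.out.println(hello.length());              // 12
        System.out.println(describe(hello.length()));
        System.out.println(withSuffix(hello, " Again."));
        System.out.println(hello);                       // Hello world!
    }
}
```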
The Java programming language has no goto statement. To break or continue multiple-nested loop or switch constructs, you can place labels on loop and switch constructs, and then break out of or continue to the block named by the label. Here's a small fragment of code from the Java programming language built-in String class:
test: for (int i = fromIndex; i + max1 <= max2; i++) {
    if (charAt(i) == c0) {
        for (int k = 1; k < max1; k++) {
            if (charAt(i + k) != str.charAt(k)) {
                continue test;
            }
        } /* end of inner for loop */
    }
} /* end of outer for loop */
The continue test statement is inside a for loop nested inside another for loop. By referencing the label test, the continue statement passes control to the outer for statement. In traditional C, continue statements can only continue the immediately enclosing block; to continue or exit outer blocks, programmers have traditionally either used auxiliary Boolean variables whose only purpose is to determine if the outer block is to be continued or exited, or (mis)used the goto statement to exit out of nested blocks. Use of labelled blocks in the Java programming language leads to considerable simplification in programming effort and a major reduction in maintenance.
The notion of labelled blocks dates back to the mid-1970s, but it hasn't caught on to any large extent in modern programming languages. Perl is another modern programming language that implements the concept of labelled blocks. Perl's next label and last label are equivalent to continue label and break label statements in Java.
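The String fragment above shows a labelled continue. As a complementary sketch, the hypothetical findRow method below uses a labelled break to leave two nested loops at once, with no auxiliary Boolean flag and no goto. The class name, method name, and the label search are invented for this example.

```java
// Illustrative sketch; LabelledBreakDemo, findRow and the label name are hypothetical.
class LabelledBreakDemo {
    // Returns the index of the first row containing target, or -1 if none does.
    static int findRow(int[][] grid, int target) {
        int found = -1;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    found = row;
                    break search; // exits both loops in one step
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = { {1, 2}, {3, 4} };
        System.out.println(findRow(grid, 3)); // 1
        System.out.println(findRow(grid, 9)); // -1
    }
}
```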
2.1.6 Memory Management and Garbage Collection
C and C++ programmers are by now accustomed to the problems of explicitly managing memory: allocating memory, freeing memory, and keeping track of what memory can be freed when. Explicit memory management has proved to be a fruitful source of bugs, crashes, memory leaks, and poor performance.
Java technology completely removes the memory management load from the programmer. C-style pointers, pointer arithmetic, malloc, and free do not exist. Automatic garbage collection is an integral part of Java and its run-time system. While Java technology has a new operator to allocate memory for objects, there is no explicit free function. Once you have allocated an object, the run-time system keeps track of the object's status and automatically reclaims memory when objects are no longer in use, freeing memory for future use.
Java technology's memory management model is based on objects and references to objects. Java technology has no pointers. Instead, all references to allocated storage, which in practice means all references to an object, are through symbolic "handles". The Java technology memory manager keeps track of references to objects. When an object has no more references, the object is a candidate for garbage collection.
Java technology's memory allocation model and automatic garbage collection make your programming task easier, eliminate entire classes of bugs, and in general provide better performance than you'd obtain through explicit memory management. Here's a code fragment that illustrates when garbage collection happens:
class ReverseString {
    public static String reverseIt(String source) {
        int i, len = source.length();
        StringBuffer dest = new StringBuffer(len);
        // Append the characters of source from last to first.
        for (i = (len - 1); i >= 0; i--) {
            dest.append(source.charAt(i));
        }
        return dest.toString();
    }
}
The variable dest is used as a temporary object reference during execution of the reverseIt method. When dest goes out of scope (the reverseIt method returns), the reference to that object has gone away and it's then a candidate for garbage collection.
2.1.7 The Background Garbage Collector
The Java technology garbage collector achieves high performance by taking advantage of the nature of a user's behavior when interacting with software applications such as the HotJava browser. The typical user of the typical interactive application has many natural pauses where they're contemplating the scene in front of them or thinking of what to do next. The Java run-time system takes advantage of these idle periods and runs the garbage collector in a low priority thread when no other threads are competing for CPU cycles. The garbage collector gathers and compacts unused memory, increasing the probability that adequate memory resources are available when needed during periods of heavy interactive use.
This use of a thread to run the garbage collector is just one of many examples of the synergy one obtains from Java technology's integrated multithreading capabilities--an otherwise intractable problem is solved in a simple and elegant fashion.
2.1.8 Integrated Thread Synchronization
Java technology supports multithreading, both at the language (syntactic) level and via support from its run-time system and thread objects. While other systems have provided facilities for multithreading (usually via "lightweight process" libraries), building multithreading support into the language itself provides the programmer with a much easier and more powerful tool for creating thread-safe multithreaded classes. Multithreading is discussed in more detail in Chapter 7.
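As a rough sketch of what language-level support looks like, the class below uses the synchronized keyword so that two threads can update a shared counter safely. All names here (SynchronizedCounterDemo, increment, runTwoThreads) are hypothetical, and the lambda syntax assumes a Java 8 or later compiler, which postdates the language version this chapter describes.

```java
// Illustrative sketch; all names are hypothetical. Requires Java 8+ for the lambda.
class SynchronizedCounterDemo {
    private int count = 0;

    // synchronized makes each call hold the object's lock, so updates do not interleave.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads that each increment a shared counter perThread times.
    static int runTwoThreads(int perThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            // Wait for both threads to finish before reading the total.
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10000)); // 20000
    }
}
```

Without the synchronized keyword on increment, the two threads could lose updates and the total would not be reliable.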
The earlier part of this chapter concentrated on the principal features of Java. This section discusses features removed from C and C++ in the evolution of Java.
The first step was to eliminate redundancy from C and C++. In many ways, the C language evolved into a collection of overlapping features, providing too many ways to say the same thing, while in many cases not providing needed features. C++, in an attempt to add "classes in C", merely added more redundancy while retaining many of the inherent problems of C.
2.2.1 No More Typedefs, Defines, or Preprocessor
Source code written in Java is simple. There is no preprocessor, no #define and related capabilities, no typedef, and absent those features, no longer any need for header files. Instead of header files, Java language source files provide the declarations of other classes and their methods.
A major problem with C and C++ is the amount of context you need to understand another programmer's code: you have to read all related header files, all related #defines, and all related typedefs before you can even begin to analyze a program. In essence, programming with #defines and typedefs results in every programmer inventing a new programming language that's incomprehensible to anybody other than its creator, thus defeating the goals of good programming practices.
In Java, you obtain the effects of #define by using constants. You obtain the effects of typedef by declaring classes--after all, a class effectively declares a new type. You don't need header files because the Java compiler compiles class definitions into a binary form that retains all the type information through to link time.
By removing all this baggage, Java becomes remarkably context-free. Programmers can read and understand code and, more importantly, modify and reuse code much faster and more easily.
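A minimal sketch of the two substitutions described above: a static final constant standing in for a C #define, and a small class standing in for a typedef. The names Limits, MAX_LINE_LENGTH, Temperature, and ConstantsDemo are all hypothetical.

```java
// Illustrative sketch; every name below is hypothetical.

// Plays the role of: #define MAX_LINE_LENGTH 80
class Limits {
    static final int MAX_LINE_LENGTH = 80;
}

// Plays the role of: typedef double Temperature;
// Unlike a typedef, this is a distinct type the compiler can check.
class Temperature {
    private final double degrees;

    Temperature(double degrees) {
        this.degrees = degrees;
    }

    double degrees() {
        return degrees;
    }
}

class ConstantsDemo {
    // The constant is referenced by its qualified name; no header file is involved.
    static boolean fits(String line) {
        return line.length() <= Limits.MAX_LINE_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(fits("short line"));           // true
        System.out.println(new Temperature(21.5).degrees()); // 21.5
    }
}
```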
2.2.2 No More Structures or Unions
Java has no structures or unions as complex data types. You don't need structures and unions when you have classes; you can achieve the same effect simply by declaring a class with the appropriate instance variables.
The code fragment below declares a class called Point.
class Point extends Object {
    double x;
    double y;
    // methods to access the instance variables
}
The following code fragment declares a class called Rectangle that uses objects of the Point class as instance variables.
class Rectangle extends Object {
    Point lowerLeft;
    Point upperRight;
    // methods to access the instance variables
}
In C you'd define these classes as structures. In Java, you simply declare classes. You can make the instance variables as private or as public as you wish, depending on how much you wish to hide the details of the implementation from other objects.
Java has no enum types. You can obtain something similar to enum by declaring a class whose only raison d'etre is to hold constants. You could use this feature something like this:
class Direction extends Object {
    public static final int North = 1;
    public static final int South = 2;
    public static final int East = 3;
    public static final int West = 4;
}
You can now refer to, say, the South constant using the notation Direction.South.
Using classes to contain constants in this way provides a major advantage over C's enum types. In C (and C++), names defined in enums must be unique: if you have an enum called HotColors containing names Red and Yellow, you can't use those names in any other enum. You couldn't, for instance, define another enum called TrafficLightColors also containing Red and Yellow.
Using the class-to-contain-constants technique in Java, you can use the same names in different classes, because those names are qualified by the name of the containing class. From our example just above, you might wish to create another class called CompassRose:
class CompassRose extends Object {
    public static final int North = 1;
    public static final int NorthEast = 2;
    public static final int East = 3;
    public static final int SouthEast = 4;
    public static final int South = 5;
    public static final int SouthWest = 6;
    public static final int West = 7;
    public static final int NorthWest = 8;
}
There is no ambiguity because the name of the containing class acts as a qualifier for the constants. In the second example, you would use the notation CompassRose.NorthWest to access the corresponding value. Java effectively provides you the concept of qualified enums, all within the existing class mechanisms.
Java has no functions. Object-oriented programming supersedes functional and procedural styles. Mixing the two styles just leads to confusion and dilutes the purity of an object-oriented language. Anything you can do with a function you can do just as well by defining a class and creating methods for that class. Consider the Point class from above. We've added public methods to set and access the instance variables:
class Point extends Object {
    double x;
    double y;
    public void setX(double x) {
        this.x = x;
    }
    public void setY(double y) {
        this.y = y;
    }
    public double x() {
        return x;
    }
    public double y() {
        return y;
    }
}
If the x and y instance variables are private to this class, the only means to access them is via the public methods of the class. Here's how you'd use objects of the Point class from within, say, an object of the Rectangle class:
class Rectangle extends Object {
    Point lowerLeft;
    Point upperRight;
    public void setEmptyRect() {
        lowerLeft.setX(0.0);
        lowerLeft.setY(0.0);
        upperRight.setX(0.0);
        upperRight.setY(0.0);
    }
}
It's not to say that functions and procedures are inherently wrong. But given classes and methods, we're now down to only one way to express a given task. By eliminating functions, your job as a programmer is immensely simplified: you work only with classes and their methods.
2.2.5 No More Multiple Inheritance
Multiple inheritance--and all the problems it generates--was discarded from Java. The desirable features of multiple inheritance are provided by interfaces--conceptually similar to Objective C protocols.
An interface is not a definition of a class. Rather, it's a definition of a set of methods that one or more classes will implement. An important point about interfaces is that they declare only methods and constants. Variables may not be defined in interfaces.
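To illustrate how interfaces stand in for multiple inheritance, the sketch below declares two interfaces and one class that implements both. Shape, Named, Square, and InterfaceDemo are hypothetical names invented for this example; PREFIX shows a constant declared in an interface.

```java
// Illustrative sketch; every name below is hypothetical.
interface Shape {
    double area();
}

interface Named {
    // Fields in an interface are implicitly public, static and final: constants only.
    String PREFIX = "shape:";

    String name();
}

// One class may implement several interfaces, though it extends only one class.
class Square implements Shape, Named {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }

    public String name() {
        return PREFIX + "square";
    }
}

class InterfaceDemo {
    // Works with any class that implements Shape, whatever its superclass.
    static double totalArea(Shape[] shapes) {
        double total = 0.0;
        for (int i = 0; i < shapes.length; i++) {
            total += shapes[i].area();
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2.0), new Square(3.0) };
        System.out.println(totalArea(shapes));          // 13.0
        System.out.println(new Square(1.0).name());     // shape:square
    }
}
```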
Java has no goto statement. Studies illustrated that goto is (mis)used more often than not simply "because it's there". Eliminating goto led to a simplification of the language--there are no rules about the effects of a goto into the middle of a for statement, for example. Studies on approximately 100,000 lines of C code determined that roughly 90 percent of the goto statements were used purely to obtain the effect of breaking out of nested loops. As mentioned above, multi-level break and continue remove most of the need for goto statements.
2.2.7 No More Operator Overloading
There are no means provided by which programmers can overload the standard arithmetic operators. Once again, the effects of operator overloading can be just as easily achieved by declaring a class, appropriate instance variables, and appropriate methods to manipulate those variables. Eliminating operator overloading leads to great simplification of code.
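As a sketch of the class-and-methods alternative described above, the hypothetical Complex class below provides plus and times methods where a C++ programmer might overload + and *. The class and its method names are assumptions made for illustration.

```java
// Illustrative sketch; Complex and its method names are hypothetical.
class Complex {
    private final double re;
    private final double im;

    Complex(double re, double im) {
        this.re = re;
        this.im = im;
    }

    double re() {
        return re;
    }

    double im() {
        return im;
    }

    // Stands in for an overloaded + operator.
    Complex plus(Complex other) {
        return new Complex(re + other.re, im + other.im);
    }

    // Stands in for an overloaded * operator: (a+bi)(c+di) = (ac-bd) + (ad+bc)i.
    Complex times(Complex other) {
        return new Complex(re * other.re - im * other.im,
                           re * other.im + im * other.re);
    }

    public static void main(String[] args) {
        Complex a = new Complex(1.0, 2.0);
        Complex b = new Complex(3.0, 4.0);
        Complex sum = a.plus(b);
        Complex product = a.times(b);
        System.out.println(sum.re() + " " + sum.im());         // 4.0 6.0
        System.out.println(product.re() + " " + product.im()); // -5.0 10.0
    }
}
```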
2.2.8 No More Automatic Coercions
Java prohibits C and C++ style automatic coercions. If you wish to coerce a data element of one type to a data type that would result in loss of precision, you must do so explicitly by using a cast. Consider this code fragment:
int myInt;
double myFloat = 3.14159;
myInt = myFloat;
The assignment of myFloat to myInt would result in a compiler error indicating a possible loss of precision and that you must use an explicit cast. Thus, you should rewrite the code fragment as:
int myInt;
double myFloat = 3.14159;
myInt = (int)myFloat;
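The effect of the explicit cast can be seen in a short runnable sketch (CastDemo, truncate, and widen are hypothetical names). Casting a double to int discards the fractional part, while the opposite direction, int to double, loses no precision and so needs no cast.

```java
// Illustrative sketch; CastDemo, truncate and widen are hypothetical names.
class CastDemo {
    // Narrowing from double to int requires a cast and truncates toward zero.
    static int truncate(double value) {
        return (int) value;
    }

    // Widening from int to double loses nothing, so no cast is required.
    static double widen(int value) {
        return value;
    }

    public static void main(String[] args) {
        double myFloat = 3.14159;
        int myInt = truncate(myFloat);
        System.out.println(myInt);            // 3
        System.out.println(truncate(-3.99));  // -3
        System.out.println(widen(3));         // 3.0
    }
}
```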
Most studies agree that pointers are one of the primary features that enable programmers to inject bugs into their code. Given that structures are gone, and arrays and strings are objects, the need for pointers to these constructs goes away. Thus, Java has no pointer data types. Any task that would require arrays, structures, and pointers in C can be more easily and reliably performed by declaring objects and arrays of objects. Instead of complex pointer manipulation on array pointers, you access arrays by their arithmetic indices. The Java run-time system checks all array indexing to ensure indices are within the bounds of the array.
You no longer have dangling pointers and trashing of memory because of incorrect pointers, because there are no pointers in Java.
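To sum up this chapter, Java is simple, because features were removed from its predecessors, and familiar, because it looks like C and C++.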
Now that you've seen how Java was simplified by removal of features from its predecessors, read the next chapter for a discussion on the object-oriented features of Java.