By Janice J. Heiss, July 2009
Java Champion Adam Bien is a self-employed consultant, lecturer, software architect, developer, and author in the enterprise Java sector in Germany who implements Java technology on a large scale. He is also the author of several books and articles on Java and Java EE technology, as well as distributed Java programming. His latest book, Real World Java EE Patterns - Rethinking Best Practices, explores the challenges of cloud computing.
In addition, he has been named a Java Rock Star for his popular session at the 2009 JavaOne conference.
We met up with him to get his latest thoughts on Java EE and cloud computing.
java.sun.com (JSC): What is your basic understanding of how cloud computing works?
Bien: I see two unrelated concepts called cloud computing. The first one is related to grid computing, where parallelizable tasks are distributed to independent computing nodes and the partial results are then aggregated into a consistent result. Frameworks like Hadoop, which implement map-reduce algorithms, are an example of this approach.
The other paradigm is a virtual, private or public, data-computing center with an accessible API. You can model the environment in drag-and-drop-like fashion or, even more importantly, control it directly through a programmable API. You can, of course, run the grids in public or private clouds.
These paradigms also differ in their usage models. Grid computing is intended to be used by a few power users who need a considerable amount of computing power. On the other hand, in cloud computing, significantly more users access the machines with relatively low resource utilization.
While these two models are conceptually opposite, the underlying technology could be very similar. The former is often called platform as a service, and the latter, infrastructure as a service.
JSC: What are the advantages and disadvantages of Java EE for the cloud?
Bien: Java EE was designed to be deployed to a distributed environment. Cluster management and extensive monitoring and management capabilities are supported by major application servers.
The EJB 3 programming model encourages stateless, idempotent, and atomic or transactional design. It is interesting to note that the programming restrictions are very similar to those of Google App Engine. In both cases, thread management, file access, and changes to the underlying JVM -- especially class loading, system properties, and security managers -- are not allowed.
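As a rough illustration of the stateless, idempotent style described here, consider the following plain-Java sketch. The class, its methods, and the map standing in for a database are hypothetical; a real EJB 3 bean would typically carry the @Stateless annotation and use JPA, both left out so the example compiles without a Java EE API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service for illustration only. In an EJB 3 container this class
// would typically be annotated with @Stateless (javax.ejb.Stateless).
class OrderService {
    // Stand-in for an external database; the service itself keeps no
    // conversational state between calls.
    private final Map<String, Integer> store;

    OrderService(Map<String, Integer> store) {
        this.store = store;
    }

    // Idempotent: repeating the call with the same arguments leaves the
    // stored data in the same state as calling it once.
    void setQuantity(String orderId, int quantity) {
        store.put(orderId, quantity);
    }

    // Returns the stored quantity, or 0 if the order is unknown.
    int quantityOf(String orderId) {
        return store.getOrDefault(orderId, 0);
    }
}
```

Because no instance holds state of its own, any instance on any node can serve any request, and a retried call does no harm.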
Furthermore, both Java EE 5 and Java EE 6 come with standardized packaging -- the Enterprise Archive (EAR), which makes the provisioning of cloud apps relatively easy. And EAR solves some cloud-interoperability issues: It's a lot easier to move an app from one cloud to another. Java EE 5 and 6 are portable, so applications can be easily moved from one application server to another, regardless of whether they run in a cloud or not. They both will run on JDK 5 or higher.
The JVM itself comes with fantastic remote debugging, profiling, and monitoring capabilities. This already greatly simplifies the development of distributed apps and should also simplify cloud-enabled apps.
Java EE, however, was not designed to be operated in highly dynamic environments. Clouds tend to be elastic, and most of the current computing centers are not. There is some work to be done with respect to dynamic discovery, self-healing, and dynamic load balancing. It isn't hard with the current application servers -- we did it in projects with GlassFish v2 and several years ago with JBoss.
Ironically, Jini and JXTA were always capable of behaving this way. You can simply borrow the concepts like leasing or dynamic discovery from these frameworks and apply them to a rather static Java EE environment.
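The leasing idea can be sketched in a few lines of plain Java. This is not the Jini API; the LeaseRegistry class, its methods, and the explicit time parameter are assumptions made for illustration. Each node must renew its lease periodically, and a node that stops renewing simply drops out of the list of live nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lease-based registry; it borrows the concept, not the Jini API.
class LeaseRegistry {
    private final Map<String, Long> expiryByNode = new ConcurrentHashMap<>();
    private final long leaseMillis;

    LeaseRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // A node registers or renews its lease. The current time is passed in
    // so that the sketch stays deterministic and easy to test.
    void renew(String nodeId, long nowMillis) {
        expiryByNode.put(nodeId, nowMillis + leaseMillis);
    }

    // Drops expired leases and returns the remaining node ids, sorted.
    List<String> aliveNodes(long nowMillis) {
        expiryByNode.values().removeIf(expiry -> expiry <= nowMillis);
        List<String> alive = new ArrayList<>(expiryByNode.keySet());
        Collections.sort(alive);
        return alive;
    }
}
```

A load balancer could consult aliveNodes before dispatching, which gives a simple form of dynamic discovery and self-healing on top of an otherwise static setup.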
JSC: Sun's Geertjan Wielenga asks in a DZone discussion: "What will applications/software look like on the cloud? Are they all web applications? Web services? Or what?" How would you answer him?
Bien: Actually, with some cloud providers, applications are not that different from on-premise software. The application just runs elsewhere.
The difference starts with the implementation of persistence services. Some cloud providers, like Google or Amazon, don't offer support for relational databases. So you either have to live with the map-like structure or try to install a relational database in the cloud. This necessitates some rethinking about how the persistence should be designed and can even impact the design of the user interface.
The obvious use case for clouds is load generation. It is very hard to simulate thousands of users on premise. Clouds are perfect for that: You can leverage the power of the cloud for your load tests and pay only for the spent CPU cycles. Without a cloud, many underutilized machines must be maintained to cover a few peaks a year.
Batch processing is another use case -- perfectly suitable for grids and clouds. The workload can be distributed across the working nodes or images -- it can scale very well.
The map-reduce algorithm works similarly: The workload is sprinkled across the cluster or cloud, and the results are aggregated afterward.
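To make the distribute-then-aggregate idea concrete, here is a minimal single-JVM sketch in which a thread pool stands in for the cluster. It is not Hadoop; the class and method names are hypothetical, and word counting is chosen only because it is the customary example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: worker threads play the role of cluster nodes.
class WordCountSketch {
    // Map step: count the words of one chunk, independently of all others.
    static Map<String, Integer> countWords(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : chunk.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Spreads the chunks across the workers, then reduces the partial counts.
    static Map<String, Integer> run(List<String> chunks, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Map<String, Integer>>> partials = new ArrayList<>();
            for (String chunk : chunks) {
                Callable<Map<String, Integer>> task = () -> countWords(chunk);
                partials.add(pool.submit(task));
            }
            // Reduce step: merge every partial result into one total.
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> partial : partials) {
                partial.get().forEach((word, n) -> total.merge(word, n, Integer::sum));
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("word count failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The map step shares nothing between chunks, which is what lets the work scale out; only the final merge needs to see all partial results.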
Typical web applications are also perfectly suitable for the cloud -- the request-response communication style fits very well into the cloud model. If your app is successful, the load can be handled easily, and if not, you don't have to pay for the idle cycles. Twitter already runs on Amazon S3.
Google Apps, MobileMe, Salesforce, and many others are perfect cloud-computing examples. They would, however, also run perfectly well behind a corporate firewall. Cloud computing makes it easier to implement new ideas without having to invest in a data center up front. I expect new applications to appear in the near future, precisely because of this fact.
JSC: What role will rich Internet applications (RIAs) play in cloud computing in general?
Bien: We should agree on a definition of RIA in the cloud context first. In my opinion, RIAs are interactive applications with a rich user experience. The richer the experience, the less likely we are to find pure web applications -- we are more likely to see native clients.
Native clients either run in a browser plug-in or have to be installed on the client machine. They can operate offline for a certain amount of time and do not require fine-grained server communication. In extreme cases, only the processing result would be stored in the cloud. A perfect example of this interaction model would be an email client or even an IDE.
Such applications can run easily on the premises or in the cloud. The more that processing can be done on the server, the more interesting cloud computing becomes. A video-processing or -authoring application could send the rough content to the cloud for processing and receive the results later. The processing could be parallelized and performed faster. The cloud would start new instances on demand and kill idle ones after the peak demand gets processed.
JavaFX comes with a built-in pull parser for XML and JSON. It is really easy to access HTTP-based (REST, XML, JSON) services from the cloud with built-in JavaFX capabilities. If this isn't sufficient, you can easily access already existing Java frameworks and libraries.
The "I escaped the browser" trend is already visible in Twitter. The majority of Twitter users access the stream with native clients like TweetDeck, TwitterFx, TwitterFon, Tweetie, and so on over the API and not the web interface.
JSC: Some argue that Java Management Extensions (JMX) is not up to the high requirements for management and monitoring in cloud computing.
Bien: JMX is basically nothing more than JavaBeans exposed to an agent and thus remotely accessible. The JavaBean properties become visible to the monitoring tools, and you can invoke the management methods remotely. In addition, JDK and Java EE come with some predefined, visible JMX beans.
In either case, the data is too fine-grained for the cloud. In a virtual data center, you are interested not in fine-grained JVM or application-server statistics, but rather in an aggregated view. The statistical data has to be organized in a hierarchical, tree-like structure. Once you do so, you will be able to get an overview first and drill down to the details for troubleshooting or fine-tuning when needed.
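One way to picture such a tree-like structure is the sketch below, in which every node reports its own value plus the sum of everything beneath it. The StatNode class, the request-count metric, and the cloud/cluster/server levels are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree of monitored elements, e.g. cloud -> cluster -> server.
class StatNode {
    private final String name;
    private final long ownRequests;   // value measured at this node itself
    private final List<StatNode> children = new ArrayList<>();

    StatNode(String name, long ownRequests) {
        this.name = name;
        this.ownRequests = ownRequests;
    }

    // Attaches a child and returns this node so calls can be chained.
    StatNode add(StatNode child) {
        children.add(child);
        return this;
    }

    // Aggregated view: this node's own value plus everything beneath it.
    long totalRequests() {
        long total = ownRequests;
        for (StatNode child : children) {
            total += child.totalRequests();
        }
        return total;
    }

    // Drill-down view: one indented line per node with its aggregated total.
    String report(String indent) {
        StringBuilder sb = new StringBuilder();
        sb.append(indent).append(name).append(": ").append(totalRequests()).append("\n");
        for (StatNode child : children) {
            sb.append(child.report(indent + "  "));
        }
        return sb.toString();
    }
}
```

Reading totalRequests at the root gives the overview; calling report on any subtree gives the details for just that part of the data center.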
So with JMX, you can build a perfect cloud-monitoring system. The out-of-the-box capabilities, however, are not convenient enough.
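As a starting point, exposing a JavaBean through JMX might look like the following sketch, which uses only the standard javax.management API. The bean, its attribute, and the object name com.example.sketch:type=RequestStats are hypothetical. The two helper methods read the attribute and invoke the operation through the MBean server, roughly as a monitoring tool would.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {
    // Standard MBean convention: a public management interface named after
    // the implementing class plus the suffix "MBean".
    public interface RequestStatsMBean {
        long getRequestCount();   // exposed as the read-only attribute "RequestCount"
        void reset();             // exposed as a management operation
    }

    public static class RequestStats implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        // Called by the application; not part of the management interface.
        public void record() { requests.incrementAndGet(); }

        @Override public long getRequestCount() { return requests.get(); }
        @Override public void reset() { requests.set(0); }
    }

    // Registers the bean with the JVM's platform MBean server.
    static ObjectName register(RequestStats stats) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("com.example.sketch:type=RequestStats");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(stats, name);
            return name;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the attribute through the MBean server rather than the object itself.
    static long readRequestCount(ObjectName name) {
        try {
            return (Long) ManagementFactory.getPlatformMBeanServer()
                    .getAttribute(name, "RequestCount");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the management operation through the MBean server.
    static void invokeReset(ObjectName name) {
        try {
            ManagementFactory.getPlatformMBeanServer()
                    .invoke(name, "reset", new Object[0], new String[0]);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Once registered, the bean should also show up in tools such as JConsole. The aggregation across many JVMs would still have to be built on top.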
JSC: Researchers at University of California, Berkeley, state: "Provided certain obstacles are overcome, we believe Cloud Computing has the potential to transform a large part of the IT industry, making software even more attractive as a service and shaping the way IT hardware is designed and purchased. Developers with innovative ideas for new interactive Internet services no longer require the large capital outlays in hardware to deploy their service or the human expense to operate it." Your response?
Bien: I absolutely agree. Clouds are just perfect for startups. It's crazy to spend a huge amount of money for the infrastructure first, and then find out that your idea will not take off.
In the long term, hybrid data centers are likely. The constant load could be handled on premise, and the peaks processed in the cloud. This approach is even more challenging but may pay off.
This "cloud-bursting" approach requires you to dynamically reroute requests from your data center into the cloud in a "hot" way. You will have to keep your data consistent and synchronized across the data center and the cloud.
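The rerouting decision itself can be sketched very simply. The BurstRouter class below, with its fixed capacity and in-flight counter, is a hypothetical model and not a description of any particular product; it deliberately leaves out the harder part, keeping data synchronized between the two sides.

```java
// Hypothetical router: the capacity model and the names are assumptions.
class BurstRouter {
    enum Target { ON_PREMISE, CLOUD }

    private final int localCapacity;   // requests the data center can handle at once
    private int localInFlight;         // requests currently running on premise

    BurstRouter(int localCapacity) {
        this.localCapacity = localCapacity;
    }

    // Sends the request on premise while there is spare capacity,
    // and bursts into the cloud once the data center is saturated.
    synchronized Target route() {
        if (localInFlight < localCapacity) {
            localInFlight++;
            return Target.ON_PREMISE;
        }
        return Target.CLOUD;
    }

    // Called when an on-premise request finishes, freeing one slot.
    synchronized void completedOnPremise() {
        if (localInFlight > 0) {
            localInFlight--;
        }
    }
}
```

With a capacity of two, the first two concurrent requests stay on premise and the third goes to the cloud; as soon as one local request completes, the next one is routed on premise again.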
JSC: What effect will cloud computing have on open-source development?
Bien: With Puppet, you can manage thousands of nodes in a continuous-integration manner. Hyperic and openNMS help you to discover, monitor, and manage your network infrastructure. Sun's Open Cloud REST-based API is really interesting as well: You will be able to manage the cloud through an open-source standardized API.
On top of these infrastructure services, we can expect a lot of new and interesting open-source applications. Why? Partially because it will be fun to develop cloud apps with new tools, approaches, best practices, and, of course, challenges with the open-source infrastructure available.
It would be really interesting to build a JavaFX app that manages the cloud via the Sun Cloud RESTful API -- a kind of metacloud application.
JSC: IT publisher Tim O'Reilly has said that the future belongs to services that respond in real time to information provided either by their users or by nonhuman sensors. Do you agree with this? How might Java EE fit or not fit this vision of the future?
Bien: It is always a good strategy to agree with Tim O'Reilly.
We are beginning to see this trend. Interacting with a device manually requires a lot of plumbing. Being able to gather the data automatically is much more compelling. The iPhone comes with a GPS sensor that is directly used in a variety of applications like train schedules, photo applications, or Twitter location-based services. There are even gadgets available that can be directly controlled by brain waves.
The data is gathered on the device and sent back to a Java EE server. So sensor data can be considered just as data. On the other hand, Java EE is extensible with JCA (J2EE Connector Architecture). It's a widely underestimated and easy way to interact with the application server in a standard and portable way.
I used JCA in past projects exactly for that purpose -- gathering sensor data, socket data feeds, or just plain files. JCA is a part of Java EE, so Java EE can be considered as perfectly suitable for that purpose.
I used Java EE for similar purposes in my GreenFire project for managing, controlling, and reporting information in heating systems -- I gathered the sensor data from the heating, house, and solar panels -- in real time. The initial prototype was built in a weekend, so it isn't that hard.
JSC: Any closing thoughts?
Bien: Java EE has become extremely lightweight. The whole GlassFish v3 EJB 3.1 container is smaller than one megabyte, can be dynamically installed and uninstalled, and is surprisingly "elastic." You can develop and deploy an application with only a few annotations.
Also, Java EE is supported by multiple application servers, so your application is not dependent on a single vendor. Since Java EE 5, applications have become portable as well: There is no vendor-specific code or even XML configuration required.
Moving your application from one server to another is not an empty slogan. Java EE is therefore more interoperable than the cloud itself and can be used as a lean abstraction layer between the bare cloud and your business logic.