Information as a Service

Bradley Wright interviewed by Jon Mountjoy

You may have lots of data in your enterprise and no easy way to integrate and access it. If you are adopting SOA, integrating and accessing your data is a big part of succeeding. With an information bus, making accurate, consistent data available to those who need it is greatly simplified.

This vision is something Brad Wright, Product Manager for the BEA AquaLogic Data Services Platform, often speaks about. Arch2Arch interviewed Brad to find out more about delivering information as a service.

What is "Information as a Service"?

Jon: We hear terms like "information as a service" or "data services" or "data virtualization" more and more. What do those terms mean and how do they fit into SOA?

Brad Wright: All of those terms mean delivering the enterprise's information, in a business-friendly format, to consumers throughout the organization. It means getting the right information in the right format to the right people in the right manner at the right time. Those terms describe how the problems of data access and integration fit into SOA. Let's face it, all services are consuming and operating on data. Data services is an approach for extending the principles of SOA, loose coupling and reuse, to data.

Jon: What kind of information are you talking about?

Brad Wright: I'm talking about information sitting in databases, files, spreadsheets, and legacy systems: data that is typically in hard-to-understand formats and structures. When you're adopting a strategy of information as a service, one of the first things you do is service-enable that data, making the data easy to consume.

Creating Data Services

Jon: So how do you create a data service?

Brad Wright: First, you inventory the metadata describing your sources using the import wizard of AquaLogic Data Services Platform (ALDSP). This first step gives ALDSP the information it needs to access the data and expose it. You could stop at this point and have a data service, albeit one that structurally looks a lot like the underlying source. But the real value is in creating logical data services that possibly integrate data from multiple sources into a business-friendly object (for example, Customer). You accomplish that in ALDSP by modeling the logical service and mapping source data into it using a drag-and-drop, visual editor. No coding!
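To make the idea of a logical data service concrete, here is a minimal sketch in Python. It is not ALDSP, which does this through visual modeling and mapping rather than hand-written code. The two in-memory sources, their field names, and the get_customer function are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical physical sources. In a real deployment these might be a CRM
# database and a billing system; plain dictionaries keep the sketch runnable.
CRM_SOURCE = {"C001": {"id": "C001", "name": "Ada Lovelace"}}
BILLING_SOURCE = {"C001": {"customer_ref": "C001", "balance_due": 120.50}}


@dataclass
class Customer:
    """Business-friendly object that hides where each field comes from."""
    customer_id: str
    name: str
    balance_due: float


def get_customer(customer_id: str) -> Customer:
    """Logical data service: integrate both sources into one Customer."""
    crm_row = CRM_SOURCE[customer_id]
    billing_row = BILLING_SOURCE[customer_id]
    return Customer(
        customer_id=crm_row["id"],
        name=crm_row["name"],
        balance_due=billing_row["balance_due"],
    )
```

A consumer calls get_customer("C001") and never needs to know that two systems were involved.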

Jon: Can you please make this a little more concrete for me? Say I have an XML file sitting somewhere on a disk, updated every month by accounts. How do I "service-enable" it?

Brad Wright: Service enabling means making the data in that XML file (or database, or legacy source) accessible in a standard manner for consumers. For some consumers that will mean accessible as a SOAP-based Web service. For others, it will mean accessible as a Java Service Data Object. Still others might see the XML file data as a relational table exposed through our JDBC driver. Service enabling is the "get the data in the right format and right manner" part of the strategy.

Jon: You talk about "business-friendly formats." What does that mean in this example?

Brad Wright: It means taking the naming and organization of the data in the XML file and making it easier for the business to consume: standardizing the names of the data items into accepted business terms and organizing the data in ways that are meaningful and useful for the business. The fact of the matter is that data is typically named and organized in sources for the benefit of the DBA or underlying data storage technology, not for the benefit of the consumer. Maybe some field in a source is called CRMId. Why shouldn't the consumer see that data named more meaningfully? Say, something like CustomerID?
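The renaming Brad describes can be pictured as a simple name mapping. The sketch below is an illustration only, not how ALDSP implements it. The CRMId to CustomerID pair comes from the interview; the other source field name is an assumption.

```python
# Source field name -> accepted business term.
# "CRMId" -> "CustomerID" is the example from the interview;
# "cust_nm" is a hypothetical source field added for illustration.
BUSINESS_NAMES = {
    "CRMId": "CustomerID",
    "cust_nm": "CustomerName",
}


def to_business_names(record: dict, mapping: dict = BUSINESS_NAMES) -> dict:
    """Return a copy of the record with source names replaced by business terms.

    Fields without a mapping keep their original name.
    """
    return {mapping.get(name, name): value for name, value in record.items()}
```

For example, to_business_names({"CRMId": "42", "cust_nm": "Ada"}) yields a record keyed by CustomerID and CustomerName.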

Jon: I guess you don't need to service-enable all data. Where do you draw the line?

Brad Wright: Information as a service is appropriate for any organization that has multiple data sources and multiple consuming applications. If your organization is small and you have one database and one main application, then information as a service may not make sense. But if your small organization is in a rapidly growing market or you expect to be acquired by or to acquire another organization, the information bus can make your business more agile and more competitive. Certainly, if you have a many-to-many problem—many data sources and many consuming apps—you can reap value with an information-as-a-service approach.

The Information Bus

Jon: You mentioned "the information bus." Does ALDSP work with AquaLogic Service Bus?

Brad Wright: Absolutely. ALDSP combines with AquaLogic Service Bus (ALSB) to create what we call an information bus. Together these technologies make data consistent and reusable in a service-oriented way. You design and implement information services using ALDSP and you provision those services using ALSB to make them available to service consumers throughout the organization.

Jon: What are some of the broader benefits of an information bus?

Brad Wright: The high-level benefits are business transformation and growth maximization through data consistency and agility. An information-as-a-service strategy is about making data or information easier to consume throughout the organization. That means lower time to solution, lower cost of development, and lower total cost of ownership. When an application needs data, you no longer have to build a siloed, data-access solution to meet just its needs. You just go to the bus and invoke the desired information service to get the data needed.

Jon: Do I have to use ALSB or can I simply use ALDSP to expose the data?

Brad Wright: ALDSP can integrate data and expose data services to Web services, JDBC, or Java clients all on its own, so you don't have to use ALSB. But, as part of an enterprise SOA effort, ALSB adds great governance and SLA management capabilities for those ALDSP data services. The two products work fine separately but great together!

Jon: I guess if you're using ALSB, then you'll also get the other benefits too, like SLA management, right?

Brad Wright: Right! ALDSP has critical features about data—accessing, integrating, transforming, securing. ALSB has critical features about service management—visibility, registration with UDDI, SLA measurement, and alerting. Together, you get a lot of value.

Control Over Your Data

Jon: I guess if you have a service-based approach, then you have more control over those services and data, so you can avoid more ad-hoc information retrieval?

Brad Wright: Yes. In addition to making data easier to consume, information as a service can also help ensure that the data consumed is consistent and accurate.

I'll give you a real example. I get two copies of a credit card offering from the same bank on the same day of the month. Why? Because they have two different systems with data about me in them after one bank bought the other. I just throw them away and wonder why the bank doesn't fix this. This bank has key business processes operating on different data. If they adopted an information-as-a-service approach, there would be only one service from which to get reconciled, accurate customer data. So the strategy helps you deliver consistent, accurate, reusable information to the organization.

Jon: You mentioned Service Data Objects earlier. Can you tell us more about that?

Brad Wright: Sure. SDO is a standard proposed by BEA and IBM several years ago that describes how data services should look to a Java consumer. After all, not everybody wants or needs to consume data services as Web services. And they definitely don't want to deal directly with XML. The SDO standard describes how a data service maps to a strongly-typed Java object with familiar getter and setter methods. SDO uses a disconnected data transfer pattern that tracks changes made to the data so ALDSP can provide great optimistic locking capabilities.
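The disconnected pattern with change tracking can be sketched as follows. This is not the SDO API and not how ALDSP is implemented. It is a minimal Python illustration of the idea, with hypothetical names (TrackedRecord, submit, ConflictError), in which a client edits a local copy and the update succeeds only if the fields it changed still hold the values the client originally read.

```python
class ConflictError(Exception):
    """Raised when the stored data changed after the client read it."""


class TrackedRecord:
    """Disconnected copy of a record that remembers the values it started with."""

    def __init__(self, values: dict):
        self._original = dict(values)
        self._current = dict(values)

    def get(self, name):
        return self._current[name]

    def set(self, name, value):
        self._current[name] = value

    def changes(self) -> dict:
        """Map each changed field to its (old value, new value) pair."""
        return {
            name: (self._original.get(name), value)
            for name, value in self._current.items()
            if self._original.get(name) != value
        }


def submit(record: TrackedRecord, store: dict) -> None:
    """Optimistic locking: apply changes only if the old values still match."""
    changes = record.changes()
    for name, (old, _new) in changes.items():
        if store.get(name) != old:
            raise ConflictError(f"{name} was modified by someone else")
    for name, (_old, new) in changes.items():
        store[name] = new
```

Because only the changed fields are compared and written, two clients editing different fields of the same record do not conflict in this sketch.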

Jon: If we make information easier to consume and more readily available, how do we still keep data secure?

Brad Wright: Great question. Based on the role of the person requesting information, you may need to return only a subset of the data in a given service. An example is social security numbers (SSNs). Organizations are supposed to treat SSNs with some sensitivity, so maybe you don't want a call center rep to see SSNs. So you say, "For this role, never show them these parts of the service response." That is a type of security that ALDSP provides. You don't have to use those rules and policies if you don't need them, but they're there for exactly these reasons.
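The rule "for this role, never show them these parts of the service response" can be illustrated with a small filter. The role name, field names, and policy table below are assumptions made for the example, and ALDSP's actual policy mechanism is configured in the product rather than written like this.

```python
# Hypothetical policy table: role -> fields that must never be returned.
HIDDEN_FIELDS_BY_ROLE = {
    "call_center_rep": {"ssn"},
}


def apply_role_policy(response: dict, role: str) -> dict:
    """Return the service response with fields hidden from this role removed.

    Roles without an entry in the policy table see the full response.
    """
    hidden = HIDDEN_FIELDS_BY_ROLE.get(role, set())
    return {name: value for name, value in response.items() if name not in hidden}
```

The same data service can then serve every role, with the policy deciding which parts of the response each caller receives.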

Jon: How is the AquaLogic Data Services Platform different from similar technologies?

Brad Wright: One important difference is that ALDSP has query-optimization technology, which optimizes what ALDSP should do for each data request. After all, information services may be used in a variety of contexts by a variety of consumers. So, one of the practical questions we often get asked is "How can I meet a variety of needs across diverse applications from a single information service about 'customer'?" The answer is you can't, unless the technology can optimize that request on a per-usage basis.

For example, if I have a customer information service that presents my business-friendly view of "customer" including customer name, address, and phone number, and application A needs just name and phone number while application B needs name and address, are both of those requests going to result in name, address, and phone number being retrieved? No.

In the first case, ALDSP knows we only need name and number, so it optimizes for that. In the second case, although it's the same data service, it returns only name and address. Query optimization technology is different from a number of the data-access technologies out there, which are really just about generating code for accessing a given table—code that doesn't optimize for different requests. ALDSP is much smarter than that and the benefit is much better performance and less impact on your backend systems.
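The per-request behavior in the name, address, and phone number example amounts to fetching only the columns a caller asks for. The sketch below shows just that one idea in Python. A real optimizer such as the one Brad describes does far more, and the table name, column names, and build_query function here are hypothetical.

```python
# Hypothetical mapping from business-friendly field to physical column.
CUSTOMER_COLUMNS = {
    "name": "cust_nm",
    "address": "cust_addr",
    "phone": "cust_ph",
}


def build_query(requested_fields: list) -> str:
    """Build a SELECT that retrieves only the fields this request needs."""
    unknown = [f for f in requested_fields if f not in CUSTOMER_COLUMNS]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    columns = ", ".join(f"{CUSTOMER_COLUMNS[f]} AS {f}" for f in requested_fields)
    return f"SELECT {columns} FROM customer"
```

Application A would call build_query(["name", "phone"]) and application B would call build_query(["name", "address"]). Both use the same customer service, but neither query touches the column the caller did not request.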

Adopting Information As a Service

Jon: How do you go about adopting this strategy?

Brad Wright: Picking an adoption approach can be tough. Should companies make a massive up-front investment in designing a comprehensive information service layer and get it right all at once, or take a project-by-project approach and possibly get it wrong because their perspective was too narrow?

There isn't a one-size-fits-all answer, but we've seen customers succeed by starting with a six- to eight-month timeframe where they address a cluster of projects and build an initial information-services layer to meet the needs of those projects. From that foundation, they make incremental changes as needed for the next cluster of projects. But you need a solid governance and planning model to manage the approach.

Jon: So do you see the need for an "Information Architect" when adopting information as a service? Or do such roles already exist albeit in a different form?

Brad Wright: We do see that there are key data modeling skills that are part of creating useful data services. The work sits somewhere between physical source modeling and enterprise data modeling, and it combines skill in understanding relational concepts like rows and joins with skill in thinking about data in logical terms. We find that every organization has people with these skills but perhaps not the title of Information Architect or Data Architect. Information as a service is about finding those folks and empowering them with a technology like ALDSP! Getting those folks working with your SOA architects can deliver all the promise of SOA.

Jon Mountjoy worked as the editor-in-chief of Dev2Dev and Arch2Arch until April 2008.

Bradley Wright is Product Manager for the AquaLogic Data Services Platform.