Patterns and Strategies for Building Document-Based Web Services: Part 1 in a Series

   
By Sameer Tyagi, September, 2004  


The Java programming language provides developers with the ability to write portable code quickly and efficiently, and XML provides a mechanism to describe data in a portable format. Java and XML therefore form a natural combination and choice for developing web services. This is reinforced through the strong support for developing web services in the Java programming language by the numerous Java specifications addressing different APIs and standards in the Java Community Process.

Java 2 Platform, Enterprise Edition (J2EE) is a standard Java platform for deploying enterprise applications and is specified as a set of required APIs, specifications and policies. The J2EE 1.4 platform provides integrated support for web services, which is one of the many service delivery channels in the platform. The Java API for XML-based RPC (JAX-RPC), a part of J2EE 1.4, is the Java API of choice for developing and using service endpoints based on SOAP that are described using WSDL. Although JAX-RPC and its name are based on the RPC model, it offers features that go beyond basic RPC. It is possible to develop web services that pass complete documents and also document fragments.

Architects and developers are rapidly adopting XML as the data format and a J2EE-based technology stack to develop web services that either expose core business functionality to business partners or serve as a mechanism to integrate applications within the enterprise and eliminate vertical application silos. It is therefore important for them to understand how such document-driven web services can be built using JAX-RPC, the different architectural choices, and the associated tradeoffs. This document outlines some of these alternatives and best practices that architects and developers should keep in mind when building document-driven web services using JAX-RPC on the J2EE platform.



Document-Based Interactions

A web service typically exposes coarse-grained enterprise services that encapsulate some core business functionality and relies on XML-based technologies to do so. The service consumer can interact with the service in two common patterns.

In the RPC-based interaction, the consumer views the web service as a single logical application or component with encapsulated data. The WSDL describes the publicly exposed interface, and the XML in the SOAP messages exchanged is formatted to map to the discrete operations exposed by that application. In fact, the messages map directly onto the input and output parameters of the procedure calls or operations. Typically, such invocations occur over a synchronous transport protocol like HTTP, where the SOAP request and response are piggybacked on the protocol-level request and response, respectively, to form synchronous request-response interaction patterns. Examples (see Figure 1) include a payment service that accepts payments and returns a status, or a stock quote service that accepts a ticker symbol and returns the current quote in the HTTP response.

 

In a document-based interaction, the service consumer interacts with the service using documents that are meant to be processed as complete entities. These documents typically take the form of XML, which is defined by a schema agreed upon by the service provider and the service consumer. The document exchanged in such an interaction could also be in a format other than XML (such as encrypted files); however, agreeing on an XML schema facilitates interoperability. In other words, the document represents a complete unit of information and may be completely self-describing.

For example, consider a transporter's web service that accepts bid requests from a shipping company and replies with an appropriate bid (see Figure 2). Such document-based interactions are typically long-lived in nature; the transport company may need to execute a long-running business process in order to respond appropriately to the request. Because the response cannot be returned immediately over the underlying transport protocol's response (as in the RPC invocation described earlier), document-based message exchanges are more common in asynchronous communication architectures. Also, building a document-oriented web service usually takes more effort and is more complex than using an RPC-based architecture. This is because it involves extra steps, such as designing the schema for the documents that will be exchanged, negotiating and arriving at an agreement with business partners on that design, and validating the document against the schema.

 

In many applications, architects and developers need to build out document-based interactions in their web services to wrap and expose existing business processes to their business partners. In such cases, they sometimes need direct access to the XML document to perform activities like validation against schemas, validation against business rules, reporting, or archiving.

Formatting vs. Processing

It is important to understand that the design of an RPC-based or document-based web service is orthogonal to the formatting and the representation of the SOAP message on the wire. This is often a cause for confusion, because of the document-literal and RPC-encoded nomenclature used to describe the formatting. A SOAP message on the wire can be represented in either RPC style or document style. This choice is governed by the value of the style attribute in the WSDL file.

When the style attribute is set to "rpc", the child elements in the body (that is, inside the SOAP <Body> tag) are construed by the endpoint as the XML representation of a method call. The rules for this are defined in section 7 of the SOAP 1.1 note. For example, the first XML element is named after the operation: for a request, the element carries the name of the operation (<sayHello> for a sayHello invocation, with each child element representing a method parameter); for a response, the word Response is appended to the operation name (for example, <sayHelloResponse> wraps the return data from the invocation).

When the style attribute is set to "document", the SOAP body can contain arbitrary XML and the endpoint does not follow the RPC-related rules from section 7 of the SOAP note.
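The naming convention for RPC-style bodies can be sketched in a few lines of Java. This is only an illustration of the wrapper-element rule described above, not code from a SOAP runtime: the class name RpcBodySketch, the namespace constant, and the parameter names are assumptions made for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Illustrative sketch of the RPC-style naming convention for SOAP body children. */
class RpcBodySketch {

    // Hypothetical service namespace, used only for this illustration.
    static final String SERVICE_NS = "http://www.examples.com/wsdl/HelloService";

    /** Request wrapper: an element named after the operation, with one child per parameter. */
    static Element request(String operation, String paramName, String value) {
        Document doc = newDocument();
        Element wrapper = doc.createElementNS(SERVICE_NS, "ns0:" + operation);
        Element param = doc.createElementNS(null, paramName);
        param.setTextContent(value);
        wrapper.appendChild(param);
        doc.appendChild(wrapper);
        return wrapper;
    }

    /** Response wrapper: the operation name with "Response" appended. */
    static Element response(String operation, String resultName, String value) {
        return request(operation + "Response", resultName, value);
    }

    private static Document newDocument() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }
}
```

With style set to "document", no such wrapper is implied; the body child is whatever element the schema declares.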

Encoding refers to how data is serialized and sent over the wire. This is specified by the use attribute in the WSDL, which can have a value of encoded or literal. The parties in the web services exchange can agree on a predefined encoding scheme or use an XML schema directly to define the data types. SOAP, for example, defines an encoding scheme that specifies how strings, primitives, or arrays can be represented in XML. An encoded SOAP body indicates that the rules to encode and interpret a SOAP body are in a URL specified by the encodingStyle attribute. A literal message indicates that the rules to encode and interpret the SOAP body are specified by an XML schema. Thus, based on the value of the style and use attributes in WSDL, there are four combinations of binding style and data encoding. These are:

  • RPC-Encoded
  • RPC-Literal
  • Document-Literal
  • Document-Encoded

The JAX-RPC specifications require that implementations support the first three modes only; document-encoded is not required.
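As a compact summary, the four combinations and the JAX-RPC support requirement stated above can be modeled as a small Java enum. The type and member names here (SoapFormat, requiredByJaxRpc, of) are hypothetical and exist only for this sketch.

```java
/** The four style/use combinations, as a lookup table. Names are illustrative only. */
enum SoapFormat {
    RPC_ENCODED("rpc", "encoded", true),
    RPC_LITERAL("rpc", "literal", true),
    DOCUMENT_LITERAL("document", "literal", true),
    DOCUMENT_ENCODED("document", "encoded", false); // not required by JAX-RPC

    final String style;              // value of the WSDL style attribute
    final String use;                // value of the WSDL use attribute
    final boolean requiredByJaxRpc;  // must a JAX-RPC implementation support it?

    SoapFormat(String style, String use, boolean requiredByJaxRpc) {
        this.style = style;
        this.use = use;
        this.requiredByJaxRpc = requiredByJaxRpc;
    }

    /** Maps a style/use pair from a WSDL binding to one of the four combinations. */
    static SoapFormat of(String style, String use) {
        for (SoapFormat format : values()) {
            if (format.style.equals(style) && format.use.equals(use)) {
                return format;
            }
        }
        throw new IllegalArgumentException("unknown style/use: " + style + "/" + use);
    }
}
```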

Using the document-literal formatting does not mean that the communication pattern is asynchronous, message-based, or one-way. Nor does an RPC-based web service have to use the RPC-encoded formatting in its WSDL. This confusion stems from the early adoption of RPC-encoded formatting by the different SOAP toolkits to expose method-level, RPC-style services. In other words, one can use an RPC-style programming model in the application and still generate document-literal formatted messages on the wire. In fact, the choice of the formatting in WSDL from one of the four possible combinations should be transparent to the developer. At least conceptually, it should just be a deploy-time or configuration option with no effect on the behavioral characteristics of the service.

Example

The style attribute of the <soap:binding> or <soap:operation> element and the use attribute of the <soap:body> element in the WSDL file can be used to describe the four combinations mentioned above. The code in the <StyleExample> directory demonstrates and executes a simple service that has one operation, using the three modes supported by JAX-RPC. The service can be deployed in document-literal mode using the -f:documentliteral option; for RPC-literal the switch is -f:rpcliteral; and when no explicit mode is specified, the deployment defaults to the RPC-encoded formatting. The Service Endpoint Interface for this example is shown in Code Sample 1 and the different Ant targets are shown in Code Sample 2.

Code Sample 1: Service Interface for the StyleExample

package com.examples.xmlstring;
import java.rmi.Remote;
import java.rmi.RemoteException;
public interface IStringService extends Remote {
    public String sayXMLHello(String xml) throws RemoteException;
}
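The article does not show the class behind this interface. A minimal sketch of what an implementation could look like follows; the class name StringServiceImpl and the reply text are assumptions, and the interface is repeated (without its package) only so that the sketch compiles on its own. The point is that the implementation is plain Java and contains nothing specific to RPC-encoded, RPC-literal, or document-literal formatting.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Repeated from Code Sample 1 so this sketch is self-contained.
interface IStringService extends Remote {
    String sayXMLHello(String xml) throws RemoteException;
}

// Hypothetical implementation; the real example code may behave differently.
class StringServiceImpl implements IStringService {
    // The formatting mode is chosen at deployment time, so nothing here
    // depends on the style or use attributes in the WSDL.
    public String sayXMLHello(String xml) {
        return "Received: " + xml;
    }
}
```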

Code Sample 2: Ant Targets for the StyleExample

$JAX-RPC-Examples ant
Buildfile: build.xml
usage:
    [echo] Usage : ant target-name
    [echo] Valid target names are :
    [echo] clean                 --> Deletes the build directory
    [echo] deploy-run-docliteral --> Deploys and runs the service in doc-literal mode
    [echo] deploy-run-rpcliteral --> Deploys and runs the service in rpc-literal mode
    [echo] deploy-run-rpcencoded --> Deploys and runs the service in rpc-encoded mode
    [echo] undeploy-docliteral   --> Undeploys the Tomcat context and deletes the 
                                     deployed doc-literal service
    [echo] undeploy-rpcliteral   --> Undeploys the Tomcat context and deletes the 
                                     deployed rpc-literal service
    [echo] undeploy-rpcencoded   --> Undeploys the Tomcat context and deletes the 
                                     deployed rpc-encoded service

The WSDL fragments and SOAP requests for different combinations are shown in Code Sample 3. The document-encoded WSDL and SOAP request were generated externally. In the examples, the namespace "http://www.examples.com/types" refers to the namespace for the data types in the schema and the namespace "http://www.examples.com/wsdl/StringService" refers to the targetNamespace of the WSDL, which is the logical namespace for information about the web service.

With RPC-encoded formatting (Code Samples 3 and 3-a), the SOAP body refers to an operation named "sayXMLHello" and the contents are encoded using the SOAP encoding identified by the encodingStyle attribute. In the RPC-literal formatting (Code Samples 3-b and 3-c), the body still refers to an operation named "sayXMLHello"; however, the arguments are now passed using their literal representation.

With the document-literal formatting (Code Samples 3-d and 3-e), there is no reference to any RPC operation. The "sayXMLHello" element is now a complex type as declared in the WSDL, and is passed using its literal representation. In the document-encoded case (Code Samples 3-f and 3-g), there is still no reference to an RPC operation; however, the data is now encoded using the SOAP encoding. For closer inspection, the complete WSDL files for all four cases are included in the <StyleExample/configs> directory.

Code Sample 3: WSDL for RPC-Encoded Formatting

<binding name="IStringServiceBinding" type="tns:IStringService">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="rpc"/>
    <operation name="sayXMLHello">
      <soap:operation soapAction=""/>
      <input>
        <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                   use="encoded"
                   namespace="http://www.examples.com/wsdl/StringService"/>
      </input>
      <output>
        <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                   use="encoded"
                   namespace="http://www.examples.com/wsdl/StringService"/>
      </output>
    </operation>
</binding>
                

Code Sample 3-a: SOAP Request for RPC-Encoded Formatting

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.examples.com/wsdl/StringService"
    env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
        <env:Body>
                <ns0:sayXMLHello>
                        <string_1 xsi:type="xsd:string">Hello World</string_1>
                </ns0:sayXMLHello>
        </env:Body>
</env:Envelope>

Code Sample 3-b: WSDL for RPC-Literal Formatting

<binding name="IStringServiceBinding" type="tns:IStringService">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="rpc"/>
    <operation name="sayXMLHello">
      <soap:operation soapAction=""/>
      <input>
        <soap:body use="literal"
                   namespace="http://www.examples.com/wsdl/StringService"/>
      </input>
      <output>
        <soap:body use="literal"
                   namespace="http://www.examples.com/wsdl/StringService"/>
      </output>
    </operation>
</binding>
                

Code Sample 3-c: SOAP Request for RPC-Literal Formatting

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.examples.com/wsdl/StringService">
        <env:Body>
                <ns0:sayXMLHello>
                        <string_1>Hello World</string_1>
                </ns0:sayXMLHello>
        </env:Body>
</env:Envelope>

Code Sample 3-d: WSDL for Document-Literal Formatting

<binding name="IStringServiceBinding" type="tns:IStringService">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="document"/>
    <operation name="sayXMLHello">
      <soap:operation soapAction=""/>
      <input>
        <soap:body use="literal"/>
      </input>
      <output>
        <soap:body use="literal"/>
      </output>
    </operation>
</binding>
                

Code Sample 3-e: SOAP Request for Document-Literal Formatting

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.examples.com/types">
        <env:Body>
                <ns0:sayXMLHello>
                        <string_1>Hello World</string_1>
                </ns0:sayXMLHello>
        </env:Body>
</env:Envelope>

Code Sample 3-f: WSDL for Document-Encoded Formatting

<binding name="IStringServiceBinding" type="tns:IStringService">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="document"/>
    <operation name="listScheduledPayments">
      <soap:operation soapAction=""/>
      <input>
        <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                   use="encoded"
                   namespace="http://www.examples.com/wsdl/StringService"/>
      </input>
      <output>
        <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                   use="encoded"
                   namespace="http://www.examples.com/wsdl/StringService"/>
      </output>
    </operation>
</binding>
                

Code Sample 3-g: SOAP Request for Document-Encoded Formatting

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:enc="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:ns0="http://www.examples.com/types">
        <env:Body>
                <ns0:sayXMLHello>
                        <string_1 xsi:type="xsd:string">Hello World</string_1>
                </ns0:sayXMLHello>
        </env:Body>
</env:Envelope>
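To see the differences between these requests programmatically, the following sketch parses a SOAP 1.1 message with the standard DOM API, returns the first element inside the Body, and checks whether its children carry xsi:type attributes (as the encoded requests above do). The class and method names are hypothetical, and the xsi:type check is only a rough heuristic for this illustration, not a rule from the SOAP specification.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Illustrative helper for inspecting the payload of a SOAP 1.1 message. */
class SoapBodyInspector {
    static final String SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** Returns the first element child of the SOAP Body. */
    static Element payload(String soapXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // element names in SOAP are namespace-qualified
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(soapXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a parseable XML message", e);
        }
        Node body = doc.getElementsByTagNameNS(SOAP_ENV_NS, "Body").item(0);
        if (body == null) {
            throw new IllegalArgumentException("message has no SOAP Body");
        }
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        throw new IllegalArgumentException("SOAP Body is empty");
    }

    /** Heuristic: true when a child element of the payload carries an xsi:type attribute. */
    static boolean looksEncoded(Element payload) {
        for (Node n = payload.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && ((Element) n).hasAttributeNS(XSI_NS, "type")) {
                return true;
            }
        }
        return false;
    }
}
```

For the RPC requests, the payload's namespace is the WSDL targetNamespace; for the document requests, it is the types namespace, which is the visible sign that the element comes from the schema rather than from an operation name.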

Developers can weigh the following decision points when choosing the formatting style for a web service:

  1. State maintenance: If the stubs generated by a toolkit cannot maintain state, then document style can be used to pass the contents of an entire transaction as an XML document. The service implementation can then ensure the processing sequence and maintain state in the execution of that sequence.
  2. Industry standard schemas: If the service consumer is only requesting information or persisting information in a pre-defined format, such as those defined by industry standards bodies (e.g., STAR standards), a document style message makes more sense because it is not constrained by the RPC-oriented encoding.
  3. Validate business documents: With document style, a web service endpoint can use the capabilities of a validating parser and the runtime to perform syntactic validation on business documents against their schema definitions. To enforce similar validation with RPC, the message must carry the XML document as a string parameter or attachment, and the service must implement the validation itself. This can often lead to invalid invocations that are not detected until the entire structure has been processed. In short, if the service is accepting or returning a complex XML structure, a document style is better suited, since the XML can be validated against the schema prior to calling the service.
  4. Performance and memory limitations: Marshalling and unmarshalling parameters to XML in memory can be an intensive process. Typically, the RPC-encoded scheme performs worst because of the extra processing overhead in encoding the payloads. Also, the SOAP model inherently requires DOM-based processing of the envelope, which can lead to large DOM trees in memory if the XML representation is complex. However, document style services can choose alternate parsing technologies like SAX and StAX to optimize and improve performance, which can be a critical factor for services that handle many simultaneous requests.
  5. Interoperability: There is a natural tendency to expose the programming language object structures through the WSDL when using RPC style and this causes interoperability issues across platforms. To facilitate interoperability, the WS-I Basic Profile limits the use of the encoding (RPC-encoded or document-encoded) and encourages a literal formatting (document-literal or RPC-literal style). Of the two literal styles, some toolkits today like .NET only support document-literal, and if the web service wants to interoperate with service consumers that use such toolkits, document-literal is the natural choice.
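The point about alternate parsing technologies can be illustrated with a short SAX sketch. It counts the line items in a purchase order document as the parser streams through it, without ever building a DOM tree. The class name LineItemCounter is hypothetical, and the element name "items" is borrowed from the purchase order schema used later in this article; treat it as an assumption about the instance documents.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Streams through a document with SAX and counts <items> elements. */
class LineItemCounter {

    static int countItems(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so localName is populated
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attributes) {
                        // Called once per start tag; nothing is retained in memory.
                        if ("items".equals(localName)) {
                            count[0]++;
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return count[0];
    }
}
```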

Document style combined with literal encoding allows validation; changing that to RPC-literal takes that benefit away because the surrounding RPC element does not appear in the schemas. A possible example where RPC-literal may be used instead of document-literal is when multiple RPC operations return XML documents using the same schema. Document-encoded takes away the benefits of RPC-encoded but does not add anything in return.

Strategies

Development of a document-driven web service typically starts with the definition of the schemas and the WSDL describing the document exchange. For example, a document-based web service that accepts purchase orders (Figure 3) would typically start by defining the schemas for the XML documents exchanged.

 

Code Sample 4 shows the XML schema PurchaseOrder.xsd for the purchase order, Code Sample 5 shows the PurchaseOrderStatus.xsd schema for the document returned, and Code Sample 6 shows the POProcessingProblem.xsd schema for documents indicating an error in processing the purchase order. These three schemas will be used across all the examples discussed in this article.

Code Sample 4: Purchase Order Schema

<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.examples.com/types"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://www.examples.com/types">
        <element name="PurchaseOrderDocument" type="tns:PurchaseOrder"/>
        <complexType name="Address">
                <sequence>
                        <element name="street" type="string" nillable="false"/>
                        <element name="city" type="string" nillable="false"/>
                        <element name="state" type="string" nillable="false"/>
                        <element name="zipCode" type="string" nillable="false"/>
                </sequence>
        </complexType>
        <complexType name="LineItem">
                <sequence>
                        <element name="itemname" type="string" nillable="false"/>
                        <element name="price" type="decimal" nillable="false"/>
                        <element name="quantity" type="int" nillable="false"/>
                </sequence>
        </complexType>
        <complexType name="PurchaseOrder">
                <sequence>
                        <element name="billTo" type="tns:Address" nillable="false"/>
                        <element name="createDate" type="dateTime" nillable="false"/>
                        <element name="items" type="tns:LineItem"
                             nillable="false" minOccurs="0" maxOccurs="unbounded"/>
                        <element name="poID" type="string" nillable="false"/>
                        <element name="shipTo" type="tns:Address" nillable="false"/>
                </sequence>
        </complexType>
</schema>

Code Sample 5: Purchase Order Status Schema

<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.examples.com/types"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:tns="http://www.examples.com/types">
        <element name="Status" type="tns:PurchaseOrderStatus"/>
        <complexType name="PurchaseOrderStatus">
                <sequence>
                        <element name="orderid" type="string" nillable="false"/>
                        <element name="timestamp" type="string" nillable="true"/>
                </sequence>
        </complexType>
</schema>

Code Sample 6: POProcessingProblem Schema

<?xml version="1.0" encoding="UTF-8"?>
<schema targetNamespace="http://www.examples.com/types"
        xmlns:tns="http://www.examples.com/types"
        xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <element name="POProcessingFault" type="tns:POProcessingProblem"/>
        <complexType name="POProcessingProblem">
                <sequence>
                        <element name="message" type="string" nillable="true"/>
                </sequence>
        </complexType>
</schema>
</schema>
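Since validating a document against its schema is one of the main reasons for choosing a document-driven design, here is a minimal sketch of that step using the standard javax.xml.validation API (available in Java 5 and later, so newer than the J2EE 1.4 stack discussed here). The schema string is Code Sample 6 written with XML Schema's case-sensitive names (targetNamespace, complexType); the class name FaultDocumentValidator and the sample instance documents are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Validates a fault document against the POProcessingProblem schema. */
class FaultDocumentValidator {

    // Code Sample 6, inlined so the sketch is self-contained.
    static final String SCHEMA =
        "<schema targetNamespace=\"http://www.examples.com/types\""
        + " xmlns:tns=\"http://www.examples.com/types\""
        + " xmlns=\"http://www.w3.org/2001/XMLSchema\">"
        + "<element name=\"POProcessingFault\" type=\"tns:POProcessingProblem\"/>"
        + "<complexType name=\"POProcessingProblem\"><sequence>"
        + "<element name=\"message\" type=\"string\" nillable=\"true\"/>"
        + "</sequence></complexType></schema>";

    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is invalid", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // malformed XML or a schema violation
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the schema does not set elementFormDefault, the local message element is unqualified in instance documents, while the global POProcessingFault element is in the http://www.examples.com/types namespace.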

The following strategies can be employed to realize this solution:

  1. Using XML in the SOAP body
    1. Starting with a WSDL
    2. Starting with Java code
  2. Using String in the SOAP body
  3. Using base64 Encoded or raw bytes in the SOAP body
  4. Using no data binding
  5. Using the xsd:any element in WSDL
  6. Using the xsd:anyType in WSDL
  7. Using an external URI to reference the business document
  8. Using message attachments in the SOAP message.

The next section shows detailed examples, code and consequences of employing each of these strategies.
