As Published In
Oracle Magazine
January/February 2006

TECHNOLOGY: XML


Know Your UML with XML

By Robert Wigetman and Jurgen Moortgat

Learn how Oracle XML DB can increase ROI on UML models of software systems.

Software technology has evolved to the point where big systems have become so complex that they are no longer simply specified in human language and programmed in computer language. Indeed, using UML (Unified Modeling Language) to model new software has become a de facto standard approach to building big systems.

The basic outputs of a software system model are code/documentation artifacts containing a partial view of the knowledge in the system. Today's modeling tools perform quite well at code generation and reverse engineering, but they are weak in their introspective capabilities, supporting only very simple query facilities that depend entirely on the features of the specific modeling tool at hand. At Eurocontrol-CFMU we have been discouraged by these limitations. Indeed, when we have devoted great effort to modeling, it has been difficult, if not impossible, to perform anything but the simplest queries on the contents of the models.

Suppose we want to audit our system and find out what datatypes are used in the system's public interfaces. We would like the model to respond to a query of the form "For each class with stereotype 'interface', in a package whose name is of the form '*public*', display the list of all datatypes of the parameters for all operations." Furthermore, we would like to use a standard query language such as SQL, based on the UML model's conceptual elements, and thereby save our staff from having to learn a UML tool's proprietary scripting language.

While developing UML models, we realized that the limits of the modeling tool chain were holding us back. We kept coming back to the query question and found ourselves saying, "If only we had a database!" This is how we began to explore the transformation of UML models into software system databases. Our business is to develop high-availability air-traffic-flow and capacity management systems; we are not a big software house and cannot devote serious development effort to the elaboration and maintenance of a modeling tool chain.

This article describes our transformation solution: how we used Oracle Database 10g to turn a UML model into a database of software system knowledge and thus made the information readily accessible. We provide a step-by-step procedure to allow readers to apply this same approach in their environment and thus increase the benefit of their UML models.

Overview

The steps in transforming our UML model into a software system knowledgebase constitute a remarkable procedure requiring no programming. It involves loading a UML model into Oracle XML DB and thereby provides full SQL query facilities—in other words, a genuine software system knowledgebase.

We must stress that no software development is required to support this transformation. Indeed, all the tooling required is readily available in the modeling tools themselves and in commercial off-the-shelf (COTS) XML editors (such as XMLSpy), and the core of the knowledgebase is Oracle Database 10g's object-relational kernel. Basically, anyone who makes UML models and runs Oracle Database 10g can apply the same transformation process.

 

figure 1
Figure 1: Knowledgebase process architecture


Figure 1 presents the overall architecture of the transformation process as well as the data flow involved in the production of reports and other potential outputs of the software system knowledgebase.

This article presents the process we developed to transform a UML model into an Oracle XML DB-based software system knowledgebase. This process comprises the following steps: 

  1. Extract an XML version of the UML model (XMI model file).

  2. Generate the XML Schema Definition (XSD) file.

  3. Create the Oracle XML DB repository directory structure.

  4. Register the XSD in Oracle XML DB.

  5. Upload the XML version of the model into Oracle XML DB.

  6. Define atomic views.

  7. Query the knowledgebase via SQL or any Oracle Database 10g API.

Examples in this article are taken from our model files, which may not correspond exactly to those generated by a different modeling toolset.

Step 1: Extract an XML Version of the UML Model 

In this step, the UML modeling tool creates a single XML document containing the contents of the UML model. This document adheres to the XML Metadata Interchange (XMI) standard. At the time we were working on this project, only the DTD-based version, 1.2, of the XMI standard was available. Since then the Object Management Group (OMG) has released version 2.0, which is XSD-based. Having a modeling tool that supports the new schema-based XMI 2.0 standard is advantageous, but it is not necessary. Any XML version of the UML model is sufficient for the rest of the transformation process.

The XML document provided by the UML tool is the input to the next step in the process. Let's name the document model.xml.

Step 2: Generate the XML Schema Definition

Before you upload the model.xml file into Oracle XML DB, you must register the file structure. To do this, provide Oracle XML DB with a schema definition file, which describes, in a useful manner, the structure of the XML document.

We were not able to find a useful schema definition corresponding to the XMI 1.2 standard, so we generated a schema definition from the XML file itself. There are many COTS tools that can do this—each with its own set of minor issues. The tool we used for this operation generated XSD that contained the colon (:) character embedded in element names. Because this is invalid, we globally replaced the embedded colons with underscores to obtain a well-formed XSD.

After you create and clean up—if necessary—the model.xsd file, perform the following operations: 

  1. Add the Oracle XML DB namespace to the XSD. This is required when registering an XSD into Oracle XML DB: 

  2. <xs:schema 
    xmlns:xs="http://www.w3.org/
    2001/XMLSchema"
    xmlns:xdb=http://xmlns.oracle.com/xdb
    xdb:storeVarrayAsTable="true">
    

  3. Add an Oracle XML DB defaultTable attribute to the root element, as follows:

  4. <xs:element name="XMI" 
    xdb:defaultTable="MODEL_NAME">
    ...
    </xs:element>
    


    This is not mandatory, but it instructs Oracle Database to generate an Oracle table called MODEL_NAME as the main table. (If this information is not provided, Oracle Database will generate a table name that will not be very user-friendly.)

  5. Make sure that string-typed elements are stored in CLOB columns and not in VARCHAR2(4000) columns (the default). This is to ensure that large amounts of text data will not be rejected because of size limitations. The elements containing large amounts of text data should have the following attribute: 

    xdb:SQLType="CLOB" 
    

    This means that every occurrence of the attribute-value pair type="xs:string" must be accompanied by the above CLOB definition. For example,  

    <xs:attribute name="extendedElement"
    type="xs:string"
    xdb:SQLType="CLOB"
    ...
    />
    ...
    <xs:element name="UML_TaggedValue.value"
    type="xs:string"
    xdb:SQLType="CLOB"
    ...
    />
    

  6. Remove all irrelevant enumeration lists, such as xmi.id . When you generate the XSD from the XML document, enumeration lists might be generated. These lists are irrelevant and should therefore be removed and replaced by the actual datatype (for example, xmi.id has the datatype String ). Listing 1 shows the attribute xmi.id (which is used for almost every element) as originally generated (including the enumeration list) and after the enumeration was removed.

  7. Code Listing 1: xmi.id before and after modification  

    Before modification
    <xs:attribute name="xmi.id" use="required">
    <xs:simpleType>
    <xs:restriction base="xs:NMTOKEN">
    <xs:enumeration value="XX.9.1054.33.1282"/>
    <xs:enumeration value="XX.9.1054.33.1283"/>
    <xs:enumeration value="XX.9.1054.33.1284"/>
    <xs:enumeration value="XX.9.1054.33.1285"/>
    ...
    </xs:restriction>
    </xs:simpleType>
    </xs:attribute>
    
    After modification
    <xs:attribute name="xmi.id" use="required" type="xs:string"/>
    


    Now the modified model.xsd document is ready for registration with Oracle XML DB. Becoming familiar with the document's contents is a good idea, so you can create view definitions based on it later in this procedure.

Step 3: Create the XML DB Repository Directory Structure

Before registering the XSD document, create some directory structures in the XML DB repository. In this example, we will create folders in a /public/EuroControl hierarchy to hold the XSD and XML files.

We use the code in Listing 2 to create the /EuroControl folder and the /xsd and /xml subfolders.

Code Listing 2: Creating the /EuroControl folders and /xsd and /xml subfolders 

declare
        ignore boolean;
begin
        ignore := dbms_xdb.createFolder('/public/EuroControl');
        ignore := dbms_xdb.createFolder('/public/EuroControl/xsd');
        ignore := dbms_xdb.createFolder('/public/EuroControl/xml');
        commit ;
end;
/


These folders can now be used to upload our Eurocontrol-specific XSD and/or XML documents.

You can use any of the following protocols to upload the XSD and XML documents: FTP, HTTP, WebDAV, or SQL*Net (using PL/SQL). We use FTP, because this allows us to explicitly visualize any Oracle Database errors that may occur during the upload: 

ftp> open <machine-name or ip-address> 2100
ftp> user ec
ftp> cd /public/EuroControl/xsd
ftp> put model.xsd
ftp> bye


The model.xsd document is now uploaded and ready for registration in Oracle XML DB.

Step 4: Register the XSD

Registering an XSD document in Oracle XML DB will result in the generation of an object-relational structure that is identical to the XML document contents described by the XSD.

Register an XSD document in Oracle XML DB with a call to DBMS_XMLSCHEMA.REGISTERURI , as shown in Listing 3.

Code Listing 3: Registering the XSD file with DBMS_XMLSCHEMA.REGISTERURI 

begin
   dbms_xmlschema.registerURI
     (schemaURL         => 'http://xmlns.oracle.com/xdb/model.xsd'
     ,schemaDocURI      => '/public/EuroControl/xsd/model.xsd'
     ,genTables         => true
     ) ;
end;


This call to DBMS_XMLSCHEMA.REGISTERURI takes three important arguments:

schemaURL: This URL points to the XSD that will be used to parse the XML document. This is the same URL that must be used in the model.xml file that will be uploaded into Oracle XML DB. By convention, the URL used is http://xmlns.oracle.com.xdb/< name-of-xsd >; in this case, the < name-of-xsd > is model.xsd.

schemaDocURI: This URI points to the physical location of the XSD, so that Oracle Database can parse it, and if needed, generate the required object-relational structure.

genTables: Set this Boolean to TRUE to generate the physical object-relational structure.

Step 5: Upload the XML Version of the Model (model.xml) into Oracle XML DB

At this point, we are ready to upload the model.xml document. An uploaded XML document can be stored in Oracle XML DB in two ways: 

  • If the document does not correspond to a registered XSD for which Oracle Database has created an object-relational structure, it will be stored in the internal Oracle XML DB repository (this is the default behavior).

  • If the document corresponds to a registered XSD, it will be stored in the object-relational structure generated during registration. This is our case.

We have already registered the XSD on which the model.xml document is based, so when we upload the document, Oracle Database will automatically break it up and store every component in its corresponding Oracle type.

We will again use FTP for the upload, to catch any potential server errors: 

ftp> open <machine-name or ip-address> 2100
ftp> user ec
ftp> cd /public/EuroControl/xml
ftp> put model.xml
ftp> bye


At this point, the entire UML model as contained in the model.xml document should be stored in the previously defined main table: MODEL_NAME .

Use the following query to test this: 

select count(*) from MODEL_NAME;


It should return the value 1 . If this is not the case, the model.xml document we uploaded did not correspond to any registered XSD for which an object-relational structure had been generated.

Step 6: Define Atomic Views

At this point, the contents of the XML document have been uploaded, broken down, and distributed into the object-relational structure. In a sense, the software system knowledgebase now exists, so the creation of database views is not absolutely necessary.

On the other hand, directly querying the object-relational structure implies searching through the XML document structure. Ideally, we would like to query in terms of the abstract UML concepts, such as packages, classes, and associations, and not have to worry about the structure of the XML document. The use of views allows us to provide an abstraction layer and effectively encapsulate the XML structure behind the well-known UML modeling concepts.

We have chosen to create atomic views—queries with no WHERE clause—to provide the basic building blocks required for the construction of more-complex queries. This gives end users great flexibility but also puts the burden of join design on them.

We have also chosen to use materialized views. A materialized view stores the result of the view-query at creation time. This gives obvious performance advantages but requires that the view be refreshed if the database changes. In our case, the database changes only when we upload a new version of the model, such as with a new software release, so the performance advantages of materialized views win out over the need to refresh the view.

Creating atomic view queries depends on the structure of the XSD used. Listing 4 contains an example of the atomic view query used to obtain all information about packages in one of our models.

Code Listing 4: Creating a materialized view by using an atomic query 

create materialized view ec_packages
build immediate
refresh complete on demand
as
select  extractValue(value(p),  '/UML_Package/@xmi.id')          id
,       extractValue(value(p),  '/UML_Package/@xmi.uuid')        uuid
,       extractValue(value(p),  '/UML_Package/@name')            name
,       extractValue(value(p),  '/UML_Package/@visibility')      visibility
,       extractValue(value(p),  '/UML_Package/@isSpecification') isSpecification
,       extractValue(value(p),  '/UML_Package/@isRoot')          isRoot
,       extractValue(value(p),  '/UML_Package/@isLeaf')          isLeaf
,       extractValue(value(p),  '/UML_Package/@isAbstract')      isAbstract
,       extractValue(value(p),  '/UML_Package/@namespace')       namespace
from MODEL_NAME c
, table(xmlsequence(extract(value (c),
'/XMI/XMI.content/UML_Model/UML_Namespace.ownedElement/descendant::UML_Package'
))) p
/


The atomic views required depend on the type of querying that needs to be done. Some suggestions for views are 

  • Classes (and association classes)

  • Class attributes (and association-class attributes)

  • Class operations

  • Class operation parameters

  • Datatypes

  • Descriptions

  • Diagrams

  • Interfaces

  • Interface operations

  • Packages

  • Stereotypes

Step 7: Query the Knowledgebase via SQL

At last, we have reached the raison d'être of this entire procedure. It's time to get some information out of the database. First, let's look at Listing 5—a simple query that retrieves the names of all classes and the names of the package in which they are located.

Code Listing 5: Simple query of classes and their packages from the knowledgebase 

select  c.name          class_name
,       p.name          package_name
from    ec_classes      c               -- view to retrieve all class data
,       ec_packages     p               -- view to retrieve all package data
wherec.namespace        = p.id(+)       -- join classes with packages
/


Who would suspect that behind this query lies a complex mechanism of XPath functions on an object-relational structure?

Listing 6 contains a more interesting query. In our models, we use diagrams to give an overview of the subsystem interfaces. We would like to generate a dictionary of interface types. To do this, we need to find all the classes that are used as types for parameters of operations belonging to classes on certain diagrams. The query in Listing 6 is a slightly simplified version of our query that returns the results for all diagrams.

Code Listing 6: Complex query of the modeling objects and their relationships 

/* Get all info about classes */
select         dia.name                 diagram_name
,              c.name                   class_name
,              o.operation_name         operation_name
,              op.param_name            param_name
,              d.name                   data_type
/* Substr is required to get data as varchar2 to be compliant with
second part of the query (union part). "Select null from x" is not
compatible with a clob type. Therefore, the clob must be converted
to a varchar2.
*/
,       dbms_lob.substr(de.description                     -- pointer
,                       dbms_lob.getlength(de.description) -- amount
,                       1                                  -- offset
                        ) data_type_desc
from            ec_diagrams                     dia     -- diagrams
,               ec_classes                      c       -- classes
args
,               ec_classes                      c2      -- classes used as types to args
,               ec_classes_operations           o       -- classes operations
,               ec_classes_operations_par       op      -- args of operations
,               ec_datatypes                    d       -- all data types
,               ec_descriptions                 de      -- descriptions of classes
where           dia.element_subject     = c.id
and             c.id                    = o.class_id (+)
and             o.operation_id          = op.operation_id (+)
and             op.param_type           = d.id (+)
and             d.name                  = c2.name (+)
and             c2.id                   = de.modelelement (+)
and             de.tag               (+)='documentation'
/* Get all info about associated classes as well */
UNION ALL
select         dia.name         diagram_name
,              c.name           class_name
,              null             operation_name
,              null             param_name
,              null             data_type
,              null             data_type_desc
from           ec_diagrams      dia
,              ec_as_classes    c
where          dia.element_subject = c.id
order by 1, 2, 3
/


Conclusion

Next Steps


LEARN more about
UML
XML
Oracle XML DB
Oracle XML DB Developer's Guide
DBMS_XMLSCHEMA
Oracle Database PL/SQL Packages and Types Reference

It isn't easy to get information out of UML models by using current modeling environments. Querying model data is nevertheless possible by means of the transformation we have described. We have seen that it is quite easy to pipe together current industry standards to transform a UML model into a database of software system knowledge in Oracle Database 10g, and you, too, can do this without any programming and with nearly no work. Yet the long-term return on investment is very high indeed.

We must thank the OMG and the W3C for their work on the underlying standards that make this possible. We also thank the designers of Oracle XML DB for creating the infrastructure. 


Robert Wigetman (robert.wigetman@eurocontrol.int) is responsible for software development project support at Eurocontrol-CFMU (www.eurocontrol.int), the Brussels-based European organization for the safety of air navigation. Jurgen Moortgat (jurgen.moortgat@oracle.com) is a software consultant at Oracle Consulting Services of Belgium (www.oracle.be).


Send us your comments