Know Your UML with XML
By Robert Wigetman and Jurgen Moortgat
Learn how Oracle XML DB can increase ROI on UML models of software systems.
Software technology has evolved to the point where big systems have become so complex that they are no longer simply specified in human language and programmed in computer language. Indeed, using UML (Unified Modeling Language) to model new software has become a de facto standard approach to building big systems.
The basic outputs of a software system model are code/documentation artifacts containing a partial view of the knowledge in the system. Today's modeling tools perform quite well at code generation and reverse engineering, but they are weak in their introspective capabilities, supporting only very simple query facilities that depend entirely on the features of the specific modeling tool at hand. At Eurocontrol-CFMU we have been discouraged by these limitations. Indeed, when we have devoted great effort to modeling, it has been difficult, if not impossible, to perform anything but the simplest queries on the contents of the models.
Suppose we want to audit our system and find out what datatypes are used in the system's public interfaces. We would like the model to respond to a query of the form "For each class with stereotype 'interface', in a package whose name is of the form '*public*', display the list of all datatypes of the parameters for all operations." Furthermore, we would like to use a standard query language such as SQL, based on the UML model's conceptual elements, and thereby save our staff from having to learn a UML tool's proprietary scripting language.
While developing UML models, we realized that the limits of the modeling tool chain were holding us back. We kept coming back to the query question and found ourselves saying, "If only we had a database!" This is how we began to explore the transformation of UML models into software system databases. Our business is to develop high-availability air-traffic-flow and capacity management systems; we are not a big software house and cannot devote serious development effort to the elaboration and maintenance of a modeling tool chain.
This article describes our transformation solution: how we used Oracle Database 10g to turn a UML model into a database of software system knowledge and thus made the information readily accessible. We provide a step-by-step procedure to allow readers to apply this same approach in their environment and thus increase the benefit of their UML models.
The steps in transforming our UML model into a software system knowledgebase constitute a remarkable procedure requiring no programming. It involves loading a UML model into Oracle XML DB and thereby provides full SQL query facilities—in other words, a genuine software system knowledgebase.
We must stress that no software development is required to support this transformation. Indeed, all the tooling required is readily available in the modeling tools themselves and in commercial off-the-shelf (COTS) XML editors (such as XMLSpy), and the core of the knowledgebase is Oracle Database 10g's object-relational kernel. Basically, anyone who makes UML models and runs Oracle Database 10g can apply the same transformation process.
Figure 1 presents the overall architecture of the transformation process as well as the data flow involved in the production of reports and other potential outputs of the software system knowledgebase.
This article presents the process we developed to transform a UML model into an Oracle XML DB-based software system knowledgebase. This process comprises the following steps:
Examples in this article are taken from our model files, which may not correspond exactly to those generated by a different modeling toolset.
Step 1: Extract an XML Version of the UML Model
In this step, the UML modeling tool creates a single XML document containing the contents of the UML model. This document adheres to the XML Metadata Interchange (XMI) standard. At the time we were working on this project, only the DTD-based version, 1.2, of the XMI standard was available. Since then the Object Management Group (OMG) has released version 2.0, which is XSD-based. Having a modeling tool that supports the new schema-based XMI 2.0 standard is advantageous, but it is not necessary. Any XML version of the UML model is sufficient for the rest of the transformation process.
The XML document provided by the UML tool is the input to the next step in the process. Let's name the document model.xml.
Step 2: Generate the XML Schema Definition
Before you upload the model.xml file into Oracle XML DB, you must register the file structure. To do this, provide Oracle XML DB with a schema definition file, which describes, in a useful manner, the structure of the XML document.
We were not able to find a useful schema definition corresponding to the XMI 1.2 standard, so we generated a schema definition from the XML file itself. There are many COTS tools that can do this—each with its own set of minor issues. The tool we used for this operation generated XSD that contained the colon (:) character embedded in element names. Because this is invalid, we globally replaced the embedded colons with underscores to obtain a well-formed XSD.
After you create and clean up—if necessary—the model.xsd file, perform the following operations:
Step 3: Create the XML DB Repository Directory Structure
Before registering the XSD document, create some directory structures in the XML DB repository. In this example, we will create folders in a /public/EuroControl hierarchy to hold the XSD and XML files.
We use the code in Listing 2 to create the /EuroControl folder and the /xsd and /xml subfolders.
Code Listing 2: Creating the /EuroControl folders and /xsd and /xml subfolders
declare ignore boolean; begin ignore := dbms_xdb.createFolder('/public/EuroControl'); ignore := dbms_xdb.createFolder('/public/EuroControl/xsd'); ignore := dbms_xdb.createFolder('/public/EuroControl/xml'); commit ; end; /
These folders can now be used to upload our Eurocontrol-specific XSD and/or XML documents.
You can use any of the following protocols to upload the XSD and XML documents: FTP, HTTP, WebDAV, or SQL*Net (using PL/SQL). We use FTP, because this allows us to explicitly visualize any Oracle Database errors that may occur during the upload:
ftp> open <machine-name or ip-address> 2100 ftp> user ec ftp> cd /public/EuroControl/xsd ftp> put model.xsd ftp> bye
The model.xsd document is now uploaded and ready for registration in Oracle XML DB.
Step 4: Register the XSD
Registering an XSD document in Oracle XML DB will result in the generation of an object-relational structure that is identical to the XML document contents described by the XSD.
Register an XSD document in Oracle XML DB with a call to DBMS_XMLSCHEMA.REGISTERURI , as shown in Listing 3.
Code Listing 3: Registering the XSD file with DBMS_XMLSCHEMA.REGISTERURI
begin dbms_xmlschema.registerURI (schemaURL => 'http://xmlns.oracle.com/xdb/model.xsd' ,schemaDocURI => '/public/EuroControl/xsd/model.xsd' ,genTables => true ) ; end;
This call to DBMS_XMLSCHEMA.REGISTERURI takes three important arguments:
schemaURL: This URL points to the XSD that will be used to parse the XML document. This is the same URL that must be used in the model.xml file that will be uploaded into Oracle XML DB. By convention, the URL used is http://xmlns.oracle.com.xdb/< name-of-xsd >; in this case, the < name-of-xsd > is model.xsd.
schemaDocURI: This URI points to the physical location of the XSD, so that Oracle Database can parse it, and if needed, generate the required object-relational structure.
genTables: Set this Boolean to TRUE to generate the physical object-relational structure.
Step 5: Upload the XML Version of the Model (model.xml) into Oracle XML DB
At this point, we are ready to upload the model.xml document. An uploaded XML document can be stored in Oracle XML DB in two ways:
We have already registered the XSD on which the model.xml document is based, so when we upload the document, Oracle Database will automatically break it up and store every component in its corresponding Oracle type.
We will again use FTP for the upload, to catch any potential server errors:
ftp> open <machine-name or ip-address> 2100 ftp> user ec ftp> cd /public/EuroControl/xml ftp> put model.xml ftp> bye
At this point, the entire UML model as contained in the model.xml document should be stored in the previously defined main table: MODEL_NAME .
Use the following query to test this:
select count(*) from MODEL_NAME;
It should return the value 1 . If this is not the case, the model.xml document we uploaded did not correspond to any registered XSD for which an object-relational structure had been generated.
Step 6: Define Atomic Views
At this point, the contents of the XML document have been uploaded, broken down, and distributed into the object-relational structure. In a sense, the software system knowledgebase now exists, so the creation of database views is not absolutely necessary.
On the other hand, directly querying the object-relational structure implies searching through the XML document structure. Ideally, we would like to query in terms of the abstract UML concepts, such as packages, classes, and associations, and not have to worry about the structure of the XML document. The use of views allows us to provide an abstraction layer and effectively encapsulate the XML structure behind the well-known UML modeling concepts.
We have chosen to create atomic views—queries with no WHERE clause—to provide the basic building blocks required for the construction of more-complex queries. This gives end users great flexibility but also puts the burden of join design on them.
We have also chosen to use materialized views. A materialized view stores the result of the view-query at creation time. This gives obvious performance advantages but requires that the view be refreshed if the database changes. In our case, the database changes only when we upload a new version of the model, such as with a new software release, so the performance advantages of materialized views win out over the need to refresh the view.
Creating atomic view queries depends on the structure of the XSD used. Listing 4 contains an example of the atomic view query used to obtain all information about packages in one of our models.
Code Listing 4: Creating a materialized view by using an atomic query
create materialized view ec_packages build immediate refresh complete on demand as select extractValue(value(p), '/UML_Packagefirstname.lastname@example.org') id , extractValue(value(p), '/UML_Packageemail@example.com') uuid , extractValue(value(p), '/UML_Package/@name') name , extractValue(value(p), '/UML_Package/@visibility') visibility , extractValue(value(p), '/UML_Package/@isSpecification') isSpecification , extractValue(value(p), '/UML_Package/@isRoot') isRoot , extractValue(value(p), '/UML_Package/@isLeaf') isLeaf , extractValue(value(p), '/UML_Package/@isAbstract') isAbstract , extractValue(value(p), '/UML_Package/@namespace') namespace from MODEL_NAME c , table(xmlsequence(extract(value (c), '/XMI/XMI.content/UML_Model/UML_Namespace.ownedElement/descendant::UML_Package' ))) p /
The atomic views required depend on the type of querying that needs to be done. Some suggestions for views are
Step 7: Query the Knowledgebase via SQL
At last, we have reached the raison d'être of this entire procedure. It's time to get some information out of the database. First, let's look at Listing 5—a simple query that retrieves the names of all classes and the names of the package in which they are located.
Code Listing 5: Simple query of classes and their packages from the knowledgebase
select c.name class_name , p.name package_name from ec_classes c -- view to retrieve all class data , ec_packages p -- view to retrieve all package data wherec.namespace = p.id(+) -- join classes with packages /
Who would suspect that behind this query lies a complex mechanism of XPath functions on an object-relational structure?
Listing 6 contains a more interesting query. In our models, we use diagrams to give an overview of the subsystem interfaces. We would like to generate a dictionary of interface types. To do this, we need to find all the classes that are used as types for parameters of operations belonging to classes on certain diagrams. The query in Listing 6 is a slightly simplified version of our query that returns the results for all diagrams.
Code Listing 6: Complex query of the modeling objects and their relationships
/* Get all info about classes */ select dia.name diagram_name , c.name class_name , o.operation_name operation_name , op.param_name param_name , d.name data_type /* Substr is required to get data as varchar2 to be compliant with second part of the query (union part). "Select null from x" is not compatible with a clob type. Therefore, the clob must be converted to a varchar2. */ , dbms_lob.substr(de.description -- pointer , dbms_lob.getlength(de.description) -- amount , 1 -- offset ) data_type_desc from ec_diagrams dia -- diagrams , ec_classes c -- classes args , ec_classes c2 -- classes used as types to args , ec_classes_operations o -- classes operations , ec_classes_operations_par op -- args of operations , ec_datatypes d -- all data types , ec_descriptions de -- descriptions of classes where dia.element_subject = c.id and c.id = o.class_id (+) and o.operation_id = op.operation_id (+) and op.param_type = d.id (+) and d.name = c2.name (+) and c2.id = de.modelelement (+) and de.tag (+)='documentation' /* Get all info about associated classes as well */ UNION ALL select dia.name diagram_name , c.name class_name , null operation_name , null param_name , null data_type , null data_type_desc from ec_diagrams dia , ec_as_classes c where dia.element_subject = c.id order by 1, 2, 3 /
It isn't easy to get information out of UML models by using current modeling environments. Querying model data is nevertheless possible by means of the transformation we have described. We have seen that it is quite easy to pipe together current industry standards to transform a UML model into a database of software system knowledge in Oracle Database 10g, and you, too, can do this without any programming and with nearly no work. Yet the long-term return on investment is very high indeed.
We must thank the OMG and the W3C for their work on the underlying standards that make this possible. We also thank the designers of Oracle XML DB for creating the infrastructure.
Robert Wigetman (firstname.lastname@example.org) is responsible for software development project support at Eurocontrol-CFMU (www.eurocontrol.int), the Brussels-based European organization for the safety of air navigation. Jurgen Moortgat (email@example.com) is a software consultant at Oracle Consulting Services of Belgium (www.oracle.be).