XML
Integrating Portable XML Data with Portable C/C++ Code
by Zhaoqing Wang & Harry H. Cheng
An embeddable C/C++ interpreter may be the answer for overcoming the limitations presented by writing XML-based applications with nonportable C/C++ code.
XML is changing the world of information sharing and exchange. The XML standard allows users to clearly define their own data and documents in an open, platform-, vendor-, and language-neutral manner for tasks such as electronic data interchange, data management, and publishing. Its self-describing, flexible tags mark the start and end of related data blocks and build a hierarchy of related data objects called elements, which may be a database, pieces of a Web page (links, numbers, metadata, text, and images), or the contents of a spreadsheet. This structure makes data reusable, easily derivable, and reconfigurable because XML separates content from presentation with context encapsulation.
Because XML is a well-formed markup language, a programming technology is required in order to perform processing-related tasks such as parsing, generating, manipulating, and validating XML data.
For that reason, C and C++ are commonly used for writing XML-based applications. However, either of those languages presents challenges: although C/C++ code is theoretically portable, the associated compilation and linking processes are not. Hence, most C/C++ code needs to be compiled and linked by different means on different platforms. C/C++ code that is truly portable is not generated and executed dynamically; therefore, in those situations Java and a Java Virtual Machine are commonly used for processing XML data.
In this article, we describe and demonstrate the Ch language environment, an embeddable C/C++ interpreter for cross-platform scripting, shell programming, numerical computing, network computing, and embedded scripting, together with the Ch Package for Oracle C/C++ XML Developer's Kit (XDK). The Ch XML Package is designed to integrate portable XML data and portable C/C++ code; it allows a large body of existing applications and technologies based on C/C++ to work seamlessly with XML documents.
Characteristics
The C-compatible Ch language environment is an embeddable interpreter for cross-platform C/C++ scripting. Ch supports all features in the ISO 1990 C standard and most new features added in ISO C99, such as complex numbers, variable-length arrays (VLAs), binary constants, and IEEE 754 floating-point arithmetic. The language also supports classes, objects, and encapsulation in C++ for object-based programming. Major characteristics include:
- Interpretive: C programs can be executed in Ch without tedious compile/link/execute/debug cycles.
- Interactive: One can run the C code interactively, entering the code line by line. Thus, Ch can be easily used to test new functions. It is also a good environment for real-time interactive computing.
- Embeddable: Ch can be embedded in other application programs, hardware, and handheld devices, relieving users from developing and maintaining proprietary scripting languages across different platforms.
- Numerical computing: In addition to supporting all C types such as char, int, float, double, and the new type complex and VLA, Ch treats a computational array as a first-class object. Many high-level numerical functions, such as differential equation solving, integration, and Fourier analysis along with 2D/3D plotting make Ch a powerful language environment for solving problems in engineering and science.
- Very high level: Ch bridges the gap between low-level languages and very high-level languages (VHLLs). As a superset of C, Ch retains low-level language features. But as a VHLL, it makes programs easily portable among different platforms.
- Object-based: Ch supports classes, objects, and encapsulation in C++ for object-based programming with data abstraction and information hiding, as well as simplified I/O handling.
- Text handling: Ch has advanced text-handling features such as built-in string data type and foreach-loop. These features are especially useful for system administration, shell programming, and web-based applications. This feature is also useful for development of portable code to handle portable data.
- Cross-platform shell: Ch provides a universal shell for the convenience of users. It can be used as a login command shell similar to C-shell, Bourne shell, Bash, or Korn Shell in Unix, as well as MS-DOS shell in Windows.
- Safe network computing: Safe Ch is designed from scratch with different secure layers, such as sandbox, programmer/administrative control, suppressed pointers, restricted functions, automatic memory management for string type, and auto array bound checking, effectively addressing security problems for network computing.
- Portable: A Ch program can run across different platforms including Windows and Unix. A programmer can develop and maintain programs on one machine and then deploy them to all platforms supported by Ch.
- Broad choice of libraries: All existing C libraries and modules can be part of the Ch libraries. Therefore, the potential of Ch libraries is almost unlimited. For example, Ch supports POSIX, TCP/IP socket, Winsock, Win32, X11/Motif, GTK+, OpenGL, ODBC, LAPACK, LDAP, NAG statistics library, Intel OpenCV for computer vision and image processing, National Instruments' NI-DAQ and NI-Motion, and so on.
- Web-enabled: With development modules such as classes for Common Gateway Interface (CGI) for Web servers, Ch allows rapid development and deployment of Web-based applications and services.
Structure
Contents of the Ch XML Package for Oracle XDK (available for Windows) include:
- choraxml/demos: original Oracle XDK demos in C/C++, ready to run in Ch
- choraxml/dl: Ch dynamically loaded library
- choraxml/lib: Ch functions
- choraxml/include: Ch header files
- choraxml/bin: Oracle XDK dynamic library and commands
- choraxml/src: source code to develop the Ch XML Package
To install Ch XML for Oracle XDK, follow the steps below.
- Ch is required to run Ch XML. If Ch has not been installed in your computer, download and install Ch from http://www.softintegration.com.
- Start Ch.
- Download the file choraxml_v1.0.0.Window.tar.gz from http://iel.ucdavis.edu/projects/chxml/. Use the following command in the Ch command shell to uncompress the file and extract the archive.
gzip -cd choraxml_v1.0.0.Window.tar.gz | tar -xvf -
- Follow the instructions in the Readme file to install the choraxml package and set the environment variables ORA_NLS33 and ORA_XML_MESG, which are required for Oracle XDK.
- Go to the directory of demos, and type a file name such as DOMNamespace.c to run the program interpretively.
Integration of Oracle XDK with Ch
One of Ch's most important features is that binary static or dynamic C/C++ libraries can be easily integrated with the language environment without recompilation. For example, with the Ch XML Package for Oracle C/C++ XDK, a C/C++ application using the Oracle XML library can be executed interpretively across platforms. Such applications can also be executed through the Internet.
Integration with DOM API
The XML Document Object Model (DOM) API creates a tree structure in memory to store the XML document's data. Typically, a DOM-based XML C/C++ application has no callback functions. Therefore, it is relatively easy to create a Ch binding to the Oracle XDK's DOM API.
Figure 1 illustrates how the Oracle XDK integrates with Ch. The architecture comprises three layers: The top one is the user's applications, which are the existing applications using the C/C++ XML library. The middle layer is the Ch wrapper, which is the middleware between binary functions and text-based interpretive functions. We developed the Ch wrapper or Ch binding for Oracle XDK; the source code for developing the Ch wrapper is also available in the distribution. This open source can be used to port the Ch binding to different platforms and different versions of the XDK. The bottom layer is the original C/C++ binary provided by the XDK.
Figure 1: Architecture for integration of DOM with Ch
With the Ch binding for Oracle XDK, the user's applications can be executed interpretively across platforms without compilation. When an XML application calls an XML function, the program calls the XML-Ch function. The XML-Ch function typically calls the XML Ch Dynamically Loaded Library (CHDL) function, which in turn calls the corresponding binary function in the Oracle XDK. For example, the application program DOMNamespace.c calls the function xmlparse(). When DOMNamespace.c is launched, the function file xmlparse.chf is used. xmlparse.chf calls the binary function xmlparse_chdl() through the dynamically loaded library, which calls the binary function xmlparse(). The sample code of xmlparse.chf is listed below.
uword xmlparse(xmlctx *ctx, oratext *uri, oratext *incoding, ub4 flags) {
    void *fptr;
    uword retval;

    fptr = dlsym(_ChOraxml_handle, "xmlparse_chdl");
    if(fptr == NULL) {
        fprintf(_stderr, "Error: %s(): dlsym(): %s\n", __func__, dlerror());
        return -1;
    }
    dlrunfun(fptr, &retval, xmlparse, ctx, uri, incoding, flags);
    return retval;
}
Here, _ChOraxml_handle is the handle of the XML Ch Dynamically Loaded Library. The function dlsym() obtains the address of the function xmlparse_chdl() in the Dynamically Loaded Library, and the function dlrunfun() calls xmlparse_chdl() by its address.
The source code of the function xmlparse_chdl() inside the XML Ch Dynamically Loaded Library is as follows.
EXPORTCH uword xmlparse_chdl(void *varg) {
    va_list ap;
    xmlctx *ctx;
    const oratext *uri;
    const oratext *incoding;
    ub4 flags;
    uword retval;

    //get all the arguments passed to the binary function
    Ch_VaStart(ap, varg);
    ctx = Ch_VaArg(ap, xmlctx *);
    uri = Ch_VaArg(ap, const oratext *);
    incoding = Ch_VaArg(ap, const oratext *);
    flags = Ch_VaArg(ap, ub4);

    //call the binary xmlparse() function
    retval = xmlparse(ctx, uri, incoding, flags);
    Ch_VaEnd(ap);
    return retval;
}
The function xmlparse_chdl() obtains the values of its arguments from the function xmlparse() in the user's application layer through the function Ch_VaArg(). It then calls the binary function xmlparse() and returns its value.
Integration with SAX API with Callback Functions
The Simple API for XML (SAX) uses an event-based model to process an XML document; a SAX-based parser invokes methods in C++ or functions in C when markup (such as a start tag or end tag) is encountered. The SAX application defines the callback functions for the XML document, and the Ch wrapper provides the registration for this kind of callback function (shown in Figure 2). When an application calls a function that takes a pointer to a function (a callback function) defined in user space, the application passes the address of the callback function to the Ch function as an argument. Because the callback must be registered in the Ch wrapper, the Ch function passes this address to the CHDL function, which registers the callback in the Ch wrapper. When an event is encountered, the XML binary library function eventually calls this callback function.
Figure 2: Architecture for integration of SAX with callback functions with Ch
For example, the SAX API includes the function xslsetoutputsax(xslctx *xslSSctx, xmlsaxcb *s). Its second argument, of type "xmlsaxcb *", is a pointer to a structure whose member fields are pointers to functions. The function file xslsetoutputsax.chf is similar to xmlparse.chf in the previous section. The source code for the function xslsetoutputsax_chdl() is listed below.
......
static xmlsaxcb saxcb_ch;
....
static sword startDocument_chdl_funarg(void *ctx);
static void * startDocument_chdl_funptr = NULL;
....
EXPORTCH uword xslsetoutputsax_chdl(void *varg) {
    va_list ap;
    xslctx *xslSSctx;
    xmlsaxcb *saxcb;
    xmlsaxcb *saxcb_tmp;
    uword retval;
    ....
    saxcb = Ch_VaArg(ap, xmlsaxcb *);
    ....
    if(saxcb != NULL)
    {
        ....
        startDocument_chdl_funptr = (void *)saxcb->startDocument;
        if(saxcb->startDocument != NULL)
            saxcb_ch.startDocument = startDocument_chdl_funarg;
        ....
    }
    saxcb_tmp = &saxcb_ch;
    retval = xslsetoutputsax(xslSSctx, saxcb_tmp);
    Ch_VaEnd(ap);
    return retval;
}

static sword startDocument_chdl_funarg(void *ctx){
    sword retval = 0;
    Ch_CallFuncByAddr(NULL, startDocument_chdl_funptr, &retval, ctx);
    return retval;
}
The function startDocument_chdl_funarg() is registered with the binary library in place of the user's callback function; it is called when the "start a document" event is encountered. The pointer to the user-defined callback function is obtained through Ch_VaArg(), as a member of the xmlsaxcb structure, and is stored in startDocument_chdl_funptr. The callback function in the user's application layer is then invoked by the function Ch_CallFuncByAddr().
Integration with Oracle Database Using ODBC
Applications using XML often need to access a database such as Microsoft Access or an Oracle database. This task can be easily accomplished using the Ch ODBC Toolkit (shown in Figure 3). An application can call the XML APIs for document handling and the ODBC APIs for database access directly and interpretively.
Figure 3: Integration with database
For example, the Ch code below connects to a database using ODBC. It inserts and deletes records.
/************************************************************************
 * This is an example of using the ODBC API in Ch.
 * Before running this example, create a database (test.mdb) in
 * Microsoft Access and register it in the Windows ODBC driver manager
 * under the data source name "test", which SQLConnect() uses below.
 ************************************************************************/
#include <windows.h>
#include <sql.h>
#include <sqlext.h>
#include <stdio.h>
#include <string.h>

#define TEST_LEN 10

int main() {
    HENV henv;
    HDBC hdbc;
    HSTMT hstmt;
    RETCODE rc;
    UCHAR szDSN[SQL_MAX_DSN_LENGTH+1];
    UCHAR szDescription[255];
    SWORD cbDSN;
    SWORD cbDescription;
    UCHAR szTest[TEST_LEN+1];
    SDWORD cbTest = SQL_NTS;   // szTest holds a null-terminated string
    SDWORD Age = 29;
    SDWORD cbAge = 0;          // length is ignored for fixed-size types

    SQLAllocEnv(&henv);
    SQLAllocConnect(henv, &hdbc);

    // Print the first registered data source and its description
    rc = SQLDataSources(henv, SQL_FETCH_FIRST, szDSN, sizeof(szDSN),
                        &cbDSN, szDescription, sizeof(szDescription),
                        &cbDescription);
    if(rc == SQL_SUCCESS || rc == SQL_SUCCESS_WITH_INFO) {
        printf(" %s \n", szDescription);
        printf(" %s \n", szDSN);
    }

    // Connect to the data source; stop here if the connection fails
    rc = SQLConnect(hdbc, (UCHAR *)"test", SQL_NTS, NULL, SQL_NTS, NULL, SQL_NTS);
    if(rc != SQL_SUCCESS) {
        printf(" invalid handle, connect failed \n");
        SQLFreeConnect(hdbc);
        SQLFreeEnv(henv);
        return -1;
    }
    rc = SQLAllocStmt(hdbc, &hstmt);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, Allocate statement failed \n");

    // Prepare the insert statement
    rc = SQLPrepare(hstmt,
                    (UCHAR *)"INSERT INTO Info (Name, Age) VALUES (?, ?)",
                    SQL_NTS);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, insert failed \n");

    // Bind szTest to column Name
    rc = SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_CHAR,
                          SQL_CHAR, TEST_LEN, 0, szTest, 0, &cbTest);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, bind failed \n");
    // Bind variable Age to column Age; SDWORD matches SQL_C_SLONG
    rc = SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_SLONG,
                          SQL_INTEGER, 0, 0, &Age, 0, &cbAge);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, bind failed \n");

    // Bound buffers are read when the statement is executed
    strcpy((char *)szTest, "You");
    rc = SQLExecute(hstmt);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, Execute failed \n");

    // Release the insert statement and allocate a new one for the delete
    SQLFreeStmt(hstmt, SQL_DROP);
    rc = SQLAllocStmt(hdbc, &hstmt);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, Allocate statement failed \n");

    // Delete the record whose name is 'You'
    rc = SQLPrepare(hstmt, (UCHAR *)"DELETE FROM Info WHERE Name = 'You'",
                    SQL_NTS);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, delete failed \n");
    rc = SQLExecute(hstmt);
    if(rc != SQL_SUCCESS)
        printf(" invalid handle, Execute failed \n");

    SQLFreeStmt(hstmt, SQL_DROP);
    SQLDisconnect(hdbc);
    SQLFreeConnect(hdbc);
    SQLFreeEnv(henv);
    return 0;
}
An Application Example
Oracle XDK includes several C/C++ demo programs for processing XML documents; using the Ch Package, all these C/C++ programs can be executed interpretively without compilation. For example, Figure 4 shows the interactive execution of the C program DOMNamespace.c in a Ch command shell. When a file name is typed in a command shell, the program is executed and its output is displayed as shown in Figure 4. The program DOMNamespace.c can also be executed from an IDE for Ch. Several IDEs, for example UltraEdit and SlickEdit, are readily available for editing and running Ch programs.
Figure 4: Interactive execution of C program DOMNamespace.c.
If Ch is embedded as a scripting engine in an application program, the program will be able to process XML documents using C/C++ scripts dynamically controlled by the application program. In this case, the program is treated as a C script.
XML documents are widely used for Web-based applications and integration. Like Perl, Python, or PHP, interpretive C/C++ scripts can be used in Ch for creating dynamic Web pages. The Ch CGI Toolkit contains four classes (Request, Response, Server, and Cookie) with APIs similar to Active Server Pages and JavaServer Pages. For Web-based applications, Ch CGI scripts typically have the file extension .ch. For example, the program DOMNamespace.ch can be launched by clicking on a hyperlink in a Web browser. It then prints its result, which is displayed inside the Web browser. The program DOMNamespace.ch is based on the program DOMNamespace.c, with the following changes:
#include <cgi.h>
.... //same as DOMNamespace.c
int main()
{
    xmlctx *ctx;
    class CResponse Response;

    Response.setContentType( "text/plain" );
    Response.begin();
    ... //same as DOMNamespace.c
    ...
    Response.end();
    return (ecode ? -1 : 0);
}
Class CResponse encapsulates HTTP-style responses. CResponse::begin() begins sending output, and CResponse::end() ends standard output.
Figure 5 shows a demo page for a Web-based application using the Ch XML Package. The "Function" area contains links to the CGI demo programs; when a link such as "DOMNamespace.ch" is clicked, the corresponding CGI program is executed.
Figure 5: Ch XML Package for Oracle XDK demos Web page
The output, shown in Figure 6, is the same as when the program is executed in a Ch command shell (shown in Figure 4) or in an IDE.
Figure 6: Output from executing C program DOMNamespace.ch through the Web
Conclusion
Integration of portable XML data and portable C/C++ code has excellent potential for efficient document and data processing. Using the C/C++ interpreter Ch and Ch XML Package, applications using Oracle C/C++ XDK can be executed interpretively without tedious compile/link/execute/debug cycles. Furthermore, because Ch is an embeddable C/C++ interpreter, applications can embed Ch as a scripting engine to process XML documents using C/C++ scripts.
Zhaoqing Wang (zqwang@iel.ucdavis.edu) is a postdoctoral researcher at the Integration Engineering Laboratory, University of California, Davis. He specializes in network computing and open architecture software integration.
Harry H. Cheng is a professor and director of the Integration Engineering Laboratory in the Department of Mechanical and Aeronautical Engineering at the University of California, Davis.