As Published In
Oracle Magazine
January/February 2008

FEATURE


Information Comes Home

By David Baum

Oracle Database 11g manages all your enterprise data.

Every business depends on the ability to capture, analyze, and share information efficiently. The need to establish unified, enterprisewide policies for managing this information is leading IT departments to Oracle Database 11g, which can manage all types of data—from traditional business information to office documents, XML, and both 2-D and 3-D spatial information.


These capabilities are making Oracle Database 11g a popular choice to power transaction processing, data warehousing, and content management applications. "Our objective with Oracle Database 11g is to provide a single platform to store, manage, back up, and secure all your data," says Vishu Krishnamurthy, Oracle's senior director of SQL, text, and XML development. "We're delivering a single repository for both data and metadata, much of which is currently stored in file systems and content management systems and on unmanaged PCs."

An Evolving Architecture

In addition to improvements in performance, scalability, availability, security, and management efficiency, Oracle Database 11g includes many new features that support all your enterprise data. For example, Oracle Database 11g's Spatial option provides native semantic technology, which identifies and creates connections between disparate pieces of data—such as a company's customers, orders, and products—creating new, actionable insights. These capabilities are of particular importance to Metatomix, a leading provider of semantic solutions in the government, financial services, manufacturing, and life sciences sectors.

"I'm not just looking for solutions to relational data storage but for a reliable repository for all types of information," says Colin Britton, chief technology officer at Metatomix. "With an Oracle database, we can manage our data, our configuration information, and all of our code in a single repository. We get all the benefits of backup, management, failover, clustering, and so forth, without supporting different strategies for different parts of the application."

Metatomix also favors Oracle Database 11g for its native Resource Description Framework (RDF) storage capabilities. "Oracle supplies a semantic layer within the database, and Oracle is the only database vendor that offers native support for RDF datatypes and OWL [Web ontology language] ontologies," adds Britton. "This makes it a good technical option for us and our customers."

For example, a U.K.-based financial services organization hired Metatomix to speed up data-intensive tasks related to the processing of mutual fund trades. The bank needed a business intelligence application that could understand the meaning of the information it was gathering, which led the company to semantic technology.

Britton says that the Oracle-based solution helps this financial services organization verify fund prices and identify data anomalies. "Financial services firms have massive amounts of data, governed by complex business processes and analytics," he says. "Semantic technology offers insight whenever exceptions occur, and Oracle has the right database platform for maintaining the information."
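To picture how semantic technology connects disparate pieces of data, it can help to think of RDF as a collection of subject-predicate-object statements, often called triples. The following is a minimal sketch in plain Python, not Oracle's RDF interface; the customers, orders, products, and function names are hypothetical and exist only for illustration.

```python
# Illustrative only: a tiny in-memory triple store, not Oracle's RDF API.
# All identifiers and data below are hypothetical.
from typing import Optional

Triple = tuple[str, str, str]

TRIPLES: set[Triple] = {
    ("customer:acme", "placed", "order:1001"),
    ("order:1001", "contains", "product:widget"),
    ("customer:zenith", "placed", "order:1002"),
    ("order:1002", "contains", "product:widget"),
    ("order:1002", "contains", "product:gear"),
}


def match(
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> list[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def customers_who_bought(product: str) -> list[str]:
    """Follow the links product <- order <- customer across two triples."""
    orders = {s for s, _, _ in match(p="contains", o=product)}
    return sorted(s for s, _, o in match(p="placed") if o in orders)
```

In this sketch, `customers_who_bought("product:widget")` walks from a product to its orders and then to the customers who placed them. It is the same kind of connection between customers, orders, and products described above, on a toy scale.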

Reducing Performance Overhead

Organizations see the wisdom in consolidating information into a professionally managed database; however, there have been concerns about the performance overhead required to store and retrieve large objects (LOBs). To address the issue of LOB performance, Oracle Database 11g includes Oracle SecureFiles, a new and improved storage infrastructure for LOB datatypes.

Oracle partner TUSC has been working with a large government client testing Oracle SecureFiles. Based in Chicago, TUSC is a consulting firm that helps companies optimize their investments in Oracle technology and applications.

"Oracle SecureFiles enables the customer to manage unstructured data in the same way as structured data," says Richard J. Niemiec, CEO of TUSC and former president of the International Oracle Users Group. "It gives the same security, backup, recovery, encryption, flashback technology, locking, clustering technology, partitioning. . . . All these mature database features can now be brought to bear on all types of data."

While some companies have had reservations about putting unstructured information into a relational database for performance reasons, Niemiec believes that the new functionality of Oracle Database 11g with Oracle SecureFiles can resolve these concerns. "Oracle SecureFiles uses entirely different technology for locking, space management, and disk storage than former versions of the database," he says. "Oracle SecureFiles is not only much faster than traditional LOBs, it's faster than a file system. This is one of the most compelling reasons to move to Oracle SecureFiles and Oracle Database 11g."

During its testing, TUSC found that Oracle SecureFiles was 22 times faster than traditional LOBs when reusing deleted space. TUSC also measured PL/SQL reads as 6 times faster and SQL*Loader operations as nearly 3 times faster. "All of the performance issues that might have prevented people from storing unstructured content in the database have been addressed," Niemiec says.

A Remodeled Storehouse for XML

As a markup language capable of describing all kinds of information—including structured, unstructured, and semi-structured data—XML appeals to many types of organizations. Oracle was the first commercial relational database vendor to offer native XML capabilities within a relational database. "Oracle made XML a fundamental datatype with Oracle9i Database Release 2," says Oracle's Krishnamurthy.

Oracle Database's XML capabilities caught the ear of Don Janik, senior technical manager at Warner Music Group, home to well-known record labels such as Asylum, Atlantic, and Elektra. Last year, Warner Music Group implemented the XML DB feature in Oracle Database 10g to collect, validate, and track critical sales information from its business partners. The music company has also tested the new XML capabilities in Oracle Database 11g.

Warner Music Group looked at various solutions for handling XML data. The company chose the Oracle solution because of the way Oracle implements structured storage. "Based on the XML schema, it is not necessary for Oracle XML DB to store XML tag names when storing the contents of XML documents," Janik says. "This can significantly reduce the storage space required."

Snapshots



Metatomix

 Location: Boston, Massachusetts
 Industry: High technology
 Employees: 46
 Oracle products: Oracle Database, Oracle Application Server

Warner Music Group

 Location: New York City
 Industry: Entertainment
 Employees: 4,000 (2006)
 Oracle products: Oracle Database, Oracle XML DB, Oracle SQL*Loader

U.S. Army Corps of Engineers Engineer Research and Development Center/Cold Regions Research and Engineering Laboratory

 Location: Hanover, New Hampshire
 Industry: Government
 Employees: 2,000
 Oracle products: Oracle Database, Oracle Application Server, Oracle Portal, Oracle Web Services, Oracle Spatial

Oracle XML DB is commonly used for storing, retrieving, and managing massive volumes of XML data. XML documents can be stored either as character large objects (for unstructured and semi-structured documents such as Microsoft Word documents) or as object-relational datatypes (for structured documents such as purchase orders).

This functionality enables XML documents to be accessed using industry-standard SQL, XML, and file/folder interfaces. Moreover, says Oracle's Krishnamurthy, Oracle Database 11g introduces a new storage representation and a new indexing method for XML data: the Binary XML datatype and XMLIndex. Binary XML improves performance and storage efficiency. In combination with XMLIndex, it can provide up to a 15-fold performance improvement when accessing XML documents.

Warner Music Group uses Oracle XML DB to expedite the loading and validation of digitally formatted sales information. In one test, Janik's team processed 50,000 sales transactions in 15 minutes. This is partly due to the unique architecture of Oracle Database. When an XML schema is registered, Oracle XML DB generates a set of SQL objects that correspond to complex types defined in that schema. XPath expressions, sent to Oracle XML DB functions, are translated to SQL access methods that operate directly against the underlying objects.
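For readers unfamiliar with XPath, the sketch below shows what path expressions against a structured document look like. It uses Python's standard-library ElementTree module, which supports only a limited XPath subset, and it is not Oracle XML DB. The purchase-order document and its element names are hypothetical.

```python
# Illustrative only: uses Python's standard library, not Oracle XML DB.
# The purchase-order document and its element names are hypothetical.
import xml.etree.ElementTree as ET

PURCHASE_ORDER = """
<PurchaseOrder id="PO-1">
  <Customer>Example Records</Customer>
  <LineItems>
    <LineItem sku="A100"><Quantity>2</Quantity><UnitPrice>9.50</UnitPrice></LineItem>
    <LineItem sku="B200"><Quantity>1</Quantity><UnitPrice>20.00</UnitPrice></LineItem>
  </LineItems>
</PurchaseOrder>
"""

root = ET.fromstring(PURCHASE_ORDER)

# Path expressions address parts of the document by its structure.
customer = root.findtext("Customer")
skus = [item.get("sku") for item in root.findall("./LineItems/LineItem")]

# ".//LineItem" finds line items at any depth below the root element.
total = sum(
    int(item.findtext("Quantity")) * float(item.findtext("UnitPrice"))
    for item in root.findall(".//LineItem")
)
```

Here the paths are evaluated by walking an in-memory tree. In the approach described above, by contrast, the database translates such expressions into SQL access against the objects generated from the registered schema.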

"By abstracting the storage model through the use of the XMLType datatype, and providing a set of operators that use XPath to perform operations against XML documents, Oracle XML DB lets us switch between structured and unstructured storage and to experiment with different forms of structured storage without affecting the application," Janik says.

This is particularly important as the music industry prepares for compliance with Recording Industry Association of America (RIAA) standards for electronic messages. The RIAA is proposing use of the Digital Data Exchange XML-based message formats for processing payments and royalties among music service providers, artists, publishers, and composers. Because of the innovative work being done by Janik and his team, Warner Music Group will be well positioned to take advantage of these new standards.

Integrating Spatial Data

Scientists working for the U.S. Army Corps of Engineers rely on Oracle to store all kinds of data, including geospatial and multimedia information.

"Oracle Database 11g helps us pull together and process very large sets of both unstructured and traditional structured data," says Michael Smith, a physical scientist who works at the Cold Regions Research and Engineering Laboratory of the U.S. Army Corps of Engineers' Engineer Research and Development Center.

Smith says that one application that stands to benefit from the new features of Oracle Database 11g is the Operation and Maintenance Business Information Link Regulatory Module (ORM2), a Web-based system for issuing permits for, and tracking, development projects in and around public wetlands. Anyone who wants to develop on waters of the United States is required to apply for a wetland permit. The Corps uses ORM2 to track the development and its associated impact on the landscape.

According to Smith, this permit data used to be stored in database columns as x-y coordinate pairs. The Corps recently moved it to Oracle Spatial to gain point, line, and polygon storage capabilities. "Spatial storage maintains the spatial indexes, including the type of geometry, the projections, the tolerances, and the whole series of coordinate pairs," he says. "This lets us do a lot more work inside the database."


The Corps makes this data available through Web Feature Services (WFS), making it easier to enter data and share it with other government agencies. Oracle Database 11g can dispatch those Web services directly from the database, thus simplifying the infrastructure.

"Being able to store the spatial indexes directly in the database, at the transactional level, enables us to edit the polygons through the WFS transactions," says Smith. "It's a standardized format, so we don't have to develop and maintain as much code."

The center is also developing database applications that use the 3-D point cloud storage and indexing functions of Oracle Database 11g, which allow point clouds to be stored directly in the database. (A point cloud is a set of three-dimensional points describing the outlines or surface features of an object.) The laboratory uses this type of data structure for recording measurements that come from its light detection and ranging (lidar) instruments.

"Up until now, such data had to be stored in flat text files, because databases had no way of tying together the data sets so that we could run calculations against them as a single entity," says Smith. "Oracle Spatial 11g can store a point cloud as a single object. This means that we can write simple queries to do line-of-sight, data-point-intensity, or nearest-neighbor calculations without even moving the data out of the database."
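To make the nearest-neighbor idea concrete, here is a minimal brute-force sketch in plain Python. It is not Oracle Spatial's point cloud interface, and the sample coordinates are made up; it only illustrates the kind of calculation Smith describes running against a point cloud treated as a single object.

```python
# Illustrative only: brute-force nearest neighbor in plain Python.
# This is not Oracle Spatial's point cloud API; the sample points are made up.
import math

Point = tuple[float, float, float]


def nearest_neighbor(cloud: list[Point], query: Point) -> Point:
    """Return the point in the cloud closest to the query point."""
    if not cloud:
        raise ValueError("point cloud is empty")
    # math.dist computes the Euclidean distance between two points.
    return min(cloud, key=lambda p: math.dist(p, query))


cloud = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 2.0)]
```

Calling `nearest_neighbor(cloud, (0.9, 1.2, 0.8))` returns `(1.0, 1.0, 1.0)`. A scan like this checks every point, so it would not scale to real lidar data sets, which is one reason spatial indexing inside the database matters.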

Moving forward, the agency plans to use Oracle Database 11g to merge business and technical data into a common repository. Like thousands of other Oracle customers, Smith and his team have learned an important lesson: To simplify management and gain meaningful insight into all your information, you need to be able to store, manage, and analyze all types of data in a cohesive way.

 

Providing Better Data Management



Until recently, data storage was managed by individual company policy. However, new government regulations and guidelines such as Sarbanes-Oxley and HIPAA in the United States and the Data Protection Directive in the European Union now specify increased data storage, retention, privacy, and protection requirements. As a consequence, the amount of information a company must store has vastly expanded. The challenge for companies is to understand how their data evolves, determine how it grows, monitor how its usage changes over time, and decide how long it should survive.

Richard J. Niemiec, CEO of TUSC, favors Oracle Database 11g for its ability to store massive amounts of information. "We're in the midst of a data explosion, and most companies are seeing their data needs rise exponentially," he says. "Oracle Database 11g can store 8 exabytes of information—8 million terabytes."

Oracle Information Lifecycle Management (ILM) can help solve these challenges of data management and storage. Oracle Information Lifecycle Management Assistant helps administrators design and implement an ILM strategy that aligns partitioned data to appropriate storage tiers, keeping more data online for longer in a cost-efficient manner. Oracle Database 11g offers advanced compression and partitioning capabilities that enable companies to implement an ILM strategy using multitier storage.

On average, the data that companies must store triples every two years. Oracle Advanced Compression in Oracle Database 11g helps organizations reduce disk storage requirements by compressing data stored on disk. It can lower costs for any type of data, including structured data and unstructured data such as documents, images, and multimedia. Oracle Advanced Compression can be used with any type of database application without application changes, and the disk space savings cascade throughout the data center.

Oracle Partitioning enables tables and indexes to be split into smaller, more manageable components. Oracle Database 11g offers several partitioning methods, including range, interval, reference, and list, in addition to composite partitions of two methods such as order date (range) and region (list) or region (list) and customer type (list). This functionality enables faster query performance and administrative operations by executing at the smaller partition level. Oracle Partitioning doesn't require any application changes, and Oracle Database 11g offers a partitioning advisor to help administrators implement the right partitioning methodology for their business model.
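The composite example above, order date (range) combined with region (list), can be modeled in a few lines of plain Python. This is a conceptual sketch only, not Oracle syntax or behavior: the partition names, date bounds, and region groupings are hypothetical, and each date bound is assumed to be an exclusive upper limit.

```python
# Illustrative only: models composite range-list partitioning in plain Python.
# Partition names, bounds, and regions are hypothetical, not Oracle syntax.
from bisect import bisect_right
from datetime import date

# Range partitions on order date: each bound is an exclusive upper limit.
RANGE_BOUNDS = [date(2007, 7, 1), date(2008, 1, 1), date(2008, 7, 1)]
RANGE_NAMES = ["h1_2007", "h2_2007", "h1_2008"]

# List subpartitions on region.
REGION_LISTS = {"americas": {"US", "CA", "BR"}, "emea": {"UK", "DE", "FR"}}


def partition_for(order_date: date, region: str) -> str:
    """Return the partition and subpartition that would hold a row."""
    # The first bound strictly greater than the date selects the range partition.
    idx = bisect_right(RANGE_BOUNDS, order_date)
    if idx >= len(RANGE_NAMES):
        raise ValueError(f"no range partition for {order_date}")
    for sub, members in REGION_LISTS.items():
        if region in members:
            return f"{RANGE_NAMES[idx]}/{sub}"
    raise ValueError(f"no list subpartition for {region}")
```

In this model, `partition_for(date(2007, 3, 15), "US")` yields `"h1_2007/americas"`. A query restricted to one half-year and one region would need to touch only that single component, which is the intuition behind executing at the smaller partition level.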

"Whether you call it unstructured data, enterprise information, or content management, it's all about creating a richer experience for the end user, both on the internet and within the enterprise," says Vishu Krishnamurthy, Oracle's senior director of SQL, text, and XML development. "It used to take a lot of coding to combine various types of information. By consolidating data in a single repository, Oracle Database 11g enables a truly integrated information platform."

 



David Baum
(david@dbaumcomm.com) is a freelance business writer based in Santa Barbara, California.
