Interviews with IT leaders from Xerox Innovation Group, ABB Ltd., Loyola University Chicago, and other organizations yield ten principles for excellent data mining and management, including:

  • Start Small
  • Find the Right Partner
  • Sustain Enthusiasm
  • Find Meaning
As Published In

Profit Magazine
February 2004

COVER STORY

Eureka! Ten Best Things To Do with Your Data

By Kimberlee Roth

When it comes to the deluge of data your enterprise collects about every transaction it makes, you can do one of two things: Pray that Goethe was right, or get smart about it and make the mountains of data work for you instead of against you.

At the beginning of the Digital Age, everyone was intrigued by the idea that, once digitized, the contents of entire libraries could fit onto silicon-based chips no bigger than the head of a pin. Now, however, there are so many data "pins" that it seems no matter where we turn, we are constantly pricked by the volumes and volumes of data we've collected.

Yet the very same digital tools that made it possible for us to gather the information in the first place, when properly managed, also provide the key to mining and interpreting that data in ways that can enlighten and inform almost every business decision you make.

To see just how the best-of-the-best enterprises manage the oceans of data they gather virtually every second of every day, Profit asked four organizations to share the wealth—of information—regarding their best data mining and management practices.

1 START SMALL. ABB, Ltd., a US$18 billion global supplier of power and automation technologies to utilities and industry, divided its 18-month data warehousing initiative into four phases. The first phase was a pilot that, in close collaboration with account managers, culled and cleansed data from 60,000 top-tier customers, explains Gunther Lustig, product manager for Global Information Systems with the Zurich, Switzerland-based firm.

Revenue from these customers increased, despite a market slowdown, he explains, indicating that a larger-scale rollout would indeed be the right thing to do. "Don't start with a big bang," he says. "Let it grow. After you've completed the pilot, ask yourself from a technical and process point of view where the bottlenecks are. You may have to make trade-offs, depending on the granularity of the data you want and the implementation time for determining what needs to change if this is to be rolled out on a larger scale."

Tracy Thieret, a principal scientist with Xerox Innovation Group, a division of Xerox Corporation, which has US$15.8 billion in sales, US$8 billion of which are derived from postsales service, agrees. "Start with a fraction of the data. Testing happens quickly—our results come back in 30 minutes or less most of the time. Oracle Data Mining gives you tools for selecting data so that your sample is random and representative. You don't have to do it manually. Then once you discover interesting relationships in the subset, you can look at all the data to see if it still holds water."
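Thieret's sample-first workflow can be sketched in a few lines of Python. This is an illustration only, not Oracle Data Mining's sampling tool: the records are synthetic, and the field names (pages printed, service calls), the 5 percent sample size, and the fixed seed are all assumptions made for the example.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical service records: (pages_printed, service_calls).
records = []
for _ in range(20_000):
    pages = rng.uniform(1_000, 100_000)
    calls = pages / 10_000 + rng.gauss(0, 2)
    records.append((pages, calls))

# Step 1: explore a 5 percent simple random sample, which is quick to test.
sample = rng.sample(records, k=len(records) // 20)
r_sample = pearson(*zip(*sample))

# Step 2: check whether the relationship still holds on all the data.
r_full = pearson(*zip(*records))

print(f"sample r = {r_sample:.2f}, full-data r = {r_full:.2f}")
```

If the two coefficients are close, the relationship found in the subset is a reasonable candidate for further study. If they diverge, the sample was probably too small or not representative.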

2 THINK ALIGNMENT. Your data strategy should support your business strategy. Which data you collect, which data goes into the warehouse, and which mining experiments you begin with are all dictated by your organization's strategy and the business issues to be addressed. Once you realize the breadth and depth of the data available and what your warehousing and mining tools can do with it, it's easy to say, "Wow, let me at that," explains Mary Malliaris, an associate professor of information systems at Loyola University Chicago. Consequently, it's also easy to lose sight of your original business challenges.

With business alignment, if your business processes change (and they inevitably will) it will cost less and take less time for your infrastructure to adapt. How important is alignment? In CIO magazine's 2003 State of the CIO survey, CIOs cited aligning IT with business goals, along with prioritizing the demands of various business units, as their greatest challenge for the coming year. But most companies aren't able to achieve the alignment they desire—in October 2003, CFO Publishing, the publisher of CFO magazine, printed results of a survey, sponsored by Novell, that asked chief financial officers to rate the alignment between IT and business goals in their companies. The research, reported in Computerworld, found that 44 percent of CFOs described the alignment of business and IT as weak, and 4 percent said alignment of IT and business did not exist.

How can businesses achieve alignment? The most important variable is the CIO, who needs to be more than just a technologist and good manager. He or she must be part visionary, able to anticipate changes in business strategy before they happen, and set the stage for those changes to be reflected in IT. And the CIO must percolate that vision down through the IT ranks, either by hiring people committed to the business, not just the technology, or by allowing employees to experience work life in other departments in the enterprise. You'll know that you've achieved alignment when the business, not IT, drives the business.

3 FIND THE RIGHT PARTNER. ABB chose Oracle products for their scalability and performance, according to Lustig. "Oracle was at the right level, from our point of view; we could fine-tune its performance, yet it's not as complex as IBM DB2 or other solutions."

The Vehicle and Operator Services Agency (VOSA), the U.K. government department for transport, looked to Oracle when creating a data warehouse, dubbed the Business Information System, which it mines for information about potentially noncompliant vehicle operators. VOSA was already operating on an Oracle platform and using Oracle applications for many of its operation systems, so Oracle's business intelligence tools seemed the best solution for data standardization, notes Peter Davies, a senior developer with the agency.

Oracle was also the solution for Woolwich Independent Financial Advisory Services (WIFAS, now a division of Barclays Financial Planning), because of an "almost immediate corporate cultural match," according to Alan Keegan, WIFAS commercial director. Oracle consultants were "pragmatic and interested in our business." Additionally, Oracle offered rapid application development, "which brought our Sales Management and Referral Tracking [SMART] system to life very nicely," he says. Also, Oracle's flexible, scalable products could be enhanced year after year, allowing WIFAS to build the system it had ultimately envisioned.

Researchers at Xerox chose Oracle Data Mining for functionality and ease of use, thanks in part to the Data Mining for Java add-in. Because Data Mining algorithms and the data are housed together in the Oracle database, "we don't have to move huge data sets to external programs to run the algorithms and learn something about our data," explains Thieret. "The fact that it cost about 75 percent less than the leading competitor didn't hurt either."

4 RECRUIT A STRONG TEAM. As an enterprisewide endeavor, data warehousing and mining shouldn't start and end in the far corner of the IT department. The tools should be easy and fast enough to use that your data analysis team can expand its scope beyond the technology. Keep it small, though, says Thieret—four to seven people, ideally. Include one or two domain experts who have a deep understanding of your products and services, someone who understands data-mining algorithms (seek a resource from your local university if need be), a senior management champion or someone who has the ear of senior management, and a communicator par excellence.

Keegan of WIFAS recruited a team of seven people and had them undergo training on Oracle Developer through Oracle Consulting. He then paired team members up one by one with Oracle's consultants. Keegan was paired with an Oracle project manager who was "instrumental in making sure we stayed on track," and they developed SMART together. The group sought input from business users across the firm, including the finance director, who designed SMART's finance screens; the sales director, who designed the sales reports; and an employee from sales support, who designed the business-entry screens.

5 PREPARE, PREPARE, PREPARE. "As with lots of things in science, preparation takes 80 percent of the time," says Xerox scientist Thieret. For ABB, preparation entailed the unique identification of global customers and suppliers so ABB could see a full view of each, locally and globally. Standardization of common codes—address codes, product codes, and the like—became a critical issue, explains Lustig. So much so that ABB dedicated a team and a database, the corporate master data service, to common coding. In some instances, the team found discrepancies in areas such as postal codes, entered with and without an International Organization for Standardization (ISO) country code; city spellings that varied, such as Vienna and Wien; characters and abbreviations such as São José and Sao Jose or St. used for both street and Saint; and data field content deviations such as city and postal code entered in the same field.

After standardizing addresses across 60 systems, according to Lustig, ABB found cases in which one master customer record corresponded to more than 30 entries across the different instances. Next, ABB used two third-party data-cleansing services for records that couldn't be matched automatically. Finally, staff from 10 countries collaboratively and manually cleansed data, with an eye for anomalies technology might miss. Lustig, for example, spotted some four-digit eastern German postal codes and knew they were missing a leading zero. "Nobody is as deeply in contact with customers and suppliers as the locals," he says. "You need that local know-how."
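The kinds of discrepancies Lustig describes lend themselves to simple normalization rules. The following Python sketch is not ABB's corporate master data service. The lookup tables, the function names, and the assumed "DE-" country prefix format are hypothetical, chosen only to mirror the examples in the text (Wien and Vienna, São José and Sao Jose, the missing leading zero).

```python
import unicodedata

# Hypothetical lookup tables; a real master-data service would hold far more.
CITY_ALIASES = {"wien": "Vienna"}
POSTAL_CODE_LENGTH = {"DE": 5}  # German postal codes have five digits

def strip_accents(text):
    """Fold accented characters to their base letters, e.g. 'São' -> 'Sao'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def normalize_city(city):
    """Map spelling variants of a city to one canonical name."""
    folded = strip_accents(city).strip()
    return CITY_ALIASES.get(folded.lower(), folded)

def normalize_postal_code(code, country):
    """Drop an assumed 'CC-' country prefix and restore lost leading zeros."""
    code = code.strip().upper()
    prefix = country + "-"
    if code.startswith(prefix):
        code = code[len(prefix):]
    width = POSTAL_CODE_LENGTH.get(country)
    if width and code.isdigit() and len(code) < width:
        code = code.zfill(width)
    return code

print(normalize_city("Wien"))                 # Vienna
print(normalize_city("São José"))             # Sao Jose
print(normalize_postal_code("DE-4109", "DE")) # 04109
```

Rules like these catch only the patterns someone has already thought of, which is why the manual pass by local staff that Lustig describes remains necessary.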

6 GET TO KNOW YOUR DATA. "You have to have a gut feel for your data," explains Malliaris of Loyola University. "That means talking to people who use it and asking them for the real story: 'When you look at this field, what's it really telling you?' or 'When you look at two customers, how do you decide who's a good one and who's not?'"

Malliaris, who uses data mining for stock forecasting, says one thing she learned quickly was the importance of derived values. In her own work, the percentage change in price, a derived or calculated value, can be more telling than the price itself. She's also learned the importance of another derived value, the number of nights a market is closed locally. "Traders like to trade. When there's a holiday and markets are closed for more than one night, their behavior on opening morning is different. Data miners have to look at behavior and be able to extract some calculated field that will influence the way their models are developed."
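Both derived values Malliaris mentions can be computed from nothing more than dates and closing prices. The Python sketch below uses made-up prices. It also assumes that "nights closed" means the number of calendar nights between consecutive trading sessions, which is one reading of her description rather than her stated formula.

```python
from datetime import date

# Hypothetical closing prices on consecutive trading days.
closes = [
    (date(2004, 1, 15), 100.00),  # Thursday
    (date(2004, 1, 16), 102.00),  # Friday
    (date(2004, 1, 20), 99.96),   # Tuesday, after a weekend and a Monday holiday
]

def derive_features(closes):
    """Turn raw (date, close) rows into derived fields for a model."""
    rows = []
    for (prev_day, prev_close), (day, close) in zip(closes, closes[1:]):
        rows.append({
            "date": day,
            # Percentage change from the previous session's close.
            "pct_change": (close - prev_close) / prev_close * 100,
            # Calendar nights elapsed since the previous session.
            "nights_closed": (day - prev_day).days,
        })
    return rows

features = derive_features(closes)
for row in features:
    print(row["date"], round(row["pct_change"], 2), row["nights_closed"])
```

Here the Friday row shows a 2 percent gain after one night, and the Tuesday row shows a 2 percent loss after four nights. The second field is exactly the kind of context a model cannot see in the raw price.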

Similarly, when VOSA stops a vehicle operator for a missing taillight, the infraction is more serious than a missing mud flap but not as dangerous as faulty brakes or steering. So Davies and his team assigned weights to the data, depending on the severity of defects. And if a driver comes to an inspection facility annually with a vehicle that consistently has faults, the weight is even higher. In effect, operators receive a road safety score for their vehicles. The higher the score, the more noncompliant the operator.
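A weighted score of this kind is straightforward to express in code. The sketch below is not VOSA's actual model. The article gives only the ordering of severities (mud flap, then taillight, then brakes or steering), so the numeric weights and the multiplier for repeat faults are invented for illustration.

```python
# Hypothetical severity weights; only their ordering comes from the article.
SEVERITY_WEIGHT = {"mud_flap": 1, "taillight": 3, "brakes": 10, "steering": 10}

# Assumed uplift when the previous annual inspection also found faults.
REPEAT_MULTIPLIER = 2

def operator_score(roadside_defects, annual_inspections):
    """Higher score means a less compliant operator.

    roadside_defects: defects found at random roadside stops.
    annual_inspections: one list of defects per annual inspection, oldest first.
    """
    score = sum(SEVERITY_WEIGHT[d] for d in roadside_defects)
    faulty_last_time = False
    for defects in annual_inspections:
        multiplier = REPEAT_MULTIPLIER if faulty_last_time else 1
        score += multiplier * sum(SEVERITY_WEIGHT[d] for d in defects)
        faulty_last_time = bool(defects)
    return score

# One roadside taillight fault, then faults at two inspections in a row.
print(operator_score(["taillight"], [["mud_flap"], ["brakes"]]))  # 24
```

In this example the brake fault counts double because the previous inspection also found a fault, so the total is 3 + 1 + 20 = 24.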

7 SUSTAIN ENTHUSIASM. VOSA began its warehousing initiative with an early proof of concept by consulting end users to determine which reports and data they relied on. Then things "went quiet" for a while, explains Davies, while he and his team got the warehouse, the Business Information System, up and running. There was another flurry of enthusiasm when Oracle Portal was introduced and users gained laptop access to portlets customized for their geographic area of responsibility and job function. "It's a juggling act," says Davies. "Some users don't want to see anything until they can have 100 percent of the functionality, but that can take 12 months." Yet his team could provide 80 percent in one or two months. His suggestion? "Roll out 80 percent and then add more portlets and functionality at regular intervals to keep users enthused with more and more information."

Others, such as WIFAS, opted to roll out their system all at once. "If you've got to measure results, one thing your finance director doesn't want is to have half the company on one system and half on another," notes Keegan. "We were confident SMART would work, because those who would be users had tested it. For us, there was less risk to doing a big bang than opting for an incremental rollout." The system still keeps internal users and customers excited with continual enhancements. One such customer perk is My Portfolio, which lets customers compare investments and move funds among their accounts online. Soon they'll be able to access instant online valuations of all their investments with Barclays, regardless of the product, a feature that will further cement the company's relationship with its clients. The more interaction customers have with their products, the better, explains Keegan. "Interaction increases the popularity of the products and the amount of satisfaction customers get from them, so their propensity to invest more increases, too." Case in point: The average number of policies per customer has already increased by 29 percent as a result of SMART.

8 FIND MEANING. Once you start analyzing the data in your warehouse and conducting data mining experiments, you're likely to find correlations among variables you hadn't previously thought would have any connection. And you might discover correlations that, despite apparent statistical significance, turn out to be false. "Anytime you do exercises such as data mining, where you're not starting out with a hypothesis but rather are looking for associations, you'll find some spurious results," explains Peter Lenk, an associate professor of statistics and marketing at the University of Michigan Business School. "Even when there's nothing there, 5 to 10 percent of correlations may look important. It's inherent in the methodology."
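Lenk's point is easy to reproduce with pure noise. The Python sketch below generates unrelated random columns and counts how many pairs pass a conventional 5 percent significance test. The column count, row count, seed, and the use of the Fisher z approximation are choices made for this example, not anything prescribed in the article.

```python
import itertools
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(7)  # fixed seed so the run is repeatable
N_ROWS, N_VARS = 100, 40

# Forty columns of independent noise: by construction, nothing is related.
columns = [[rng.gauss(0, 1) for _ in range(N_ROWS)] for _ in range(N_VARS)]

# Two-sided 5 percent cutoff for a standard normal statistic.
z_crit = statistics.NormalDist().inv_cdf(0.975)

def looks_significant(r, n):
    # Fisher z-transform: atanh(r) * sqrt(n - 3) is approximately
    # standard normal when the true correlation is zero.
    return abs(math.atanh(r)) * math.sqrt(n - 3) > z_crit

pairs = list(itertools.combinations(columns, 2))
false_hits = sum(looks_significant(pearson(a, b), N_ROWS) for a, b in pairs)
share = false_hits / len(pairs)

print(f"{false_hits} of {len(pairs)} unrelated pairs look significant ({share:.1%})")
```

With a 5 percent threshold, roughly one pair in twenty should clear the bar by chance alone. The more pairs you test, the more of these false leads you collect.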

So how can you distinguish meaningful results from statistical blips? "That's an ad hoc process," says Lenk, one that involves asking, "Can I make a business story out of it?" or "Does this make sense, based on what I see every day?" The best indicator, he explains, is whether the findings have any predictive ability.
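One common way to test predictive ability is to fit a relationship on part of the data and score it on records it has never seen. The Python sketch below does this with a simple straight-line fit on synthetic data. The function names, the 50/50 split, and the skill measure are assumptions for illustration, not a method Lenk describes.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def holdout_skill(xs, ys, train_fraction=0.5):
    """Share of holdout squared error removed, compared with always
    guessing the training mean. Near 0 means no predictive ability."""
    cut = int(len(xs) * train_fraction)
    slope, intercept = fit_line(xs[:cut], ys[:cut])
    baseline = statistics.fmean(ys[:cut])
    model_err = sum((y - (slope * x + intercept)) ** 2
                    for x, y in zip(xs[cut:], ys[cut:]))
    base_err = sum((y - baseline) ** 2 for y in ys[cut:])
    return 1 - model_err / base_err

rng = random.Random(11)  # fixed seed so the run is repeatable
xs = [rng.gauss(0, 1) for _ in range(400)]
real_ys = [2 * x + rng.gauss(0, 1) for x in xs]  # genuinely depends on x
noise_ys = [rng.gauss(0, 1) for _ in xs]         # unrelated to x

skill_real = holdout_skill(xs, real_ys)
skill_noise = holdout_skill(xs, noise_ys)
print(f"real relationship: {skill_real:.2f}, statistical blip: {skill_noise:.2f}")
```

The rows here are generated in random order, so splitting by position is safe. With real records, shuffle first or split by time so that the holdout set reflects data the model would actually face.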

At Xerox, Thieret gave his staff and a summer intern, a Ph.D. candidate in data mining, 20 hypotheses to begin testing. "Between SQL and data mining queries, we're finding interesting results every two to three days," he notes. "But what we've got to understand is which correlations really tell us what's going on in terms of service for our machines—what's going to break when? What additional data do we need? Is it worthwhile to add another diagnostic sensor to a device?"

9 GIVE BACK. You might find "interesting factoids," says Lenk, but if you share that information without suggesting how it might be actionable, "people will get tired of hearing about it." At Xerox, Thieret shared some of his early findings with postsale business and operations strategists to spark conversations about how to incorporate important results into the operations.

At VOSA, roadside vehicle examiners can now run reports wherever they need to. "By bringing up data from our enforcement and testing databases [the former captures information during random roadside stops and the latter from annual, scheduled vehicle inspections], inspectors come up with a complete picture of each operator," explains Davies. In the past, they relied on multiple monthly Lotus spreadsheets, sent via e-mail from the central office. The feedback has been positive. "Vehicle examiners know who their noncompliant operators are. Our work has confirmed what inspectors may have thought but never had the statistical proof to support. It gives them the basis for developing a structured approach to help the operators improve."

10 MEASURE. VOSA's efforts are still in progress; there are plans for additional users to access the Business Information System through the agency's Web browser within the next year. And not all of the agency's data has been added to the warehouse. Consequently, some reports are still produced by a few users who bring data together into Excel spreadsheets that are published to the warehouse as content. Still, the initiative has already paid off. Nearly a dozen systems feed into the warehouse, improving the agency's targeting of noncompliant operators and allowing it to monitor progress against targets set by the British government, targets that VOSA must meet in order to receive maximum funding for enforcement work.

WIFAS has already seen a payoff from its efforts. It launched SMART in 1999 and has achieved a 260 percent return on its investment. Its customer base has tripled, and the number of financial advisers supported by the same sales support staff has more than doubled. "Not only wasn't I laying people off, but we were even increasing our number of new-business writers," Keegan notes. He expects to almost double the number of advisers, to nearly 800, by the end of 2003, and SMART is operating on a small scale in Luxembourg as well. The number of financial advisers assigned to each manager has increased, thanks to SMART's ability to provide monthly management reports within minutes instead of the three hours it used to take. The system has given WIFAS a competitive advantage during the flat market of the past few years, adds Keegan. "We've done better than our competitors because we've gone back to customers who've demonstrated a propensity to invest with us. We've kept their faith, so to speak."

"Customer satisfaction is the key observable," notes Thieret of Xerox. "For us, the most significant component of satisfaction is device availability versus downtime. Predictive diagnostics increase availability for customers and save money for Xerox by optimizing service technician routes." And replacing parts at just the right time maximizes a machine's useful life. "Even a 10 percent improvement in field life means big bucks," says Thieret.

Lustig expects that ABB's Global Identification Service will easily reap benefits 10 times as great as its cost. "I'm satisfied with that," he says, "but we're aiming for more." Some time savings have already been realized. "If an account manager used to take one day a month to get a complete overview of his customers and now can do it with one button click, that saves a lot of time." And now that all international suppliers' local sites are linked to unique master records, supply chain managers can see when the company is paying two different prices for the same product from the same source.
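Once local supplier records point to a single master record, spotting inconsistent pricing becomes a simple grouping exercise. The Python sketch below uses invented identifiers, products, and prices, and it is not ABB's Global Identification Service. It shows only the general idea of grouping purchases by master supplier and product.

```python
from collections import defaultdict

# Hypothetical purchase records: (local_supplier_id, product_code, unit_price).
purchases = [
    ("CH-0042", "MOTOR-7", 118.00),
    ("DE-1187", "MOTOR-7", 131.50),
    ("DE-1187", "CABLE-2", 4.20),
    ("BR-0310", "CABLE-2", 4.20),
]

# Hypothetical links from local supplier records to unique master records.
MASTER_ID = {"CH-0042": "M-001", "DE-1187": "M-001", "BR-0310": "M-002"}

def price_discrepancies(purchases, master_id):
    """Return (master supplier, product) pairs bought at more than one price."""
    prices = defaultdict(set)
    for local_id, product, price in purchases:
        prices[(master_id[local_id], product)].add(price)
    return {key: sorted(found) for key, found in prices.items() if len(found) > 1}

print(price_discrepancies(purchases, MASTER_ID))
# {('M-001', 'MOTOR-7'): [118.0, 131.5]}
```

Without the master mapping, the Swiss and German records would look like two unrelated suppliers, and the price gap on the same motor would go unnoticed.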

But the real bottom line for Lustig involves little computation. "It's not a question of whether it pays back but whether you can do business the way you have for the past 10 years, in light of new challenges. We simply cannot do business without this. It's like using a calculator—no one will stop you and ask you why you're not using paper and pencil."

Kimberlee Roth also contributes to the Chicago Tribune and Pepper and Rogers Group's INSIDE 1to1.

Intelligent Products

Use these Oracle business intelligence (BI) products to enlighten and inform almost every operational, tactical, and strategic decision you make.

Oracle Application Server Discoverer 10g
Discoverer is an intuitive ad hoc query, reporting, analysis, and Web-publishing tool that gives users at any level in your enterprise immediate access to information from data marts, data warehouses, online transaction processing systems, and Oracle E-Business Suite applications.

Oracle Business Intelligence Beans
Oracle Business Intelligence Beans is a set of standards-based JavaBeans that provide analysis-aware application building blocks for building, extending, and customizing BI applications quickly and easily.

Oracle Data Mining 10g
Oracle Database provides embedded data mining capabilities to help decision-makers quickly sift through massive amounts of corporate data to uncover hidden patterns and predict potential outcomes. Oracle Data Mining 10g expands the range of mining functionality to include support for new domains including text mining and bioinformatics.

Oracle Database 10g
If you're using Oracle Database for transaction processing, it's also the best database for your data warehouse and analytical processing. No matter what your sources are, Oracle Database increases data quality and the speed at which you access, analyze, and share information. Oracle Database 10g is the first database designed for enterprise grid computing, the most flexible and cost-effective way to manage enterprise information by using low-cost hardware and clustering servers together to act as a single, large computer.

Oracle Locator and Oracle Spatial
Oracle Locator stores, indexes, and queries location relationships and location content (assets, buildings, roads, land parcels, sales regions) in an Oracle database. Oracle Spatial adds advanced spatial information management features such as linear network data models, topology, GeoRaster data, and built-in geocoding.

Oracle Reports 10g and Application Server Portal 10g
With Oracle Reports, your developers create high-quality reports from any data source and distribute the results in any format with security and scalability. Oracle Portal offers a wizard-based environment for creating portal Web interfaces for efficient BI access and distribution.

Oracle Warehouse Builder 10g
Consolidate data from diverse data sources with Oracle Warehouse Builder, an enterprise BI integration design tool that manages the full lifecycle of data and metadata. Warehouse Builder features click-and-drag mappings, wizard-driven user interfaces, a library of predefined transformations, and embedded data quality features to ensure that the information managed in the data warehouse is complete and reliable.
