Sensitive supply chains and a data deluge from sensors and phones have analyst teams turning to big data warehouses in the cloud.
By Aaron Ricadela | February 2021
The global mining industry opens 2021 in a squeeze. Mining companies’ profits dropped last year as the coronavirus crisis pared production and forced them to cut staff. At the same time, governments and investors are pressing them to reduce carbon emissions, an often costly endeavor.
A Canadian technology company called MineSense is arming global mines, such as a giant Peruvian pit in the Andes owned by two of the world’s largest operators, with X-ray sensors and computers on electric shovels that can instantly analyze the quality of copper, zinc, iron, or other metals dug from the earth. MineSense’s system then funnels the data to machine learning models that work with Oracle Autonomous Data Warehouse in the cloud to help operators better sort ore from waste. They can pinpoint the location of resources they may have missed during exploration, improving subsequent dig plans and saving electricity, water, and money.
“In the past, to increase production, mines just put in bigger machines and moved more dirt,” says MineSense Chief Data Officer Frank Hoogendoorn. By examining every single bucket, MineSense gives operators a more granular look at the amount of metal contained in the rock. Those insights can drive huge decisions for mine operators. “Some clients are looking at extending their mine life,” he says. “Our technology also fits into reducing emissions and sustainability. Those are big concerns, too.”
Hoogendoorn says he’s looked at data warehouses from other vendors, but considers Oracle’s stewardship of its customers’ data superior. Oracle technology is also helping MineSense statisticians shrink time to insight from minutes to seconds. Speed and security will matter as the Vancouver company looks to expand, including in South America this year, where rising copper prices are fueling demand for technologies that can increase production. “We take our role as data custodians very seriously,” he says. “I’m pretty bullish on Oracle as many companies that have on-premises stuff want to move their data warehouses to the cloud.”
In a pandemic economy in which businesses need to keep a close eye on sales amid volatile demand, track easily disrupted supply chains, and use machine learning to make decisions about reams of data pouring in from social media and networked vehicles and machines, data warehouses are back in vogue. The information-slurping systems tap databases, the web, business applications, and sensors to gather data in one place, giving companies a clearer view of operations and sales.
At the largest social media and web companies, data warehouses have ballooned to several exabytes—the equivalent of a few million laptop hard drives. Even among more conventional users, petabyte-size data warehouses are increasingly common. Buyers of cloud-based data warehouses are often hands-on users who specialize in analysis and AI, prize speed and low cost, and work in finance, operations, or marketing—not in IT. Cloud computing versions of the systems are forecast to grow about 10 times faster than on-premises software in coming years, drawing new battle lines among tech companies.
“What’s changing is you have business teams creating data warehouses in the cloud—they don’t need to buy hardware up front, and it’s easier to keep growing,” says Richard Winter, CEO of WinterCorp, an independent consultancy that specializes in advising businesses on buying and optimizing data warehouses. “That takes a big obstacle away.”
As new competitors emerge, Oracle is pressing advantages on price, ownership cost, and “elasticity”—the ability of its Autonomous Data Warehouse to precisely add processing power to a workload when demand spikes, then take it away without leaving the customer stuck with unneeded capacity.
Sales of cloud computing versions of data warehouses and related tools are poised to grow by nearly 18% globally this year, and by 29% to $14.8 billion in 2022, according to market researcher IDC. “All the growth is in the cloud,” says IDC Software Analyst Philip Carnelley. Sales of data warehouse platforms that run in companies’ on-premises data centers are also poised to grow again to $15 billion by 2022, after two years of stasis.
Data warehouses have thrived since their rise in the 1990s by letting businesses distill petabytes of information for reports and more nuanced decisions. Insight into profitable customer segments, for example, can’t be gleaned by looking solely at sales: Costs incurred from suppliers, service, and returns also play a role. The systems can also be used to hunt fraudulent transactions at banks, predict when machines need maintenance, and research the spread of the coronavirus by combining test results with clinical notes and other information.
Cloud computing power is poised to open analytical opportunities to more users. Analysis teams are crafting more sophisticated queries as they consume live data from computers and phones to make decisions such as how much office space they need and where. 5G networks are relaying data about vehicles and factories from always-on sensors. And companies are reacting to market changes in real time, not based on data that’s hours or days old.
“These people care about, ‘I have this data; how fast can I load it and get it ready to run queries?’” says Rama Balaji, a master principal sales consultant at Oracle. “Talking about architectural detail is actually a negative thing.”
Oracle was the leader in relational data warehouses with 33% share in the first half of 2020, followed by Microsoft with 29% share, and IBM with 12%, according to IDC. Newcomers Amazon Web Services, Google, and Snowflake, which raised $3.4 billion in a September initial public offering, each had 3%. Databricks, another data warehouse company, is preparing an IPO that could come in the first half of next year.
Silicon Valley’s Snowflake in particular has stressed its customers’ ability to scale computing capacity up and down as needed, its availability on three different public clouds, and a data marketplace that lets companies publish datasets others can buy and analyze right within the product.
In turn, Oracle points out that Snowflake customers need to specify predetermined “virtual warehouse” sizes and experiment with combinations of queries and computing capacity to get the best results. To scale up or out, Snowflake’s architecture makes users jump to the next cluster size, or spin up more clusters of the same size. That can run up cloud computing bills, Balaji says. “In today’s world, customers are very cost-sensitive,” she says.
Capacity planning may be less important for standard business-hours reporting, but it’s invaluable to manufacturers, retailers, and airlines that need to bring processors online immediately for peak computing loads like end-of-year production, sales, or booking spurts. “Oracle is among the best solutions in that way,” says analyst Winter. “If you have a workload that fluctuates a lot, then elasticity can save you money. These are powerful ideas, and they will be important to certain customers.”
What’s available as part of a base data warehouse subscription also impacts ownership costs. Snowflake customers can add thousands of dollars per user to their yearly bills through extra licenses required for analysis; reporting; and so-called extraction, transformation, and loading tools, according to a July study by market researcher Pique Solutions, sponsored by Oracle. With Oracle, all that is built in, as are tools for machine learning and graph databases, which can plot relationships among people and concepts, the study concluded.
“You want to be able to make decisions quickly across your enterprise,” says Keith Hoang, vice president of competitive marketing at Oracle. “You don’t need to bring in tools from other places to do your reporting.”
The Pique report, which polled a dozen businesses and IT service providers involved in data warehouse implementations, also found Oracle Autonomous Data Warehouse more mature than Snowflake in data security features, unstructured data support, and data migration. Oracle sales teams point to Snowflake clusters’ limits on the number of parallel queries users can run.
Oracle’s database also supports persistent memory technology that keeps processors fed with information at near-RAM speeds while retaining data, like a hard disk, when a server is powered off.
A decade ago, tech companies including Oracle started helping customers to expand their data warehouse capabilities by adding “big data” Hadoop systems that could collect unstructured data such as emails and social media postings and load it into relational databases.
More recently, so-called data lakes provided a convenient way for users to file information without worrying too much about its structure, taking advantage of cheap cloud storage. But data lakes proved persnickety when it came time to do analysis.
Oracle Autonomous Data Warehouse can support structured, unstructured, and semi-structured data. It offers customers the choice of running as a typical cloud computing service or in their own data centers on Oracle Exadata via the Cloud@Customer model, with cloud advantages such as pay-for-use pricing.
“Snowflake is a decent product, a good product, I think, and it’s just killing Redshift over at Amazon,” Oracle Chairman and Chief Technology Officer Larry Ellison said during the company’s earnings call in December. “But it doesn’t remotely compare to Oracle Autonomous Database.”
Photography: MineSense and aranozdemir/Getty Images