By Jeff Erickson | June 2021
Developers reach for Oracle Autonomous Data Warehouse and Snowflake with the same goal in mind: to create a place to consolidate data from multiple sources so they can analyze it together. Both services run natively in the cloud, so there’s no hardware or software to install, configure, or maintain on in-house servers. Both trade on being easy to set up and use.
The similarities end there. While Snowflake Inc., operational since 2014 and publicly held since 2020, has been successful in a relatively short time for doing one thing very well, Oracle has been building capabilities into its converged database for decades based on technology trends and requests from its customers worldwide. It’s when a data architecture starts getting complex that the differences between the two start to show. Here are a couple of examples.
Snowflake, which runs on Amazon Web Services and Microsoft Azure, doesn’t support online transaction processing (OLTP).* As a result, every data point analyzed in Snowflake originates in a different database.
Why is that important? Because the data originates elsewhere, it has to be replicated, extracted, transformed (another word for massaging the data to conform to Snowflake’s data warehouse structure), and finally loaded into Snowflake’s data warehouse. This extract, transform, load (ETL) process requires significant investments in people, third-party software tools, and third-party services. But most of all, ETLs take a lot of time—so the data being analyzed might no longer be up to date.
“Their whole design is around inserting data only, which ignores the fact that in the real world data actually changes,” says George Lumpkin, Oracle VP of product management for data warehousing. “With Snowflake, I can do a simple read-only data warehouse with large queries and a small number of users. And I could use an Oracle Autonomous Database for that, too. But then you get to enterprise requirements, with thousands of users and really complicated workloads with lots of updates and real-time analytics, and you’ve gone way beyond the kind of vanilla data warehouse workloads Snowflake is designed for.”
“When you get to enterprise requirements, with thousands of users and really complicated workloads with lots of updates and real-time analytics, and you’ve gone way beyond the kind of vanilla data warehouse workloads Snowflake is designed for.”
Lumpkin’s advice: Get an easy-to-use cloud data warehouse, but keep complex future requirements in mind.
Snowflake also claims a mantle of “cloud native,” but their architecture echoes the same old hardware challenges of on-premises data centers. Snowflake provides compute resources in building blocks of 1, 2, 4, 8, 16, 32, 64, and 128 nodes.* For its customers, this means that if a workload grows from 16 nodes to 18 nodes, they have to buy 32 nodes. Suddenly they’re spending double even though they needed just 12.5% more capacity. “That’s not the kind of elasticity people move to the cloud for,” Lumpkin says.
OUTFRONT Media uses Oracle Autonomous Data Warehouse to combine information on more than 500,000 digital and static billboards. In just minutes, the company loads and merges terabytes of data and securely publishes interactive dashboards.
Oracle Autonomous Data Warehouse runs on Oracle Cloud Infrastructure and scales up or down precisely as needed—“if you have 16 CPUs to run a workload and need just two more CPUs to meet demand, you simply add two CPUs,” he says. Customers can also set the CPU count to automatically scale up when workloads increase, then scale down when no longer needed.
Oracle’s data management platform also benefits from having grown up before cloud companies could throw compute and storage at problems, Lumpkin notes. As a result, Oracle used its resources “very carefully,” optimizing the platform to get as much as possible out of the CPUs and memory, he says. “We’re able to get really solid performance, and we're not passing along the huge cost of hardware to the customers,” Lumpkin says.
To those two key differences—support for transactions and elastic scalability—let’s add one more thing: When it comes to ease of use for developers and IT teams, Autonomous Data Warehouse delivers more advantages of automated provisioning, configuring, securing, tuning, scaling, and backing up of the data warehouse—eliminating nearly all the manual and complex tasks that can introduce human error. And it’s available in both the Oracle Cloud and customers' data centers with Oracle Cloud@Customer, which Snowflake can’t match.
*See Analyst Mark Staimer’s April, 2021 paper on Snowflake and the Autonomous Data Warehouse
“…Snowflake is a pure data warehouse. It cannot access directly the data in transactional relational databases that represents 80% of the database market. It has to access this data from non-Snowflake sources.”
“…DBAs need to match their workloads to the kind of cluster that meets each workload needs. A cluster can have different fixed hardware sizes of 1 (X-Small), 2 (Small), 4 (Medium), 8 (Large), 16 (X-Large), 32 (2X-Large), 64 (3X-Large), or 128 (4X-Large) nodes.”