In an industry first, Oracle makes lakehouse performance identical to database query performance
New 500 TB TPC-H* benchmarks demonstrate 17X faster query performance than Snowflake, 36X faster than Google BigQuery, and 17X faster than Databricks
Austin, Texas, July 20, 2023
Oracle today announced the general availability of MySQL HeatWave Lakehouse, delivering an industry first by enabling customers to query data in object storage as fast as querying data inside the database. MySQL HeatWave Lakehouse supports a variety of object store file formats such as CSV, Parquet, and export files from other databases, and can combine object storage file data and MySQL database transactional data together in the same query. Object store files are queried directly by HeatWave without copying the data into the MySQL database. As a result, MySQL HeatWave Lakehouse sets new standards for scalability and performance for query processing, speed of loading data, cluster provisioning time, and automation to query data in object storage.
“More than 80 percent of data is stored in file systems and that number is growing. Customers want to integrate and analyze this varied external data with their internal transactional data, but it’s often too complex or too expensive to process,” said Edward Screven, chief corporate architect, Oracle. “MySQL HeatWave Lakehouse makes it easy for customers to get valuable real-time insights by combining their data in object storage with database data while gaining significantly higher query performance and much faster data loading at a lower cost.”
As demonstrated by a 10 TB TPC-H* benchmark, querying data in object storage in popular file formats with MySQL HeatWave Lakehouse is as fast as querying data in the MySQL database. This is made possible by MySQL Autopilot, a built-in capability of MySQL HeatWave that provides machine learning-powered automation, which learns from the execution of queries and improves the execution plan of future queries. MySQL Autopilot is an innovation in MySQL HeatWave that is not available anywhere else. MySQL HeatWave on Oracle Cloud Infrastructure (OCI) is powered by AMD EPYC™ processors.
“The AMD and MySQL HeatWave engineering teams are closely collaborating to optimize MySQL HeatWave for AMD EPYC processors to take advantage of new processor capabilities,” said Forrest Norrod, executive vice president and general manager of the Data Center Solutions Business Group, AMD. “Thanks to this collaboration, MySQL customers running MySQL HeatWave on AMD EPYC CPU-powered OCI instances benefit from an outstanding price performance advantage for their business-critical workloads, including real-time analytics on massive amounts of data stored in object storage.”
As demonstrated by a 500 TB TPC-H* benchmark, the query performance of MySQL HeatWave Lakehouse is 17X faster than Snowflake, 36X faster than Google BigQuery, and 17X faster than Databricks.
The performance to load data from the object store with MySQL HeatWave Lakehouse is:
MySQL HeatWave’s unrivaled performance is a result of its scale-out architecture that enables massive parallelism to provision the cluster, load data, and process queries with up to 512 nodes. In addition, enhancements to MySQL Autopilot automate metadata creation for object files and dynamically adapt to the performance of the underlying object store to provide the best performance in any OCI region.
MySQL HeatWave is the only cloud service that provides transaction processing, real-time analytics, machine learning, data lake querying, and machine learning-based automation within a single MySQL database service. A core part of Oracle’s distributed cloud strategy, MySQL HeatWave is available in OCI, natively on Amazon Web Services, as part of the Oracle Database Service for Azure, and in customers’ data centers with OCI Dedicated Region.
“Data is growing exponentially and so is the amount of data we store in our data lake. The ability to use standard MySQL syntax to query data across our database and object storage to get real-time insights is very important for Natura,” said Fabricio Rucci, solution architect, Natura&Co. “This opens up new opportunities to explore and could represent new competitive advantages if we can analyze all this data faster than our competition.”
“HeatWave Lakehouse scales out very well for loading data from object storage and for running queries on object store,” said Henry Tullis, leader, Cloud Infrastructure and Engineering, Deloitte Consulting. “The load time and the query times are nearly constant as the size of the data grows and the HeatWave cluster size grows correspondingly. This scale out characteristic of HeatWave Lakehouse for data management is key to efficiently processing very large amounts of data.”
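The scale-out behavior described above can be illustrated with a toy model. This is a minimal sketch, not a description of how HeatWave works internally: it assumes work is split evenly across nodes with no coordination overhead, and the function name, per-node throughput, and data and cluster sizes are all hypothetical values chosen for illustration.

```python
def ideal_load_time_s(data_tb: float, nodes: int,
                      per_node_tb_per_s: float = 0.001) -> float:
    """Idealized load time when data is split evenly across nodes.

    per_node_tb_per_s is a made-up throughput, not a measured figure.
    """
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return data_tb / (nodes * per_node_tb_per_s)


# Grow the data and the cluster by the same factor (hypothetical sizes).
for factor in (1, 2, 4, 8):
    data_tb = 10 * factor
    nodes = 8 * factor
    seconds = ideal_load_time_s(data_tb, nodes)
    print(f"{data_tb:>3} TB on {nodes:>2} nodes: {seconds:.0f} s")
```

In this idealized model every row prints the same time, because the data size and the node count grow by the same factor. A real system adds overhead, which is why the quote says "nearly constant" rather than constant.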
“It has been a given since Big Data has been around, that Big Data / Lakehouse queries are substantially slower than transactional queries,” said Holger Mueller, vice president and principal analyst, Constellation Research. “MySQL HeatWave ends that once and forever, demonstrating that Lakehouse performance can be identical to transaction query performance—unheard of and even unthinkable. With query performance parity, HeatWave allows CxOs to stop worrying where to put data and how to query it. The ‘secret sauce’ is HeatWave’s Autopilot that optimizes the queries. Once again, the HeatWave team has delivered an industry first.”
*Disclaimer: Benchmark queries are derived from the TPC-H benchmarks, but results aren’t comparable to published TPC-H benchmark results since these don’t comply with the TPC-H specifications.
Oracle offers integrated suites of applications plus secure, autonomous infrastructure in the Oracle Cloud. For more information about Oracle (NYSE: ORCL), please visit us at www.oracle.com.
Oracle, Java, MySQL and NetSuite are registered trademarks of Oracle Corporation. NetSuite was the first cloud company—ushering in the new era of cloud computing.
AMD, the AMD Arrow logo, EPYC, and combinations thereof are trademarks of Advanced Micro Devices, Inc.