Increases MySQL performance by orders of magnitude for analytics and mixed workloads. Eliminates the need for a separate analytics database, separate machine learning (ML) tools, and extract, transform, and load (ETL) duplication.
Oracle MySQL HeatWave is the only MySQL cloud service with a built-in, high performance, in-memory query accelerator—HeatWave. It increases MySQL performance by orders of magnitude for analytics and mixed workloads, without any changes to current applications. With HeatWave enabled, MySQL HeatWave is 6.5X faster than Amazon Redshift at half the cost, 7X faster than Snowflake at one-fifth the cost, and 1,400X faster than Amazon Aurora at half the cost. Customers run analytics on data that’s stored in MySQL databases without a separate analytics database and ETL duplication.
With MySQL HeatWave ML, developers and data analysts can build, train, deploy, and explain machine learning models in MySQL HeatWave, without moving data to a separate machine learning service. Benchmarks demonstrate that, on average, HeatWave ML produces more accurate results than Amazon Redshift ML, trains models up to 25X faster at 1% of the cost, and scales as more nodes are added.
Tamara, a fintech startup from Saudi Arabia, moved its database workloads to MySQL HeatWave for 3X greater performance and 60% lower costs than with another cloud provider. The company has grown its client base to more than two million users and onboarded 3,000 merchants.
Nipun Agarwal, Senior Vice President, MySQL HeatWave Development, Oracle, demonstrates how the new MySQL Autopilot increases the performance of HeatWave while saving significant time for developers and DBAs.
Oracle Senior Vice President Nipun Agarwal demonstrates the new real-time elasticity capabilities in Oracle MySQL HeatWave. See how you can scale up or down to any number of nodes without downtime and, with no manual intervention, finish the operation with a completely balanced cluster.
HeatWave uses a columnar in-memory representation that facilitates vectorized processing. The data is encoded and compressed prior to being loaded in memory. This compressed and optimized in-memory representation is used for both numeric and string data. This results in significant performance improvements and a reduced memory footprint, translating into reduced costs for customers.
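To make the idea of an encoded, compressed column concrete, here is a minimal sketch of dictionary encoding for a string column. The source does not say which encodings HeatWave actually uses, so treat dictionary encoding here as an assumed, commonly used stand-in; the function names and the sample column are hypothetical.

```python
from typing import List, Tuple


def dictionary_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each string with a small integer code into a sorted dictionary."""
    dictionary = sorted(set(column))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in column]
    return dictionary, codes


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Recover the original strings from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


# Hypothetical string column with many repeated values.
country = ["SA", "US", "SA", "DE", "US", "SA"]
dictionary, codes = dictionary_encode(country)
restored = dictionary_decode(dictionary, codes)
```

Each distinct string is stored once and the column itself becomes a list of small integers, which is the kind of layout that suits scanning a column in bulk.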
One of the key design points of the HeatWave engine is to massively partition data across a cluster of HeatWave nodes that operate in parallel. This enables high cache hit rates for analytic operations and provides very good inter-node scalability. Each HeatWave node within a cluster and each core within a node can process partitioned data in parallel, including parallel scans, joins, group-by, aggregation, and top-k processing.
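The partition-then-process-in-parallel pattern described above can be sketched as follows. This is an illustration only: the modulo partitioning rule, the thread pool standing in for cluster nodes, and all function names are assumptions, not HeatWave internals.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_nodes):
    """Assign each (key, amount) row to a node by key, so equal keys co-locate."""
    partitions = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        partitions[key % num_nodes].append((key, amount))
    return partitions


def local_group_sum(partition):
    """Group-by and sum within a single partition."""
    totals = defaultdict(int)
    for key, amount in partition:
        totals[key] += amount
    return dict(totals)


def parallel_group_sum(rows, num_nodes):
    """Aggregate every partition concurrently, then merge the partial results."""
    partitions = partition_rows(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_group_sum, partitions))
    merged = {}
    for partial in partials:
        merged.update(partial)  # keys are disjoint across partitions
    return merged


# Hypothetical rows of (customer_id, amount).
rows = [(1, 10), (2, 5), (1, 7), (3, 1), (2, 2), (4, 8)]
result = parallel_group_sum(rows, num_nodes=2)
```

Because rows with the same key always land in the same partition, each partial result is already final for its keys and the merge step needs no further arithmetic.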
Modifications made by OLTP transactions are propagated in real time to HeatWave and are immediately visible to analytics queries. Once users submit a query to the MySQL database, the MySQL query optimizer transparently decides whether the query should be offloaded to the HeatWave cluster for accelerated execution. The decision is based on whether all operators and functions referenced in the query are supported by HeatWave and whether the estimated time to process the query with HeatWave is less than with MySQL. If both conditions are met, the query is pushed to HeatWave nodes for processing. Once the query is processed, the results are sent back to the MySQL database node and returned to users.
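The two-condition offload rule described above can be written down as a small predicate. This is a sketch under stated assumptions: the `QueryPlan` type, the operator names, and the cost estimates are all hypothetical and do not reflect the real optimizer's data structures.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical set of operators the accelerator can execute.
SUPPORTED_OPERATORS = frozenset(
    {"scan", "filter", "join", "group_by", "aggregate", "top_k"}
)


@dataclass(frozen=True)
class QueryPlan:
    operators: FrozenSet[str]
    est_seconds_mysql: float
    est_seconds_heatwave: float


def should_offload(plan: QueryPlan, supported=SUPPORTED_OPERATORS) -> bool:
    """Offload only if every operator is supported AND HeatWave is estimated faster."""
    all_supported = plan.operators <= supported
    estimated_faster = plan.est_seconds_heatwave < plan.est_seconds_mysql
    return all_supported and estimated_faster


# Hypothetical plans with made-up estimates.
analytic = QueryPlan(frozenset({"scan", "join", "aggregate"}), 120.0, 0.5)
point_lookup = QueryPlan(frozenset({"scan", "filter"}), 0.001, 0.01)
unsupported = QueryPlan(frozenset({"scan", "custom_udf"}), 60.0, 1.0)

decisions = [should_offload(p) for p in (analytic, point_lookup, unsupported)]
```

Only the first plan satisfies both conditions; the point lookup is estimated to be faster in MySQL, and the third plan uses an operator outside the supported set.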
HeatWave implements state-of-the-art algorithms for distributed in-memory analytic processing. Joins within a partition are processed fast by using vectorized build and probe join kernels. The highly optimized network communication between analytics nodes is achieved by using asynchronous batch I/Os. The algorithms are designed to overlap compute time with communication of data across nodes, which helps achieve high scalability.
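For readers unfamiliar with the build and probe phases of a hash join, here is a minimal sketch. It is a plain, row-at-a-time Python version rather than the vectorized kernels the text refers to, and the table contents and function name are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows):
    """Join two lists of (key, payload) rows on key using build and probe phases."""
    # Build phase: index the (usually smaller) input by join key.
    table = defaultdict(list)
    for key, payload in build_rows:
        table[key].append(payload)

    # Probe phase: look up each row of the other input in the hash table.
    joined = []
    for key, payload in probe_rows:
        for build_payload in table.get(key, ()):
            joined.append((key, build_payload, payload))
    return joined


# Hypothetical tables: customers (build side) and orders (probe side).
customers = [(1, "Ana"), (2, "Omar")]
orders = [(1, "o-100"), (2, "o-101"), (1, "o-102"), (3, "o-103")]
joined = hash_join(customers, orders)
```

A vectorized kernel would process batches of keys per step instead of one row at a time, but the two phases stay the same.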
MySQL Autopilot automates many of the most important and often challenging aspects of achieving high query performance at scale—including provisioning, data loading, query execution, and failure handling. It uses advanced techniques to sample data, collect statistics on data and queries, and build machine-learning models to model memory usage, network load, and execution time. These machine-learning models are then used by MySQL Autopilot to execute its core capabilities. MySQL Autopilot makes the HeatWave query optimizer increasingly intelligent as more queries are executed, resulting in continually improving system performance over time—a capability not available on Amazon Aurora, Amazon Redshift, Snowflake, or other MySQL-based database services. MySQL Autopilot is available at no additional charge for MySQL HeatWave customers.
When data is loaded from MySQL into HeatWave, a copy of the in-memory representation is written to the scale-out data management layer built on the OCI object store. Changes made to data in MySQL are transparently propagated to this data layer. When an operation requires reloading data into HeatWave, such as during error recovery, multiple HeatWave nodes can read the data from the HeatWave data layer in parallel. This results in a significant improvement in performance. For example, for a 10 TB HeatWave cluster, the time it takes to recover and reload data drops from 7.5 hours to 4 minutes, an improvement of over 100X.
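The "over 100X" figure follows directly from the two reload times quoted above, as this short check shows. The variable names are illustrative; the numbers are the ones stated in the text.

```python
# Reload times quoted for a 10 TB HeatWave cluster.
before_minutes = 7.5 * 60  # 7.5 hours expressed in minutes
after_minutes = 4

# Ratio of the old reload time to the new one.
speedup = before_minutes / after_minutes
```

The ratio works out to 450 / 4 = 112.5, which is consistent with the stated improvement.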
HeatWave is designed as a MySQL pluggable storage engine, which completely shields all the low-level implementation details from customers. As a result, applications and tools seamlessly access HeatWave through MySQL, using standard connectors. HeatWave supports the same ANSI SQL standard and ACID properties as MySQL, and supports diverse data types. This enables existing applications to take advantage of HeatWave without any changes.
On-premises customers who cannot move their MySQL deployments to a cloud due to compliance or regulatory requirements can still leverage HeatWave by using the hybrid deployment model. In such a hybrid deployment, customers can use MySQL replication to replicate on-premises MySQL data to HeatWave without the need for ETL.
With native, in-database machine learning in MySQL HeatWave, available at no extra cost, users don’t need to move data to a separate machine learning service such as Amazon SageMaker—accelerating their ML initiatives, increasing security, and reducing costs. HeatWave ML automates the machine learning lifecycle, including algorithm selection, intelligent data sampling for model training, feature selection, and hyperparameter tuning—saving customers significant time and effort. Developers and data analysts can build machine learning models using familiar SQL commands; they don’t have to learn new tools and languages. Additionally, HeatWave ML is integrated with popular notebooks such as Jupyter and Apache Zeppelin. HeatWave ML delivers predictions with an explanation of the results, helping organizations with regulatory compliance, fairness, repeatability, causality, and trust. Benchmarks demonstrate that, on average, HeatWave ML produces more accurate results than Amazon Redshift ML, trains models up to 25X faster at 1% of the cost, and scales as more nodes are added.
Real-time elasticity enables customers to increase or decrease the size of their HeatWave cluster by any number of nodes without incurring any downtime or read-only time. The resizing operation takes only a few minutes, during which HeatWave remains online and available for all operations. Once the cluster is resized, data is downloaded from object storage, automatically repartitioned among all available cluster nodes, and becomes immediately available for queries. As a result, customers benefit from consistently high performance, even at peak times, and can reduce costs by downsizing their HeatWave cluster when appropriate. Customers aren’t constrained to overprovisioned instances forced by the rigid sizing models offered by other cloud database providers. They can expand or downsize their HeatWave cluster to any number of nodes and pay only for the exact resources they use.
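The repartitioning step that follows a resize can be illustrated with a toy example. The modulo placement rule and the function name are assumptions chosen for simplicity; the source does not describe how HeatWave actually redistributes data.

```python
def repartition(keys, num_nodes):
    """Spread row keys across num_nodes partitions using a simple modulo rule."""
    partitions = [[] for _ in range(num_nodes)]
    for key in keys:
        partitions[key % num_nodes].append(key)
    return partitions


# Hypothetical data set of 12 rows, identified by integer keys.
keys = list(range(12))

before = repartition(keys, 3)  # cluster of 3 nodes
after = repartition(keys, 4)   # cluster grown to 4 nodes

sizes_before = [len(p) for p in before]
sizes_after = [len(p) for p in after]
```

In this toy case every node holds the same number of rows both before and after the resize, and no rows are lost along the way.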
Data compression in the HeatWave cluster allows each node to process up to 2X more data without any degradation in price-performance for queries. With data compression, customers can reduce the number of HeatWave nodes needed to process queries and cut their costs by up to 50% while maintaining a constant price-performance ratio. Compressed data in the HeatWave cluster is persisted in object storage.
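The link between "2X more data per node" and "up to 50% lower cost" can be checked with a small calculation. The data size and per-node capacity below are made-up figures, and the sketch assumes cost is proportional to node count.

```python
import math


def nodes_needed(data_gb: float, capacity_per_node_gb: float) -> int:
    """Smallest number of nodes whose combined capacity holds the data."""
    return math.ceil(data_gb / capacity_per_node_gb)


# Hypothetical figures, for illustration only.
data_gb = 8000
base_capacity_gb = 500

uncompressed_nodes = nodes_needed(data_gb, base_capacity_gb)
compressed_nodes = nodes_needed(data_gb, 2 * base_capacity_gb)  # 2X data per node

# Fractional saving, assuming cost scales linearly with node count.
saving = 1 - compressed_nodes / uncompressed_nodes
```

With these numbers the node count halves from 16 to 8. The saving reaches the full 50% only when compression achieves the full 2X and the node count divides evenly.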