Transactions, analytics across data warehouses and data lakes, and machine learning in one cloud database service

MySQL HeatWave is a fully managed database service, powered by the HeatWave in-memory query accelerator. It’s the only cloud service that combines transactions, real-time analytics across data warehouses and data lakes, and machine learning in one MySQL Database—without the complexity, latency, risks, and cost of ETL duplication.

With MySQL HeatWave Lakehouse, customers can query half a petabyte of data in object storage and leverage all the benefits of HeatWave, even when their data is stored outside a MySQL Database. With HeatWave AutoML, developers and data analysts can build, train, deploy, and explain machine learning models in MySQL HeatWave without moving data to a separate machine learning service.

Demo: MySQL HeatWave Lakehouse

See how to process and query hundreds of terabytes of data in the object store in a variety of file formats, such as CSV, Parquet, and export files from other databases.

First Principles: Inside MySQL HeatWave Lakehouse on OCI

Discover the novel techniques that power MySQL HeatWave Lakehouse, letting users process and query half a petabyte of data in memory from object storage.

Demo: MySQL HeatWave Lakehouse on AWS

See how easy it is to use MySQL HeatWave Lakehouse on AWS to run analytics queries in real time and perform machine learning on hundreds of terabytes of data from object storage, the database, or both.

Demo: MySQL Autopilot

See how MySQL Autopilot increases the performance of HeatWave while saving significant time for developers and DBAs.

Demo: Support for Generative AI and Vector Store with MySQL HeatWave

See how support for generative AI and vector store will let you interact with MySQL HeatWave in natural language and use large language models (LLMs) with proprietary data for better accuracy.

Demo: MySQL HeatWave performance

See the query performance and price-performance comparisons from McKnight Consulting Group's 100 TB TPC-H benchmark of MySQL HeatWave versus Snowflake, Amazon Redshift, Databricks, and Google BigQuery.

Migrate to MySQL HeatWave with free step-by-step resources.

MySQL HeatWave: Architected for performance and scalability

Cloud database designed for performance and scalability

HeatWave uses a columnar in-memory representation that facilitates vectorized processing. The data is encoded and compressed prior to being loaded in memory. This compressed and optimized in-memory representation is used for both numeric and string data. This results in significant performance improvements and a reduced memory footprint, translating into reduced costs for customers.


In-memory hybrid columnar processing diagram, description below

This diagram depicts the columnar in-memory representation that facilitates vectorized processing in HeatWave. The data is encoded and compressed prior to being loaded in memory. The image also shows how this approach provides horizontal scalability and multi-core scalability.
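
To make the idea of an encoded columnar layout concrete, here is a minimal sketch in Python. It is not HeatWave's actual format: dictionary encoding is one common technique chosen here for illustration, and the sample rows and the names `dictionary_encode` and `dictionary_decode` are hypothetical.

```python
from typing import List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each string with a small integer code into a sorted dictionary."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Recover the original strings from their codes."""
    return [dictionary[code] for code in codes]


# Hypothetical row-oriented input: (order_id, country, amount)
rows = [(1, "JP", 120.0), (2, "SA", 75.5), (3, "JP", 300.0), (4, "IN", 42.0)]

# Columnar layout: one contiguous list per column instead of one tuple per row.
order_ids, countries, amounts = (list(column) for column in zip(*rows))

# The string column is stored as integer codes plus a small dictionary.
country_dict, country_codes = dictionary_encode(countries)

# A filter compares integer codes rather than strings, and touches only
# the two columns it needs.
jp_code = country_dict.index("JP")
jp_total = sum(a for a, c in zip(amounts, country_codes) if c == jp_code)

print(country_dict)   # ['IN', 'JP', 'SA']
print(country_codes)  # [1, 2, 1, 0]
print(jp_total)       # 420.0
```

Repeated strings shrink to small integers, which is one way an encoded columnar layout can reduce memory footprint while keeping scans simple.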

Scalability across cores and nodes

One of the key design points of the HeatWave engine is to massively partition data across a cluster of HeatWave nodes, which can be operated in parallel. This enables high cache hits for analytic operations and provides very good inter-node scalability. Each HeatWave node within a cluster and each core within a node can process partitioned data in parallel, including parallel scans, joins, group-by, aggregation and top-k processing.


Massively parallel architecture diagram, description below

This diagram is a technical illustration of MySQL HeatWave’s massively parallel architecture. It shows input data being partitioned so that query processing can occur on multiple CPU cores in parallel, which returns results faster.
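
The partition-then-process-in-parallel pattern described above can be sketched in a few lines of Python. This is an illustration only, not HeatWave's implementation: the hash partitioning scheme, the function names, and the sample data are assumptions, and Python threads stand in for nodes and cores without providing real CPU parallelism.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_partitions):
    """Hash-partition (key, value) rows so each key lands in exactly one partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, value))
    return partitions


def partial_group_sum(partition):
    """Group-by and sum within a single partition."""
    totals = defaultdict(float)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


def parallel_group_sum(rows, num_partitions=4):
    """Aggregate each partition concurrently, then merge the partial results."""
    partitions = partition_rows(rows, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(partial_group_sum, partitions))
    merged = defaultdict(float)
    for partial in partials:
        for key, total in partial.items():
            merged[key] += total
    return dict(merged)


# Hypothetical sample data: (country, amount)
sales = [("JP", 120.0), ("SA", 75.5), ("JP", 300.0), ("IN", 42.0)]
totals = parallel_group_sum(sales)
print(sorted(totals.items()))  # [('IN', 42.0), ('JP', 420.0), ('SA', 75.5)]
```

Because rows are partitioned on the grouping key, each partition can be aggregated independently and the merge step is a simple union of partial results.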

Real-time analytics

Modifications made by OLTP transactions are propagated in real time to HeatWave and are immediately visible to analytics queries. Once users submit a query to the MySQL database, the MySQL query optimizer transparently decides whether the query should be offloaded to the HeatWave cluster for accelerated execution. The decision is based on two conditions: whether all operators and functions referenced in the query are supported by HeatWave, and whether the estimated time to process the query with HeatWave is less than with MySQL. If both conditions are met, the query is pushed to HeatWave nodes for processing. Once processed, the results are sent back to the MySQL database node and returned to users.


Automated query offloading diagram, description below

This diagram shows how real-time analytics are processed when using HeatWave. It shows how the changes made by OLTP transactions in the MySQL Database are propagated in real time to the HeatWave engine. Once users submit a query to their MySQL database, the MySQL query optimizer decides whether the query should be offloaded to the HeatWave cluster for accelerated execution. Once processed, the results are sent back to the MySQL database node and returned to users in the form of query results.
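
The two-condition offload decision described above can be expressed as a small predicate. The following Python sketch is illustrative only: the `QueryPlan` type, the `SUPPORTED_OPERATIONS` set, and the time estimates are hypothetical stand-ins, not the MySQL optimizer's actual data structures.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical set of operations the accelerator can execute.
SUPPORTED_OPERATIONS = frozenset(
    {"scan", "filter", "join", "group_by", "aggregate", "top_k"}
)


@dataclass(frozen=True)
class QueryPlan:
    operations: FrozenSet[str]         # operators and functions the query uses
    estimated_mysql_seconds: float     # assumed cost estimate for MySQL
    estimated_heatwave_seconds: float  # assumed cost estimate for HeatWave


def should_offload(plan: QueryPlan, supported=SUPPORTED_OPERATIONS) -> bool:
    """Offload only if every operation is supported AND HeatWave is estimated faster."""
    all_supported = plan.operations <= supported
    faster = plan.estimated_heatwave_seconds < plan.estimated_mysql_seconds
    return all_supported and faster


analytic = QueryPlan(frozenset({"scan", "join", "aggregate"}), 40.0, 0.5)
unsupported = QueryPlan(frozenset({"scan", "custom_function"}), 40.0, 0.5)
point_lookup = QueryPlan(frozenset({"scan", "filter"}), 0.001, 0.01)

print(should_offload(analytic))      # True
print(should_offload(unsupported))   # False: an operation is not supported
print(should_offload(point_lookup))  # False: MySQL is estimated to be faster
```

Either condition failing keeps the query on the MySQL node, which is why the decision can stay transparent to the application.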

Distributed in-memory analytics processing for real-time insights

HeatWave implements state-of-the-art algorithms for distributed in-memory analytic processing. Joins within a partition are processed fast by using vectorized build and probe join kernels. The highly optimized network communication between analytics nodes is achieved by using asynchronous batch I/Os. The algorithms are designed to overlap compute time with communication of data across nodes, which helps achieve high scalability.


Algorithms for distributed analytic processing diagram, description below

This diagram shows that the execution time for an analytic query would be 13 seconds without overlap. When network time is overlapped with compute time, the execution time for the same query would be 8 seconds, saving 5 seconds thanks to state-of-the-art algorithms for distributed in-memory analytic processing.
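
The arithmetic behind the diagram can be reproduced with a simple timing model. The diagram gives only the totals (13 seconds and 8 seconds); the split into 8 seconds of compute and 5 seconds of network time below is an assumption chosen to match those totals, and perfect overlap is an idealization.

```python
def serial_time(compute_seconds: float, network_seconds: float) -> float:
    """No overlap: communication waits for compute, so the times add up."""
    return compute_seconds + network_seconds


def overlapped_time(compute_seconds: float, network_seconds: float) -> float:
    """Ideal overlap: the longer of the two activities hides the shorter one."""
    return max(compute_seconds, network_seconds)


# Assumed split, not stated in the source.
compute, network = 8.0, 5.0

without_overlap = serial_time(compute, network)
with_overlap = overlapped_time(compute, network)

print(without_overlap)                 # 13.0
print(with_overlap)                    # 8.0
print(without_overlap - with_overlap)  # 5.0
```

In this model the saving equals the shorter of the two activities, so overlap helps most when compute and network time are close to balanced.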

MySQL Autopilot: Machine learning-powered automation

MySQL Autopilot automates many of the most important and often challenging aspects of achieving high query performance at scale—including provisioning, data loading, query execution, and failure handling. It uses advanced techniques to sample data, collect statistics on data and queries, and build machine-learning models to model memory usage, network load, and execution time. These machine-learning models are then used by MySQL Autopilot to execute its core capabilities. MySQL Autopilot makes the HeatWave query optimizer increasingly intelligent as more queries are executed, resulting in continually improving system performance over time—a capability not available on Amazon Aurora, Amazon Redshift, Snowflake, or other MySQL-based database services. MySQL Autopilot also provides capabilities designed to improve the performance and price-performance of OLTP workloads. MySQL Autopilot is available at no additional charge for MySQL HeatWave customers.


MySQL Autopilot diagram, description below

This diagram depicts the aspects of data management that MySQL Autopilot automates across 4 areas: system setup, data loading, failure handling, and query execution. The image illustrates that MySQL Autopilot is part of the “magic” of this product, in that it automates many challenging aspects of database management for you and acts as an advisor in many cases. Here is the list of capabilities under each of these 4 areas:

  1. Under “System setup” it says: Auto provisioning, auto shape prediction, auto schema inference, adaptive data sampling.
  2. Under “Failure handling” it says: Auto error recovery.
  3. Under “Data load” it says: Auto parallel loading, auto data placement, auto encoding, auto unload, auto compression, adaptive data flow.
  4. Under “Query execution” it says: Auto scheduling, auto change propagation, auto query time estimation, auto query plan improvement, adaptive query execution, auto thread pooling.

100X faster data recovery

When data is loaded from MySQL into HeatWave, a copy of the in-memory representation is made to the scale-out data management layer built on the OCI object store. Changes made to data in MySQL are transparently propagated to this data layer. When an operation requires reloading of data to HeatWave, such as during error recovery, data can be accessed from the HeatWave data layer, in parallel, by multiple HeatWave nodes. This results in a significant improvement in performance. For example, for a 10 TB HeatWave cluster, the time it takes to recover and reload data reduces from 7.5 hours to 4 minutes—an improvement of over 100X.


Scale out data management diagram, description below

This image is a technical representation of how HeatWave ensures faster data recovery. It shows that when data is loaded from MySQL into HeatWave, a copy of the in-memory representation is made to the scale-out data management layer built on the OCI object store, and any changes you make to your data in MySQL are also made to this data layer, so it’s always up to date. The graphic shows that because HeatWave can access this scale-out data management layer on the OCI object store, reloading data is much faster.
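
As a quick check of the "over 100X" figure quoted for the 10 TB cluster, the speedup can be computed directly from the two recovery times given above. The function name is hypothetical; the numbers come from the text.

```python
def recovery_speedup(before_minutes: float, after_minutes: float) -> float:
    """Ratio of the old recovery time to the new recovery time."""
    return before_minutes / after_minutes


before = 7.5 * 60  # 7.5 hours expressed in minutes
after = 4.0        # 4 minutes

print(before)                           # 450.0
print(recovery_speedup(before, after))  # 112.5
```

A reduction from 450 minutes to 4 minutes is a 112.5X speedup, consistent with the "over 100X" claim.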

No changes required to applications with built-in analytics

HeatWave is designed as a MySQL pluggable storage engine, which completely shields all the low-level implementation details from customers. As a result, applications and tools seamlessly access HeatWave through MySQL, using standard connectors. HeatWave supports the same ANSI SQL standard and ACID properties as MySQL, and supports diverse data types. This enables existing applications to take advantage of HeatWave without any changes.


Native MySQL analytics diagram, description below

This diagram shows that MySQL HeatWave has automatic, real-time data propagation between the MySQL InnoDB storage engine for OLTP and the HeatWave analytics cluster for OLAP. It also shows that OLTP applications, such as those used for ecommerce, social media, fintech, or SaaS, as well as OLAP, BI, and analytics applications, such as Oracle Analytics Cloud, Tableau, Qlik, and Looker, can seamlessly access HeatWave through MySQL using standard connectors. The image shows that no application changes are required to do this.

Hybrid cloud—OLTP on-premises, OLAP in the cloud

On-premises customers who can’t move their MySQL deployments to a cloud because of compliance or regulatory requirements can still leverage HeatWave by using the hybrid deployment model. In such a hybrid deployment, customers can use MySQL replication to replicate on-premises MySQL data to HeatWave without the need for ETL.


Hybrid deployment diagram, description below

This graphic shows that on-premises customers who can’t move their MySQL deployments to a cloud because of compliance or regulatory requirements can still leverage HeatWave by using the hybrid deployment model. This depicts the hybrid deployment, where customers can use MySQL replication to replicate their on-premises MySQL data to HeatWave without the need for cumbersome ETL.

In-database machine learning with AutoML

With in-database machine learning in MySQL HeatWave, available at no extra cost, users don’t need to move data to a separate machine learning service such as Amazon SageMaker—accelerating their ML initiatives, increasing security, and reducing costs. They can apply machine learning training, inference, and explanation to data stored both inside MySQL and in the object store with HeatWave Lakehouse. HeatWave AutoML automates the machine learning lifecycle, including algorithm selection, intelligent data sampling for model training, feature selection, and hyperparameter tuning—saving customers significant time and effort.

Developers and data analysts can build machine learning models using familiar SQL commands; they don’t have to learn new tools and languages. Additionally, HeatWave AutoML is integrated with popular notebooks such as Jupyter and Apache Zeppelin. HeatWave AutoML delivers predictions with an explanation of the results, helping organizations with regulatory compliance, fairness, repeatability, causality, and trust. Benchmarks demonstrate that, on average, HeatWave AutoML produces more accurate results than Amazon Redshift ML, trains models up to 25X faster at 1% of the cost, and scales as more nodes are added.


AutoML diagram, description below

This diagram illustrates the capabilities of MySQL HeatWave's in-database machine learning feature and its benefits for users. It has 3 boxes. The first box, on the left, depicts “inputs” and shows the logos for Jupyter and Apache Zeppelin, illustrating that HeatWave AutoML is integrated with popular notebooks. An arrow points from the first box to the second box, which depicts MySQL HeatWave. The second box contains images of brains, charts, and gears, with words stating that you can perform training, inference, and explanation on machine learning models using MySQL HeatWave. An arrow points from the second box to the third box, which depicts “outputs”, represented by a lightbulb with the words “prediction” and “explanation”. Together, the 3 boxes show how you can easily perform the full lifecycle of machine learning tasks using the tools you already know to get deep insights and predictions about your business.

Generative AI with MySQL HeatWave vector store

Currently in private preview, the vector store will enable customers to leverage the power of large language models (LLMs) with their proprietary data to get answers that are more accurate than those from models trained only on public data. With generative AI and vector store capabilities, customers can interact with MySQL HeatWave in natural language and efficiently search documents in various file formats in HeatWave Lakehouse.

The vector store ingests documents in a variety of formats, including PDF, and stores them as embeddings generated via an encoder model. For a given user query, the vector store identifies the most similar documents by performing a similarity search between the embedded query and the stored embeddings. These documents are used to augment the prompt given to the LLM so that it provides a more contextual answer.


GenAI with vector store private preview diagram, description below

This graphic is a technical illustration of how MySQL HeatWave works with generative AI and the vector store. It demonstrates how users can query and retrieve information in natural language from their company’s own proprietary enterprise documents. The vector store ingests proprietary documents in a variety of formats, including PDF, and stores them as embeddings generated via a language encoder model. For a given user query, the vector store identifies the most similar documents by performing a similarity search between the embedded query and the stored embeddings. These documents are used to augment the prompt given to the LLM so that it provides a more contextual answer to the user, in natural language.
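
The retrieval step described above can be sketched as follows. This is a minimal illustration, not the HeatWave vector store API: the hand-written three-dimensional vectors stand in for real encoder output, cosine similarity is an assumed similarity measure (the source does not name one), and the document names and the helpers `top_k_similar` and `build_prompt` are hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query: List[float], store: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the ids of the k stored documents most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


def build_prompt(question: str, doc_ids: List[str]) -> str:
    """Augment the user's question with the retrieved documents as context."""
    context = "\n".join(f"- {doc_id}" for doc_id in doc_ids)
    return f"Context documents:\n{context}\n\nQuestion: {question}"


# Toy embeddings; a real system would obtain these from an encoder model.
vector_store = {
    "refund_policy.pdf": [0.9, 0.1, 0.0],
    "holiday_schedule.pdf": [0.0, 0.2, 0.9],
    "returns_faq.pdf": [0.8, 0.3, 0.1],
}
query_embedding = [1.0, 0.2, 0.0]  # assumed embedding of the user's question

matches = top_k_similar(query_embedding, vector_store, k=2)
print(matches)  # ['refund_policy.pdf', 'returns_faq.pdf']
print(build_prompt("How do I return an item?", matches))
```

The augmented prompt, rather than the bare question, is what would be sent to the LLM, which is how proprietary documents can inform the answer without retraining the model.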

Consistent high performance and reduced costs with no downtime

Real-time elasticity enables customers to increase or decrease the size of their HeatWave cluster by any number of nodes without incurring any downtime or read-only time. The resizing operation takes only a few minutes, during which time HeatWave remains online and available for all operations. Once the cluster is resized, data is downloaded from object storage, automatically repartitioned among all available cluster nodes, and made immediately available for queries. As a result, customers benefit from consistently high performance, even at peak times, and can reduce costs by downsizing their HeatWave cluster when appropriate, again without downtime or read-only time. Customers aren’t constrained to the overprovisioned instances forced by the rigid sizing models of other cloud database providers. With efficient data reloading from object storage, customers can also pause and resume their HeatWave cluster to reduce costs.


Real-time elasticity diagram, description below

This image compares two processes: manual resizing of the HeatWave cluster versus real-time elastic resizing. It shows that HeatWave’s real-time elasticity feature enables customers to increase or decrease the size of their HeatWave cluster by any number of nodes without incurring any downtime or read-only time, unlike manual resizing. It shows that base tables are loaded on all nodes and metadata is synced, so that the larger cluster can start admitting queries right away without downtime.

Fast analytics across databases and object storage

MySQL HeatWave Lakehouse enables users to query half a petabyte of data in object storage—in a variety of file formats, such as CSV, Parquet, Avro, and export files from other databases. The query processing is done entirely in the HeatWave engine, enabling customers to take advantage of HeatWave for non-MySQL workloads in addition to MySQL-compatible workloads. Customers can query data in various formats in object storage, transactional data in MySQL databases, or a combination of both using standard SQL commands. Querying the data in object storage is as fast as querying the databases. With HeatWave AutoML, customers can use data in object storage, the database, or both to automatically build, train, deploy, and explain ML models—without moving the data to a separate ML cloud service. The HeatWave cluster scales to 512 nodes to process half a petabyte of data, and the data isn’t copied to the MySQL database.

As demonstrated by a 500 TB TPC-H benchmark, the query performance of MySQL HeatWave Lakehouse is 15X faster than Amazon Redshift, 18X faster than Databricks and Snowflake, and 35X faster than Google BigQuery. The load performance of MySQL HeatWave Lakehouse is 2X faster than Snowflake, 6X faster than Databricks, 8X faster than Google BigQuery, and 9X faster than Amazon Redshift.


Lakehouse diagram, description below

This diagram shows how easy it is to process data housed in object storage using HeatWave Lakehouse. There are 3 boxes in this diagram. The box on the left represents MySQL Database with the InnoDB storage engine inside. The middle box represents the MySQL HeatWave cluster, which is made up of multiple HeatWave nodes that perform query processing, machine learning, and more. The box on the right represents the object store and contains images depicting various file formats in object storage, including Avro, CSV, and Parquet files and database exports from Amazon Aurora and Amazon Redshift. Double-headed arrows between the 3 boxes show how the data is shared and the processing is done: data from both your MySQL Database and object storage can be easily analyzed by the MySQL HeatWave engine.

MySQL HeatWave customer success stories

Tamara scales fast with MySQL HeatWave and Oracle Cloud

This fintech startup from Saudi Arabia moved its database workloads to MySQL HeatWave for 3X greater performance and 60% lower costs than another cloud provider. Tamara has grown its client base to more than 2 million users and onboarded 3,000 merchants.

6D Technologies demystifies data and analytics with MySQL HeatWave on AWS

This global high-tech solution provider in the telecom industry speeds up complex queries by 139X with MySQL HeatWave on AWS compared to Amazon RDS and Aurora—simplifying its infrastructure for OLTP and OLAP while delivering subsecond response times to customers.

FANCOMI accelerates ad analytics by 10X with MySQL HeatWave

Japan’s leading advertising network delivers real-time insights and significantly reduces costs with MySQL HeatWave and Autonomous Database.

Get hands-on with MySQL HeatWave on OCI or AWS

Explore MySQL HeatWave


Try MySQL HeatWave for free

Explore with $300 in free credits.


Contact us

Interested in learning more? Contact one of our experts.