HeatWave ML fully automates model training, inference, and explanation
HeatWave ML is 25 times faster than Amazon Redshift ML at one percent of the cost
Austin, Texas, March 29, 2022
Oracle today announced that Oracle MySQL HeatWave now supports in-database machine learning (ML) in addition to the previously available transaction processing and analytics—the only MySQL cloud database service to do so. MySQL HeatWave ML fully automates the ML lifecycle and stores all trained models inside the MySQL database, eliminating the need to move data or the model to a machine learning tool or service. Eliminating ETL reduces application complexity, lowers cost, and improves security of both the data and the model. HeatWave ML is included with the MySQL HeatWave database cloud service in all 37 Oracle Cloud Infrastructure (OCI) regions.
Until now, adding machine learning capabilities to MySQL applications has been prohibitively difficult and time-consuming for many developers. First, there is the process of extracting data out of the database and into another system to create and deploy ML models. This approach creates multiple silos for applying machine learning to application data and introduces latency as data moves around. It also leads to the proliferation of data outside the database, making it more vulnerable to security threats, and forces developers to program in multiple environments, which adds complexity. Second, existing services expect developers to be experts in guiding the ML model training process; otherwise, the model is sub-optimal, which degrades the accuracy of predictions. Finally, most existing ML solutions don’t include functionality to explain why the models that developers build deliver specific predictions.
MySQL HeatWave ML solves these problems by natively integrating machine learning capabilities inside the MySQL database, eliminating the need to ETL the data to another service. HeatWave ML fully automates the training process and creates a model with the best algorithm, optimal features, and the optimal hyper-parameters for a given data set and a specified task. All models generated by HeatWave ML can provide model and prediction explanations.
No other cloud database vendor provides such advanced ML capabilities directly inside their database service. Oracle published ML benchmarks performed across a large number of publicly available machine learning classification and regression datasets such as Numerai, Nomao, and Bank Marketing, among others. On average, on the smallest cluster, HeatWave ML trains machine learning models 25 times faster at one percent of the cost of Redshift ML. Additionally, the performance advantage over Redshift ML increases when training is done on a larger HeatWave cluster. Training is a time-consuming process and since it can be done very efficiently and rapidly with MySQL HeatWave, customers can now retrain their models more often and keep up with changes to data. This keeps the models up-to-date and improves the accuracy of predictions.
“Just as we integrated analytics and transaction processing within a single database, we are now bringing machine learning inside MySQL HeatWave,” said Edward Screven, chief corporate architect, Oracle. “MySQL HeatWave is one of the fastest growing cloud services at Oracle. An increasing number of customers have migrated from Amazon and other cloud database services to MySQL HeatWave and have gained significant performance improvements and lower costs. Today, we are also announcing a number of other innovations which enrich HeatWave’s capabilities, improve availability, and lower the cost. Our new and fully transparent benchmark results again demonstrate that Snowflake, AWS, Microsoft, and Google are slower and more expensive than MySQL HeatWave by a large margin.”
HeatWave ML offers the following capabilities compared to other cloud database services:
Fully Automated Model Training: All of the stages in creating a model with HeatWave ML are fully automated and do not require any intervention from developers. The result is a tuned model that is more accurate and requires no manual work, and the training process always runs to completion. Other cloud database services such as Amazon Redshift provide integration with machine learning capabilities in external services, which require extensive manual inputs from developers during the ML training process.
Model and Inference Explanations: Model explainability helps developers understand the behavior of a machine learning model. For example, if a bank denies a client a loan, the bank needs to be able to determine which parameters of the model were taken into account, or if the model contains any bias. Prediction explainability is a set of techniques that help answer the question of why a machine learning model made a specific prediction. Prediction explanations are becoming increasingly important as companies must be able to explain the decisions made by their machine learning models. HeatWave ML integrates both model explanations and prediction explanations as part of its model training process. As a result, all models created by HeatWave ML can offer model as well as inference explanations without needing the training data at inference explanation time. Oracle has augmented existing explanation techniques to improve performance, interpretability, and quality. Other cloud database services do not offer such rich explainability for all of their machine learning models.
Hyper-Parameter Tuning: HeatWave ML implements a new gradient search-based reduction algorithm for hyper-parameter tuning. This enables the hyper-parameter search to be executed in parallel without compromising the model accuracy. Hyper-parameter tuning is the most time-consuming stage of ML model training, and this unique capability provides HeatWave ML with a significant performance advantage over other cloud services for building machine learning models.
Algorithm Selection: HeatWave ML uses the notion of proxy models—which are simple models exhibiting the properties of a full complex model—to determine the best ML algorithm for training. Using a simple proxy model, algorithm selection is done very efficiently without loss of accuracy. No other database services for building machine learning models have this proxy modeling capability.
Intelligent Data Sampling: During model training, HeatWave ML samples a small percentage of the data in order to improve performance. This sampling is done in such a manner that all representative data points are captured in the sample data set. Other cloud services for building machine learning models take a less efficient approach—using random data sampling—which samples a small percentage of data without considering the data distribution characteristics.
Feature Selection: Feature selection helps determine the attributes of the training data which influence the machine learning model behavior for making predictions. The techniques in HeatWave ML for feature selection have been trained over a broad swath of data sets across multiple domains and applications. From these gathered statistics and meta information, HeatWave ML is able to efficiently identify the relevant features in a new data set.
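As a rough illustration of the contrast drawn in the Intelligent Data Sampling item above, the following Python sketch compares plain random sampling with a distribution-aware alternative. It is not Oracle's implementation, which this announcement does not describe. It assumes that stratifying by class label is an acceptable stand-in for "capturing all representative data points", and the function names, the percentage parameter, and the toy data set are all hypothetical.

```python
import random
from collections import defaultdict

def uniform_sample(rows, percent, seed=0):
    """Plain random sampling: ignores how labels are distributed."""
    rng = random.Random(seed)
    k = max(1, len(rows) * percent // 100)
    return rng.sample(rows, k)

def stratified_sample(rows, label_of, percent, seed=0):
    """Sample `percent` of each label group, keeping at least one row per group.

    Stratification is an assumed stand-in for distribution-aware sampling,
    not the technique HeatWave ML actually uses.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)
    sample = []
    for members in groups.values():
        # Integer arithmetic avoids floating-point rounding surprises.
        k = max(1, len(members) * percent // 100)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical imbalanced data set: 990 "ok" rows and 10 "fraud" rows.
rows = [(i, "ok") for i in range(990)] + [(i, "fraud") for i in range(990, 1000)]

stratified = stratified_sample(rows, label_of=lambda r: r[1], percent=5)
uniform = uniform_sample(rows, percent=5)

print(len(stratified), sum(1 for r in stratified if r[1] == "fraud"))
print(len(uniform), sum(1 for r in uniform if r[1] == "fraud"))
```

In this toy case the stratified sample always retains a row from the rare class, while the uniform sample of the same size may contain none, which is the kind of gap a distribution-aware sampler is meant to close.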
In addition to machine learning capabilities, Oracle released more innovations to the MySQL HeatWave service. Real-time elasticity enables customers to upsize and downsize their HeatWave cluster to any number of nodes, without any downtime or read-only time, and without the need to manually rebalance the cluster. Also included is data compression, which enables customers to process twice the amount of data per node and lowers costs by nearly 50 percent, while maintaining the same price performance ratio. Finally, a new pause-and-resume function enables customers to pause HeatWave to save costs. Upon resuming, both the data and the statistics needed for MySQL Autopilot are automatically reloaded into HeatWave.
Astute Business Solutions is a leading Oracle Cloud MSP Partner. “We recently had an opportunity to use the machine learning capabilities of HeatWave ML. We found it very innovative, easy-to-use, very fast and most importantly, it is secure since the data or the model don’t leave the database,” said Arvind Rajan, Co-Founder and CEO of Astute Business Solutions. “We believe that providing in-database machine learning is of significant interest to our clients and will further accelerate the adoption of MySQL HeatWave.”
Estuda.com is an educational SaaS provider for K-12 student testing in Brazil. “MySQL HeatWave improved our complex query performance by 300X for responses in seconds and at 85 percent of the cost compared to Google BigQuery with no code changes. Now we can better deliver real-time analytics at a scale of three million users and continually improve our application to enhance student performance,” said Vitor Freitas, Co-founder and CTO, Estuda.com.
VRGlass is a Brazilian SaaS producer of metaverse apps and equipment for corporate clients. “Motivated by the progress achieved within the Oracle for Startups program, VRGlass migrated all application data to MySQL HeatWave from AWS EC2. Within three hours, we achieved a 5X increase in database performance for a virtual event that accommodated more than one million visitors and 1.7 million sessions with greater security and at half the cost,” said Ohmar Tacla, CEO, VRGlass.
Genius Sonority is a video game designer, developer, and operator in Japan. “We found MySQL HeatWave improved performance by 90X, which solved all our challenges and concerns we had in moving data to realize real-time analysis. It was a big surprise for us. The extreme performance improvements help us to continually improve the gaming experience for joyful entertainment to customers around the world,” said Masayuki Kawamoto, Director, CTO, Genius Sonority.
Neovera has been a trusted provider of managed cybersecurity solutions for more than 20 years. “MySQL HeatWave on OCI increased our query performance by 300X with an 80 percent TCO reduction compared to our on-premises MySQL database environment. Now we can get real-time analytical reporting within our OLTP database to accelerate enhancing our security application,” said Arman Rawls, Sr. Oracle Database Architect, Neovera Inc.
“Oracle announced MySQL HeatWave with Autopilot last August, which may very well have been the single greatest innovation in open source cloud databases in the last 20 years to that point,” said Carl Olofson, Research Vice President, Data Management Software, IDC. “Now Oracle has gone beyond its original unifying of OLTP and OLAP in HeatWave, with MySQL HeatWave ML. Oracle is bringing all of the machine learning processing and models inside the database, so that customers not only avoid managing ML databases apart from the core database, but also eliminate the hassles of ETL, gaining speed, accuracy, and cost-effectiveness in the bargain.”
* Benchmark queries are derived from the TPC-DS benchmark, but results are not comparable to published TPC-DS benchmark results since they do not comply with the TPC-DS specification.
Oracle offers integrated suites of applications plus secure, autonomous infrastructure in the Oracle Cloud. For more information about Oracle (NYSE: ORCL), please visit us at www.oracle.com.
Oracle, Java, and MySQL are registered trademarks of Oracle Corporation.