NEW ORAAH RELEASE 2.7.0: Introducing the fastest GLM and LM algorithms on Spark with full summary, enhanced Deep Neural Networks and support for Spark MLlib Gaussian Mixture Models.
Oracle R Advanced Analytics for Hadoop (ORAAH) is one of the components of the Oracle Big Data Connectors software suite, an option to the Oracle Big Data Appliance, and release 2.7.0 is its latest release. At its core, ORAAH provides an R interface for manipulating data stored in HDFS, both through Hive transparency capabilities and by mapping HDFS data as direct input into Machine Learning algorithms that can run as MapReduce jobs or inside an Apache Spark container.
New in release 2.7.0 are updated ORAAH GLM and LM algorithms, which are much faster, more stable, and lighter on memory than the comparable GLM and LM methods from Spark MLlib. Both methods also bring a new summary feature that makes their output comparable to that of open-source R's glm and lm, while remaining capable of handling Big Data at enterprise scale.
The Neural Networks algorithm has been enhanced to support full formula processing and to run both model building and scoring in Spark.
Support for Gaussian Mixture Models is a new addition to the set of Spark MLlib algorithms available through ORAAH.
New functionality and high performance in ORAAH's own algorithms
Sample performance of the ORAAH GLM and LM algorithms against the corresponding Spark MLlib algorithms, on the same hardware and with the same Spark settings.
Set of Apache Spark MLlib algorithms available from R in ORAAH (source data can be CSV files in HDFS or Hive tables)
In addition to these new interfaces to Spark MLlib algorithms, ORAAH provides eight prepackaged algorithms, including: