Oracle R Advanced Analytics for Hadoop

Oracle R Advanced Analytics for Hadoop (ORAAH) is one of the components in the Oracle Big Data Software Connectors Suite, an option to the Big Data Appliance. At its core, ORAAH provides an R interface for manipulating HDFS data, writing mapper and reducer functions in R (where you can also leverage open source CRAN packages), and invoking the resulting Hadoop jobs from R. Users can pass R objects from the client R object space to their mapper and reducer functions. They can also test MapReduce jobs locally in their client R engine without changing any code, simply by switching a system flag. This makes it easy to debug code before unleashing it on the full Hadoop cluster.
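
The mapper/reducer workflow and the local-test switch described above can be illustrated with a small, language-neutral sketch. The Python below is not ORAAH's R API: `run_job`, `run_local`, and the `dry_run` flag are hypothetical names chosen for illustration, and the cluster path is deliberately left unimplemented. It only shows how the same mapper and reducer code can run unchanged in-process for debugging.

```python
from collections import defaultdict


def mapper(_key, line):
    """Emit (word, 1) for every word in one line of text."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Sum all counts collected for one word."""
    yield word, sum(counts)


def run_local(records, map_fn, reduce_fn):
    """Run the job in-process: map, group values by key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    results = {}
    for key in sorted(groups):
        for out_key, out_value in reduce_fn(key, groups[key]):
            results[out_key] = out_value
    return results


def run_job(records, map_fn, reduce_fn, dry_run=True):
    """Hypothetical entry point: a single flag selects local or cluster execution."""
    if dry_run:
        return run_local(records, map_fn, reduce_fn)
    # Submitting to a real Hadoop cluster is outside the scope of this sketch.
    raise NotImplementedError("cluster submission is not part of this sketch")


# Made-up input records of the form (line number, text).
lines = [(0, "big data big cluster"), (1, "data on the cluster")]
counts = run_job(lines, mapper, reducer, dry_run=True)
print(counts)
```

Because the mapper and reducer are plain functions, the same definitions would be handed to either execution path; only the flag changes.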

If parallel distributed MapReduce programming isn't your strength, ORAAH also lets you manipulate Hive data with the same kind of transparency that Oracle R Enterprise provides, applied to Hive tables. Just as Oracle R Enterprise maps data.frame functions to Oracle SQL, Oracle R Advanced Analytics for Hadoop uses the same abstraction to map those data.frame functions to HiveQL.
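
To make the idea of mapping data-frame-style operations to HiveQL concrete, here is a minimal sketch in Python. It is not how ORAAH is implemented: the `LazyTable` class, its methods, and the `flights` table with its columns are all assumptions made for illustration. The object records operations instead of executing them, then renders a single HiveQL query on request.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical stand-in for a data-frame proxy backed by a Hive table."""
    name: str
    predicates: Tuple[str, ...] = ()
    group_col: Optional[str] = None
    value_col: Optional[str] = None

    def filter(self, predicate: str) -> "LazyTable":
        # Record the row filter; nothing is executed yet.
        return replace(self, predicates=self.predicates + (predicate,))

    def mean_by(self, group_col: str, value_col: str) -> "LazyTable":
        # Record a grouped mean; nothing is executed yet.
        return replace(self, group_col=group_col, value_col=value_col)

    def to_hiveql(self) -> str:
        """Render the recorded operations as one HiveQL statement."""
        select = "*"
        if self.group_col is not None:
            select = f"{self.group_col}, AVG({self.value_col})"
        sql = f"SELECT {select} FROM {self.name}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        if self.group_col is not None:
            sql += f" GROUP BY {self.group_col}"
        return sql


# Predicates are passed as raw strings for brevity; a real translator
# would build them from parsed expressions rather than trusting text.
query = (
    LazyTable("flights")
    .filter("distance > 500")
    .mean_by("carrier", "dep_delay")
    .to_hiveql()
)
print(query)
```

Deferring execution until the query is rendered is what lets the work run inside Hive rather than in the client's memory.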

In addition, ORAAH provides eight prepackaged advanced analytics algorithms, including k-means clustering, linear regression models, principal component analysis (PCA), non-negative and low-rank matrix factorization, correlation and covariance matrix computations, and feed-forward neural networks. So even if you’re not comfortable turning serial algorithms into parallel distributed MapReduce algorithms, you can get the benefit of the Hadoop cluster through our high-level R interface.


Oracle has a very active research organization, Oracle Labs, whose charge is to 'Identify, explore, and transfer new technologies that have the potential to substantially improve Oracle's business'. One part of the organization is the External Research Office (ERO), whose charge is to ' ... invest in research collaborations that fit Oracle's long-term strategic goals. These collaborations are between university researchers and engineers/researchers throughout Oracle's various organizations'. The ERO webpage lists numerous current and past collaborations. In these collaborations, Oracle provides funding and direct interaction with highly experienced developers.

If you are interested in the ERO program please contact Steve Jeffreys at

If you would like to explore opportunities for a research collaboration with the database team please contact Dieter Gawlick at

or Garret Swart at