Big Data Connectors

Oracle Loader for Hadoop

Oracle SQL Connector for HDFS

Extreme load performance from Hadoop to Oracle Database.


Oracle Loader for Hadoop and Oracle SQL Connector for HDFS enable high-speed data loading from many Hadoop systems into Oracle Database. On Oracle engineered systems, up to 15 TB per hour can be transferred from Oracle Big Data Appliance into Oracle Exadata Database Machine. The load is highly efficient, using very few database CPU cycles.

Oracle SQL Connector for HDFS can be used to copy data from Hadoop and to query data in Hadoop files using Oracle SQL.

Oracle Loader for Hadoop

Oracle Loader for Hadoop (OLH) is a MapReduce utility that optimizes data loading from Hadoop into Oracle Database. It sorts, partitions, and converts data into Oracle Database formats on the Hadoop cluster, then loads the converted data into the database. By preprocessing the data on the Hadoop cluster, Oracle Loader for Hadoop reduces CPU and I/O utilization on the database.

Oracle Loader for Hadoop has online and offline options. Both load the sorted and transformed data in parallel into the correct partition in the database.
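The preprocessing idea described above (convert, partition, and sort before the data reaches the database) can be illustrated with a small conceptual sketch. This is not Oracle Loader for Hadoop code and does not use its API; the partition names, boundary dates, record layout, and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical range-partition boundaries (exclusive upper bounds).
PARTITION_BOUNDS = [
    ("p2012", date(2013, 1, 1)),
    ("p2013", date(2014, 1, 1)),
]


def partition_for(sale_date):
    """Return the first partition whose upper bound exceeds sale_date."""
    for name, upper in PARTITION_BOUNDS:
        if sale_date < upper:
            return name
    return "p_max"  # catch-all partition for later dates


def preprocess(lines, delimiter=","):
    """Convert delimited text to typed rows, group by partition, sort each group."""
    groups = defaultdict(list)
    for line in lines:
        raw_id, raw_date, raw_amount = line.strip().split(delimiter)
        row = (int(raw_id), date.fromisoformat(raw_date), float(raw_amount))
        groups[partition_for(row[1])].append(row)
    # Sort each partition's rows by (date, id) so they arrive load-ready.
    return {
        name: sorted(rows, key=lambda r: (r[1], r[0]))
        for name, rows in groups.items()
    }


sample = ["3,2013-05-01,9.50", "1,2012-11-30,20.00", "2,2013-01-15,4.25"]
batches = preprocess(sample)
```

Because each batch already matches one target partition and is already typed and sorted, the receiving side has little left to do, which is the intuition behind the reduced database CPU and I/O load.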

Oracle SQL Connector for Hadoop Distributed File System (HDFS)

Oracle SQL Connector for HDFS is a high-speed connector for accessing data on HDFS directly from Oracle Database. It gives users the flexibility to access and import data from HDFS at any time, as needed by the application.

This connector uses an external table in Oracle Database to provide direct SQL access to data stored in HDFS. The data can be in delimited files or in Oracle Data Pump files created by Oracle Loader for Hadoop.
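The external-table idea can be pictured as a column definition applied to a delimited file at read time, so the data is queried where it sits and is not copied first. The sketch below is only a conceptual analogy in Python, not the connector's implementation; the column list, the sample records, and the `scan` function are hypothetical, and an in-memory buffer stands in for a file on HDFS.

```python
import csv
import io

# Hypothetical column definitions: (column name, type converter).
COLUMNS = [("id", int), ("region", str), ("amount", float)]


def scan(file_obj, delimiter=","):
    """Yield each delimited record as a typed dict, reading lazily."""
    for record in csv.reader(file_obj, delimiter=delimiter):
        yield {
            name: convert(value)
            for (name, convert), value in zip(COLUMNS, record)
        }


# Stand-in for a delimited file stored on HDFS.
hdfs_file = io.StringIO("1,EMEA,20.0\n2,APAC,4.25\n3,EMEA,9.5\n")

# Roughly: SELECT SUM(amount) ... WHERE region = 'EMEA'
emea_total = sum(
    row["amount"] for row in scan(hdfs_file) if row["region"] == "EMEA"
)
```

The filter and aggregate run over rows as they are read, which mirrors how a query against an external table reads the underlying files on demand.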


Oracle has a very active research organization (Oracle Labs) that is charged to 'Identify, explore, and transfer new technologies that have the potential to substantially improve Oracle's business'. One part of the organization is the External Research Office (ERO). The ERO is charged to ' ... invest in research collaborations that fit Oracle's long-term strategic goals. These collaborations are between university researchers and engineers/researchers throughout Oracle's various organizations'. The ERO webpage lists numerous current and past collaborations. Oracle provides funds and direct interactions with highly experienced developers.

If you are interested in the ERO program please contact Steve Jeffreys at

If you would like to explore opportunities for a research collaboration with the database team please contact Dieter Gawlick at

or Garret Swart at