Oracle Loader for Hadoop and Oracle SQL Connector for HDFS enable high-speed data loading from many Hadoop systems into Oracle Database. On Oracle engineered systems, up to 15 TB per hour can be transferred from Oracle Big Data Appliance into Oracle Exadata Database Machine. The load is highly efficient, using very few database CPU cycles.
Oracle SQL Connector for HDFS can be used to copy data from Hadoop and to query data in Hadoop files using Oracle SQL.
Oracle Loader for Hadoop
Oracle Loader for Hadoop (OLH) is a MapReduce utility that optimizes data loading from Hadoop into Oracle Database. It sorts, partitions, and converts data into Oracle Database formats on the Hadoop cluster, then loads the converted data into the database. Because this preprocessing runs on the Hadoop cluster, Oracle Loader for Hadoop reduces CPU and I/O utilization on the database.
Oracle Loader for Hadoop has online and offline options. Both load the sorted and transformed data in parallel into the correct partition in the database.
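The preprocessing described above (convert, partition, sort) can be illustrated with a small, self-contained Python sketch. This is not Oracle Loader for Hadoop code and does not use its API. The partition names, date bounds, record layout, and the functions `partition_for` and `preprocess` are all hypothetical, chosen only to show why grouping and ordering rows before the load leaves less work for the database.

```python
from collections import defaultdict
from datetime import date

# Hypothetical range partitions: (partition name, exclusive upper bound on the key).
PARTITIONS = [
    ("p2023", date(2024, 1, 1)),
    ("p2024", date(2025, 1, 1)),
]


def partition_for(key):
    """Return the name of the first partition whose upper bound exceeds key."""
    for name, upper in PARTITIONS:
        if key < upper:
            return name
    raise ValueError(f"no partition for key {key}")


def preprocess(records):
    """Convert, partition, and sort raw (iso_date, amount) string pairs."""
    buckets = defaultdict(list)
    for raw_date, raw_amount in records:
        # Convert text fields into typed values before grouping.
        row = (date.fromisoformat(raw_date), float(raw_amount))
        # Assign each row to the partition that its key falls into.
        buckets[partition_for(row[0])].append(row)
    # Sort the rows inside each partition by key.
    return {name: sorted(rows) for name, rows in buckets.items()}


# Illustrative input only: raw text records as they might appear in a delimited file.
raw = [("2024-03-01", "10.5"), ("2023-07-15", "3"), ("2024-01-02", "7.25")]
result = preprocess(raw)
```

In this sketch each partition's rows arrive already typed, grouped, and ordered, so a loader could write every group to its target partition independently and in parallel.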
Oracle SQL Connector for Hadoop Distributed File System (HDFS)
Oracle SQL Connector for HDFS is a high-speed connector for accessing data on HDFS directly from Oracle Database. It gives users the flexibility to access and import data from HDFS at any time, as needed by the application.
This connector uses an external table in Oracle Database to provide direct SQL access to data stored in HDFS. The data can be in delimited text files or in Oracle Data Pump files created by Oracle Loader for Hadoop.