Utilizando o Hive e o Impala através do banco de dados Oracle

Por Yenugula Venkata Ravi Kumar (OCM) O ACE director, Adrian Neagu (OCM) e Alex Zaballa (OCM)O ACE director
Publicado em Janeiro 2018



Introdução

Atualmente a disseminação de tecnologias de Big Data nas organizações tornou-se um fato irrefutável. Com isso, ocorrem os problemas de integração com os sistemas legados.
Neste artigo iremos demonstrar a utilização do banco de dados Oracle com o Hive e o Impala.



Instalação e configuração


Ambiente Oracle

  • Oracle Virtual Box com Oracle Enterprise Linux 6.5
  • Oracle Database 12.1.0.2.0 Enterprise Edition
  • Hostname: Node1
  • IP: 192.168.1.20

Ambiente Cloudera

  • Oracle Virtual Box com CentOS 6.7
  • Cloudera 5.12
  • Hostname: Cloudera
  • IP: 192.168.1.88

Para configurar o ambiente Cloudera, usamos uma imagem Cloudera Quickstart que pode ser baixada em https://www.cloudera.com/downloads/quickstart_vms/5-12.html.

As tabelas Hive usadas neste artigo foram criadas de acordo com o tutorial que pode ser encontrado em http://www.cloudera.com/developers/get-started-with-hadoop-tutorial/exercise-1.html.

Os procedimentos para instalação do ambiente Oracle podem ser encontrados em https://docs.oracle.com/cd/E11882_01/gateways.112/e12013/toc.htm.




Instalando os drivers ODBC para o Cloudera Hive e Cloudera Impala


Como pré-requisito para instalar os drivers ODBC Cloudera Hive e Cloudera Impala precisamos utilizar o unixODBC ou o iODBC. Neste artigo optamos por utilizar o unixODBC:

[root@node1 etc]# yum install unixODBC
Loaded plugins: refresh-packagekit, security
..........................................................................................................
Setting up Install Process
Resolving Dependencies
--> Running transaction check
--> Package unixODBC.x86_64 0:2.2.14-14.el6 will be installed
--> Finished Dependency Resolution


Dependencies Resolved
=============================================================
Package                                Arch                                 Version                                     Repository                                Size

=============================================================
Installing:
unixODBC                         x86_64                            2.2.14-14.el6                                ol6_latest                                377 k


Transaction Summary
============================================================
Install       1 Package(s)

Total download size: 377 k
Installed size: 1.1 M
Is this ok [y/N]: y
Downloading Packages:
unixODBC-2.2.14-14.el6.x86_64.rpm 
…....................................................................................................................................................    


Faça o download do driver ODBC Cloudera Hive no seguinte endereço:
http://www.cloudera.com/downloads/connectors/hive/odbc/2-5-12.html

Utilizando o usuário root, execute o seguinte comando para instalar o driver ODBC Cloudera Hive:

[root@node1 kit]# rpm -Uhv ClouderaHiveODBC-2.5.19.1004-1.el6.x86_64.rpm
Preparing...                            ########################################### [100%]
   1:ClouderaHiveODBC       ########################################### [100%]
[root@node1 kit]#


Faça o download do driver ODBC Cloudera Impala no seguinte endereço:
(http://www.cloudera.com/downloads/connectors/impala/odbc/2-5-22.html)

Utilizando o usuário root, execute o seguinte comando para instalar o driver ODBC Cloudera Impala:

[root@node1 kit]# rpm -Uhv ClouderaImpalaODBC-2.5.33.1004-1.el6.x86_64.rpm
Preparing...                                                            ###################### [100%]
   1:ClouderaImpalaODBC                                   ###################### [100%]
[root@node1 kit]#




Configurando o driver ODBC Cloudera Hive


O diretório de instalação do driver está localizado em /opt/cloudera/hiveodbc. Em /opt/cloudera/hiveodbc/Setup encontramos arquivos de configuração de exemplo que podem ser usados como template.

Como root, crie o diretório /usr/local/odbc e copie o arquivo odbcinst.ini:
[root@node1 ~]# mkdir -p /usr/local/odbc
[root@node1 Setup]# cp odbcinst.ini /usr/local/odbc

Copie o arquivo odbc.ini para /home/oracle:
[root@node1 Setup]# cp odbc.ini /home/oracle

Copie o arquivo cloudera.hiveodbc.ini de /opt/cloudera/hiveodbc/lib/64 para o /etc:
[root@node1 64]# cp cloudera.hiveodbc.ini /etc


Agora vamos configurar as variáveis de ambiente para o driver ODBC Cloudera Hive.

Edite o arquivo /home/oracle/.bash_profile e adicione as seguintes variáveis: 

 [oracle@node1 ~]$ vi
#.bash_profile # .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

# User specific environment and startup programs
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
export ORACLE_SID=ORCL
export ODBCINI=~/odbc.ini
export ODBCSYSINI=/usr/local/odbc
export CLOUDERAHIVEINI=/etc/cloudera.hiveodbc.ini
PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$LIBRARY_PATH:$ORACLE_HOME/lib

Após isso, carregue as variáveis de ambiente com o seguinte comando:
[oracle@node1 ~]$ source .bash_profile


Iremos alterar o nome do data source para HIVEDSN e alterar apenas as variáveis host, port e UID:

[HiveDSN]
Description     = Cloudera ODBC Driver for Apache Hive (64-bit) DSN
Driver             = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
HOST             = 192.168.1.88
PORT             = 10000
Schema           = default
ServiceDiscoveryMode                      = 0
ZKNamespace            =
HiveServerType         = 2
AuthMech                  = 2
ThriftTransport           = 1
UseNativeQuery         = 0
KrbHostFQDN          = [Hive Server 2 Host FQDN]
KrbServiceName        = [Hive Server 2 Kerberos service name]
KrbRealm                   = [Hive Server 2 Kerberos realm]
SSL                             = 0
TwoWaySSL              = 0
ClientCert                   =
ClientPrivateKey        =
ClientPrivateKeyPassword                =
UID=cloudera

 

Teste a conexão conforme exemplo abaixo:

[oracle@node1 ~]$ isql -v hivedsn
+---------------------------------------+
| Connected!                            |
|                                                |
| sql-statement                         |
| help [tablename]                   |
| quit                                        |
|                                                |
+---------------------------------------+
SQL> 


Liste as tabelas e faça um count na tabela orders para ter certeza que tudo está ok:

SQL> show tables;
+-----------------------------------+
| tab_name                                                                                                                                                                                                                                                       |+-----------------------------------------+
| categories
|| customers
 || departments
|| order_items
|| orders
|| products
|+------------------------------------------+

SQLRowCount returns -1
6 rows fetched

SQL> select count(*) from orders;
+---------------------+
| EXPR_1              |
+---------------------+
| 68883               |
+---------------------+
SQLRowCount returns -1
1 rows fetched




Configurando o Oracle Database Gateway para utilizar o driver ODBC Cloudera Hive


No diretório $ORACLE_HOME/hs/admin criaremos um arquivo de parâmetros chamado initHIVEDSN.ora (utilizaremos o arquivo dg4odbc.ini como template):

[oracle@node1 admin]$ vi  initHIVEDSN.ora 
  # This is a sample agent init file that  contains the HS parameters that are
  # needed for the Database Gateway for ODBC
  HS_FDS_CONNECT_INFO  = "HIVEDSN"
  HS_FDS_TRACE_LEVEL  = 0 
  HS_FDS_SHAREABLE_NAME  = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
  # 
  # ODBC specific environment variables
  #
  set  ODBCINI=/home/oracle/odbc.ini 




Configuração do Listener


Abra o arquivo listener.ora e faça as seguintes alterações:

[oracle@node1 admin]$ vi  /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora 
  # listener.ora Network Configuration File:  
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora # Generated by Oracle configuration tools.
SID_LIST_LISTENER =
  (SID_LIST =
   (SID_DESC =
    (PROGRAM = dg4odbc)
    (ARGS =  ENVS=LD_LIBRARY_PATH=/opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so)
    (SID_NAME = HIVEDSN)
    (ORACLE_HOME =  /u01/app/oracle/product/12.1.0/dbhome_1)
   )
  ) 

LISTENER =
  (DESCRIPTION_LIST =
   (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST =  node1)(PORT = 1521))
   )
   (DESCRIPTION =
    (ADDRESS = (PROTOCOL = IPC)(KEY =  EXTPROC1521))
   )
  )
ADR_BASE_LISTENER = /u01/app/oracle


Adicione a entrada HIVEDSN ao arquivo tnsnames.ora:

HIVEDSN =
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521))
    (CONNECT_DATA=(SID=HIVEDSN))
    (HS=OK)
   )


Faça o reload do listener:

[oracle@node1 admin]$ lsnrctl reload

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:05:57
Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))
The command completed successfully
[oracle@node1 admin]$


Verifique o status do listener:

[oracle@node1 admin]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:06:32
Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))
STATUS of the LISTENER
-----------------------------------------------------------------------------------------------------------------------
Alias                           LISTENER
Version                        TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                   19-JUN-2016 08:04:43
Uptime                        1 days 0 hr. 1 min. 49 sec
Trace Level                 off
Security                       ON: Local OS Authentication
SNMP                         OFF
Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/node1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=node1)(PORT=5500))
(Security=(my_wallet_directory=/u01/app/oracle/product/12.1.0/dbhome_1/admin/ORCL/xdb_wallet))
(Presentation=HTTP)(Session=RAW))

Services Summary...
Service "HIVEDSN" has 1 instance(s).
  Instance "HIVEDSN", status UNKNOWN, has 1 handler(s) for this service...
Service "ORCL" has 1 instance(s).
  Instance "ORCL", status READY, has 1 handler(s) for this service...
Service "ORCLXDB" has 1 instance(s).
  Instance "ORCL", status READY, has 1 handler(s) for this service...
The command completed successfully


Utilize o tnsping para verificar se a entrada HIVEDSN está funcionando corretamente:

[oracle@node1 admin]$ tnsping HIVEDSN

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:07:56
Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=node1)
(PORT=1521)) (CONNECT_DATA=(SID=HIVEDSN)) (HS=OK))

OK (10 msec)


Crie um database link público chamado hivedsn:

[oracle@node1 admin]$ sqlplus / as sysdba
SQL*Plus: Release 12.1.0.2.0 Production on Mon Jun 20 08:09:35 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create public database link hivedsn connect to cloudera identified by cloudera using 'HIVEDSN';
Database link created.


Agora tente buscar os dados utilizando o database link recém criado:

SQL> select * from customers@hivedsn;
select * from customers@hivedsn
                        *
ERROR at line 1:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Cloudera][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function.
{HY000,NativeErr = 11560}
ORA-02063: preceding 2 lines from HIVEDSN

Para resolver este problema, temos que adicionar o caminho completo para a lib libodbcinst.so em /etc/cloudera.hiveodbc.ini:

  [Driver]
  ODBCInstLib=/usr/lib64/libodbcinst.so
  ErrorMessagesPath=/opt/cloudera/hiveodbc/ErrorMessages
  LogLevel=0
  LogPath=
  SwapFilePath=/tmp 

Também exporte a variável LD_PRELOAD:

[oracle@node1 ~]$ export  LD_PRELOAD=/usr/lib64/libodbcinst.so

[oracle@node1 ~]$ ldd /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
        linux-vdso.so.1 =>  (0x00007fffad5ff000)
            libdl.so.2 => /lib64/libdl.so.2 (0x00007fd1af2b2000)
            libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fd1af094000)
            libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x00007fd1aee7a000)
librt.so.1 => /lib64/librt.so.1 (0x00007fd1aec72000)
            libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fd1aea57000)
            libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007fd1ae854000)
            libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fd1ae54e000)
libm.so.6 => /lib64/libm.so.6 (0x00007fd1ae2c9000)
            libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fd1ae0b3000)
            libc.so.6 => /lib64/libc.so.6 (0x00007fd1add20000)
            /lib64/ld-linux-x86-64.so.2 (0x00000033a1a00000)
            libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fd1adae8000)
            libfreebl3.so => /lib64/libfreebl3.so (0x00007fd1ad886000)


Faça o reload do listener e teste novamente:

[oracle@node1 admin]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jun 20 08:23:36 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select * from customers@hivedsn;
select * from customers@hivedsn
                        *
ERROR at line 1:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Cloudera][ODBC] (11470) Transactions are not supported. {HYC00,NativeErr =
11470}
ORA-02063: preceding 2 lines from HIVEDSN


Para resolver este problema, edite o arquivo initHIVEDSN.ora e adicione o parâmetro HS_TRANSACTION_MODEL:

  # This is a sample agent init file that contains the HS parameters that are
  # needed for the Database Gateway for ODBC
  HS_FDS_CONNECT_INFO = "HIVEDSN"
  HS_FDS_TRACE_LEVEL = 0
  HS_FDS_SHAREABLE_NAME =  /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
  HS_TRANSACTION_MODEL=READ_ONLY_AUTOCOMMIT
  #
  # ODBC specific environment variables
  #
  set ODBCINI=/home/oracle/odbc.ini

Agora teste novamente:

SQL> select * from customers@hivedsn
-------------------------------------------------------------------------------------------
1530 David        Smith          XXXXXXXXX        XXXXXXXXX                   



Configurando o driver ODBC Cloudera Impala


O diretório de instalação do driver está localizado em /opt/cloudera/impalaodbc. Em /opt/cloudera/impalaodbc/Setup encontraremos arquivos de configuração de exemplo que podem ser usados como template.

Como root, crie o diretório /usr/local/odbc e copie o arquivo odbcinst.ini:
[root@node1 ~]# mkdir -p /usr/local/odbc
[root@node1 Setup]# cp odbcinst.ini /usr/local/odbc

Copie o arquivo odbc.ini para /home/oracle:
[root@node1 Setup]# cp odbc.ini /home/oracle

Copie o arquivo cloudera.impalaodbc.inide /opt/cloudera/impalaodbc/lib/64 para o /etc:
[root@node1 64]# cp cloudera.impalaodbc.ini /etc


Agora vamos configurar as variáveis de ambiente para o driver ODBC Cloudera Impala.

Edite o arquivo /home/oracle/.bash_profile e adicione as seguintes variáveis: 

 [oracle@node1 ~]$ vi  .bash_profile 
  # .bash_profile
# Get the aliases and functions if [ -f ~/.bashrc ]; then . ~/.bashrc fi
# User specific environment and startup programs export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1 export ORACLE_SID=ORCL export ODBCINI=~/odbc.ini export ODBCSYSINI=/usr/local/odbc export SIMBAINI=/etc/cloudera.impalaodbc.ini PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin export LD_LIBRARY_PATH=$LIBRARY_PATH:$ORACLE_HOME/lib

Após isso, carregue as variáveis de ambiente com o seguinte comando:
[oracle@node1 ~]$ source .bash_profile


Iremos alterar o nome do data source para ImplDSN e alterar apenas as variáveis host e port:

[ImplDSN]

  # Description: DSN Description.
  # This key is not necessary and is only to  give a description of the data source.
  Description=Cloudera ODBC Driver for Impala  (64-bit) DSN
  # Driver: The location where the ODBC  driver is installed to.
  Driver=/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
  HOST=192.168.1.88
  PORT=21050 
  Database=default
  AuthMech=0


Teste a conexão conforme exemplo abaixo:

[oracle@node1 ~]$ isql -v ImplDSN
+---------------------------------------+
| Connected!                   |
|                                       |
| sql-statement                |
| help [tablename]          |
| quit                               |
|                                       |
+---------------------------------------+
SQL>

Liste as tabelas e faça um count na tabela orders para ter certeza que tudo está ok:

SQL> show tables;
+------------------+
| tab_name
|+------------------+

| categories
|| customers
|| departments
|| order_items
|| orders
|| products
|+-------------------+

SQLRowCount returns -1
6 rows fetched
 
SQL> select count(*) from orders;
+---------------------+
| EXPR_1              |
+---------------------+
| 68883               |
+---------------------+
SQLRowCount returns -1
1 rows fetched



Configurando o Oracle Database Gateway para utilizar o driver ODBC Cloudera Impala


No diretório $ORACLE_HOME/hs/admin criaremos um arquivo de parâmetros chamado initIMPLDSN.ora (utilizaremos o arquivo dg4odbc.ini como template):

  [oracle@node1 admin]$ vi initIMPLDSN.ora
  # This is a sample agent init file that  contains the HS parameters that are
  # needed for the Database Gateway for ODBC
  HS_FDS_CONNECT_INFO  = "HIVEDSN"
  HS_FDS_TRACE_LEVEL  = 0 
  HS_FDS_SHAREABLE_NAME  = /opt/cloudera/impalaodbc/lib/64/libclouderahiveodbc64.so
  # 
  # ODBC specific environment variables
  #
  set  ODBCINI=/home/oracle/odbc.ini 

Modifique o arquivo /etc/cloudera.impalaodbc.ini conforme abaixo:

  [Driver]
  ODBCInstLib=/usr/lib64/libodbcinst.so 
  ErrorMessagesPath=/opt/cloudera/impalaodbc/ErrorMessages/
  LogLevel=0
  LogPath=
  SwapFilePath=/tmp 



Configuração do Listener


Abra o arquivo listener.ora e faça as seguintes alterações:

[oracle@node1 admin]$ vi  /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora 
  # listener.ora Network Configuration File:  /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
  # Generated by Oracle configuration tools.

SID_LIST_LISTENER =
 (SID_LIST =
  (SID_DESC =
   (PROGRAM = dg4odbc)
   (ARGS =  ENVS=LD_LIBRARY_PATH=/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so)
   (SID_NAME = IMPLDSN)
   (ORACLE_HOME =  /u01/app/oracle/product/12.1.0/dbhome_1)
  )
 )
LISTENER =
 (DESCRIPTION_LIST =
  (DESCRIPTION =
   (ADDRESS = (PROTOCOL = TCP)(HOST =  node1)(PORT = 1521))
  )
  (DESCRIPTION =
   (ADDRESS = (PROTOCOL = IPC)(KEY =  EXTPROC1521))
  )
 )

ADR_BASE_LISTENER = /u01/app/oracle


Adicione a entrada IMPLDSN ao arquivo tnsnames.ora:

IMPLDSN =
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521))
   (CONNECT_DATA=(SID=IMPLDSN))
   (HS=OK)
  )


Faça o reload do listener:

[oracle@node1 admin]$ lsnrctl reload

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:05:57
Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))
The command completed successfully
[oracle@node1 admin]$


Verifique o status do listener:

[oracle@node1 admin]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:06:32
Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))
STATUS of the LISTENER
------------------------------------------------------------------------------------------------------------------------
Alias                           LISTENER
Version                        TNSLSNR for Linux: Version 12.1.0.2.0 - Production
Start Date                   19-JUN-2016 08:04:43
Uptime                        1 days 0 hr. 1 min. 49 sec
Trace Level                 off
Security                       ON: Local OS Authentication
SNMP                         OFF
Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
Listener Log File             /u01/app/oracle/diag/tnslsnr/node1/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=node1)(PORT=5500))
(Security=(my_wallet_directory=/u01/app/oracle/product/12.1.0/dbhome_1/admin/ORCL/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "IMPLDSN" has 1 instance(s).
  Instance "IMPLDSN", status UNKNOWN, has 1 handler(s) for this service...
Service "ORCL" has 1 instance(s).
  Instance "ORCL", status READY, has 1 handler(s) for this service...
Service "ORCLXDB" has 1 instance(s).
  Instance "ORCL", status READY, has 1 handler(s) for this service...
The command completed successfully


Utilize o tnsping para verificar se a entrada IMPLDSN está funcionando corretamente:

[oracle@node1 admin]$ tnsping IMPLDSN
TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:07:56
Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=node1)
(PORT=1521)) (CONNECT_DATA=(SID=IMPLDSN)) (HS=OK))

OK (10 msec)


Crie um database link publico chamado impldsn:

[oracle@node1 admin]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jun 20 08:09:35 2016
Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create public database link impldsn connect to cloudera identified by cloudera using 'IMPLDSN';
Database link created.


Agora, tente buscar os dados utilizando o database link criado:

SQL> select * from customers@impldsn
-----------------------------------------------------------------------------------

1530 David     Smith              XXXXXXXXX        XXXXXXXXX




Observações


Devido a limitações na sintaxe e na velocidade, sempre que possível, use o Impala ao invés do Hive.
Para algumas queries complexas, é preferível criar uma view dentro do Hive ou Impala e acessar a view através do Oracle.

Por exemplo, se tentarmos executar a consulta abaixo, ocorrerá um erro de que o recurso não é suportado:

SQL> l
  1  select  c."category_name", count("order_item_quantity") as count
  2  from order_items@impaladsn oi
  3  inner join products@impaladsn p  on oi."order_item_product_id" = p."product_id"
  4  inner join categories@impaladsn  c on c."category_id" = p."product_category_id"
  5  group by  c."category_name"
  6*  order by count("order_item_quantity") desc
SQL> /
  select c."category_name",  count("order_item_quantity") as count
  *
  ERROR at line 1:
  ORA-02070: database IMPALADSN does not  support in this context

Para resolver esse problema, podemos criar uma view no Impala:

[cloudera@quickstart Desktop]$ impala-shell
Starting Impala Shell without Kerberos authentication
Connected to quickstart.cloudera:21000
Server version: impalad version 2.5.0-cdh5.7.0 RELEASE (build ad3f5adabedf56fe6bd9eea39147c067cc552703)
***********************************************************************************
Welcome to the Impala shell. Copyright (c) 2015 Cloudera, Inc. All rights reserved.
(Impala Shell v2.5.0-cdh5.7.0 (ad3f5ad) built on Wed Mar 23 11:33:33 PDT 2016)
After running a query, type SUMMARY to see a summary of where time was spent.
***********************************************************************************
[quickstart.cloudera:21000] >
create view summaryvw as -- Most popular product categories
> select c.category_name, count(order_item_quantity) as count
> from order_items oi
> inner join products p on oi.order_item_product_id = p.product_id
> inner join categories c on c.category_id = p.product_category_id
> group by c.category_name
> order by count desc
> limit 10;


Agora podemos executar a query diretamente sobre a view sumaryvw:

  1* select * from summaryvw@impaladsn
SQL> /

category_name                                                                                  count
-------------------------------------------------------------------------------- ----------
Cleats                                                                                                24551
Men's Footwear                                                                                22246
Women's Apparel                                                                              21035
Indoor/Outdoor Games                                                                     19298
Fishing                                                                                              17325
Water Sports                                                                                    15540
Camping & Hiking                                                                           13729
Cardio Equipment                                                                            12487
Shop By Sport                                                                                  10984
Electronics                                                                                       3156

10 rows selected.



Yenugula Venkata Ravikumar é um DBA com mais de 18 anos de experiencia com Oracle e em ambientes de alta disponibilidade (RAC, Data Guard, dentre outros), tuning e desempenho, migrações, backup e recover, Oracle Exadata X2 e X3, é Expert em sistemas operacionais tais como como AIX, HP-UX e Linux. Já participou como conferencista de Oracle pela India, onde mora atualmente. Obteve o título de "Oracle Certified Master (OCM 10g)" em 2009.

Adrian Neagu é um DBA com mais de 15 anos de experiência em diversas tecnologias de banco de dados.
Ele é Oracle Certified Master 10g e 11g, Oracle Certified Professional 9i, 10g, e 11g, Cloudera Certified Administrator for Apache Hadoop, IBM DB2 Certified Administrator version 8.1.2 e 9, IBM DB2 9 Advanced Certified Administrator 9, e Sun Certified System Administrator Solaris 10.

Alex Zaballa, formado em Análise de Sistemas, é especialista em Banco de Dados Oracle com sólidos conhecimentos em Servidores de Aplicação e Sistemas Operacionais; trabalha com Oracle há 17 anos, é Oracle ACE Director, certificado OCM Database 12c/MAA/11G/Cloud e conta com mais de 235 outras certificações em produtos da Oracle. Alex também é membro do Groupo de Usuários Oracle do Brasil (GUOB), fundador do Grupo de Usuários Oracle de Angola (GUOA) e membro do time OraWorld.

Este artigo foi revisto pela equipe de produtos Oracle e está em conformidade com as normas e práticas para o uso de produtos Oracle.