Utilizando o Hive e o Impala através do banco de dados Oracle

Por Yenugula Venkata Ravi Kumar (OCM), Adrian Neagu (OCM) e Alex Zaballa (OCM),
Publicado em Janeiro 2018

Introdução

Atualmente a disseminação de tecnologias de Big Data nas organizações tornou-se um fato irrefutável. Com isso, ocorrem os problemas de integração com os sistemas legados. Neste artigo iremos demonstrar a utilização do banco de dados Oracle com o Hive e o Impala.

Instalação e configuração

Ambiente Oracle

  • Oracle Virtual Box com Oracle Enterprise Linux 6.5
  • Oracle Database 12.1.0.2.0 Enterprise Edition
  • Hostname: Node1
  • IP: 192.168.1.20

Ambiente Cloudera

  • Oracle Virtual Box com CentOS 6.7
  • Cloudera 5.12
  • Hostname: Cloudera
  • IP: 192.168.1.88

Para configurar o ambiente Cloudera, usamos uma imagem Cloudera Quickstart que pode ser baixada em https://www.cloudera.com/downloads/quickstart_vms/5-12.html.

As tabelas Hive usadas neste artigo foram criadas de acordo com o tutorial que pode ser encontrado em .

Os procedimentos para instalação do ambiente Oracle podem ser encontrados em https://docs.oracle.com/cd/E11882_01/gateways.112/e12013/toc.htm.

Instalando os drivers ODBC para o Cloudera Hive e Cloudera Impala

Como pré-requisito para instalar os drivers ODBC Cloudera Hive e Cloudera Impala precisamos utilizar o unixODBC ou o iODBC. Neste artigo optamos por utilizar o unixODBC:

[root@node1 etc]# yum install unixODBC

Loaded plugins: refresh-packagekit, security

..........................................................................................................

Setting up Install Process

Resolving Dependencies

--> Running transaction check

--> Package unixODBC.x86_64 0:2.2.14-14.el6 will be installed

--> Finished Dependency Resolution

Dependencies Resolved

=============================================================

Package                                Arch                                 Version   

                                  Repository                                Size

=============================================================

Installing:

unixODBC                         x86_64                            2.2.14-14.el6  

 

                             ol6_latest                                377 k

Transaction Summary

============================================================

Install       1 Package(s)

Total download size: 377 k

Installed size: 1.1 M

Is this ok [y/N]: y

Downloading Packages:

unixODBC-2.2.14-14.el6.x86_64.rpm 

…....................................................................................................................................................    

Faça o download do driver ODBC Cloudera Hive no seguinte endereço: http://www.cloudera.com/downloads/connectors/hive/odbc/2-5-12.html

Utilizando o usuário root, execute o seguinte comando para instalar o driver ODBC Cloudera Hive:

[root@node1 kit]# rpm -Uhv ClouderaHiveODBC-2.5.19.1004-1.el6.x86_64.rpm

Preparing...                            ########################################### [100%]

   1:ClouderaHiveODBC       ########################################### [100%]

[root@node1 kit]#

Faça o download do driver ODBC Cloudera Impala no seguinte endereço: (http://www.cloudera.com/downloads/connectors/impala/odbc/2-5-22.html)

Utilizando o usuário root, execute o seguinte comando para instalar o driver ODBC Cloudera Impala:

[root@node1 kit]# rpm -Uhv ClouderaImpalaODBC-2.5.33.1004-1.el6.x86_64.rpm

Preparing...                                                            ###################### [100%]

   1:ClouderaImpalaODBC                                   ###################### [100%]

[root@node1 kit]#

Configurando o driver ODBC Cloudera Hive

O diretório de instalação do driver está localizado em /opt/cloudera/hiveodbc. Em /opt/cloudera/hiveodbc/Setup encontramos arquivos de configuração de exemplo que podem ser usados como template.

Como root, crie o diretório /usr/local/odbc e copie o arquivo odbcinst.ini:

[root@node1 ~]# mkdir -p /usr/local/odbc

[root@node1 Setup]# cp odbcinst.ini /usr/local/odbc

Copie o arquivo odbc.ini para /home/oracle:

[root@node1 Setup]# cp odbc.ini /home/oracle

Copie o arquivo cloudera.hiveodbc.ini de /opt/cloudera/hiveodbc/lib/64 para o /etc:

[root@node1 64]# cp cloudera.hiveodbc.ini /etc

Agora vamos configurar as variáveis de ambiente para o driver ODBC Cloudera Hive.

Edite o arquivo /home/oracle/.bash_profile e adicione as seguintes variáveis: 

 [oracle@node1 ~]$ vi

#.bash_profile # .bash_profile

# Get the aliases and functions

if [ -f ~/.bashrc ]; then

. ~/.bashrc

fi

# User specific environment and startup programs

export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1

export ORACLE_SID=ORCL

export ODBCINI=~/odbc.ini

export ODBCSYSINI=/usr/local/odbc

export CLOUDERAHIVEINI=/etc/cloudera.hiveodbc.ini

PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin

export LD_LIBRARY_PATH=$LIBRARY_PATH:$ORACLE_HOME/lib

Após isso, carregue as variáveis de ambiente com o seguinte comando:

[oracle@node1 ~]$ source .bash_profile

Iremos alterar o nome do data source para HIVEDSN e alterar apenas as variáveis host, port e UID:

[HiveDSN]

Description     = Cloudera ODBC Driver for Apache Hive (64-bit) DSN

Driver             = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so

HOST             = 192.168.1.88

PORT             = 10000

Schema           = default

ServiceDiscoveryMode                      = 0

ZKNamespace            =

HiveServerType         = 2

AuthMech                  = 2

ThriftTransport           = 1

UseNativeQuery         = 0

KrbHostFQDN          = [Hive Server 2 Host FQDN]

KrbServiceName        = [Hive Server 2 Kerberos service name]

KrbRealm                   = [Hive Server 2 Kerberos realm]

SSL                             = 0

TwoWaySSL              = 0

ClientCert                   =

ClientPrivateKey        =

ClientPrivateKeyPassword                =

UID=cloudera

Teste a conexão conforme exemplo abaixo:

[oracle@node1 ~]$ isql -v hivedsn

+---------------------------------------+

| Connected!                            |

|                                                |

| sql-statement                         |

| help [tablename]                   |

| quit                                        |

|                                                |

+---------------------------------------+

SQL> 

Liste as tabelas e faça um count na tabela orders para ter certeza que tudo está ok:

SQL> show tables;

+-----------------------------------+

|

tab_name  

                                                                                                                                                                                                                                                    

|+-----------------------------------------+

| categories

|| customers

|| departments

|| order_items

|| orders

|| products

|+------------------------------------------+

SQLRowCount returns -1

6 rows fetched

SQL> select count(*) from orders;

+---------------------+

| EXPR_1              |

+---------------------+

| 68883               |

+---------------------+

SQLRowCount returns -1

1 rows fetched

Configurando o Oracle Database Gateway para utilizar o driver ODBC Cloudera Hive

No diretório $ORACLE_HOME/hs/admin criaremos um arquivo de parâmetros chamado initHIVEDSN.ora (utilizaremos o arquivo dg4odbc.ini como template):


  
[oracle@node1 admin]$ vi  initHIVEDSN.ora 
  # This is a sample agent init file that  contains the HS parameters that are
  # needed for the Database Gateway for ODBC
  HS_FDS_CONNECT_INFO  = "HIVEDSN"
  HS_FDS_TRACE_LEVEL  = 0 
  HS_FDS_SHAREABLE_NAME  = /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
  # 
  # ODBC specific environment variables
  #
  set  ODBCINI=/home/oracle/odbc.ini 
 
 

Configuração do Listener

Abra o arquivo listener.ora e faça as seguintes alterações:


  
[oracle@node1 admin]$ vi  /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora 
  # listener.ora Network Configuration File:  
/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
  # Generated by Oracle configuration tools.
SID_LIST_LISTENER =
  (SID_LIST =
   (SID_DESC =
    (PROGRAM = dg4odbc)
    (ARGS =  ENVS=LD_LIBRARY_PATH=/opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so)
    (SID_NAME = HIVEDSN)
    (ORACLE_HOME =  /u01/app/oracle/product/12.1.0/dbhome_1)
   )
  ) 

LISTENER =
  (DESCRIPTION_LIST =
   (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST =  node1)(PORT = 1521))
   )
   (DESCRIPTION =
    (ADDRESS = (PROTOCOL = IPC)(KEY =  EXTPROC1521))
   )
  )

  ADR_BASE_LISTENER  = /u01/app/oracle
 
 

Adicione a entrada HIVEDSN ao arquivo tnsnames.ora:


  
HIVEDSN =
  (DESCRIPTION=
   (ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521))
    (CONNECT_DATA=(SID=HIVEDSN))
    (HS=OK)
   )
 
 

Faça o reload do listener:

[oracle@node1 admin]$ lsnrctl reload

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:05:57

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))

The command completed successfully

[oracle@node1 admin]$

Verifique o status do listener:

[oracle@node1 admin]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:06:32

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))

STATUS of the LISTENER

-----------------------------------------------------------------------------------------------------------------------

Alias                           LISTENER

Version                        TNSLSNR for Linux: Version 12.1.0.2.0 - Production

Start Date                   19-JUN-2016 08:04:43

Uptime                        1 days 0 hr. 1 min. 49 sec

Trace Level                 off

Security                       ON: Local OS Authentication

SNMP                         OFF

Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora

Listener Log File         /u01/app/oracle/diag/tnslsnr/node1/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=node1)(PORT=5500))

(Security=(my_wallet_directory=/u01/app/oracle/product/12.1.0/dbhome_1/admin/ORCL/xdb_wallet)) (Presentation=HTTP)(Session=RAW))

Services Summary...

Service "HIVEDSN" has 1 instance(s).

  Instance "HIVEDSN", status UNKNOWN, has 1 handler(s) for this service...

Service "ORCL" has 1 instance(s).

  Instance "ORCL", status READY, has 1 handler(s) for this service...

Service "ORCLXDB" has 1 instance(s).

  Instance "ORCL", status READY, has 1 handler(s) for this service...

The command completed successfully

Utilize o tnsping para verificar se a entrada HIVEDSN está funcionando corretamente:

[oracle@node1 admin]$ tnsping HIVEDSN

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:07:56

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:

/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=node1)

(PORT=1521)) (CONNECT_DATA=(SID=HIVEDSN)) (HS=OK))

OK (10 msec)

Crie um database link público chamado hivedsn:

[oracle@node1 admin]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jun 20 08:09:35 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create public database link hivedsn connect to cloudera identified by cloudera using 'HIVEDSN';

Database link created.

Agora tente buscar os dados utilizando o database link recém criado:

SQL> select * from customers@hivedsn;

select * from customers@hivedsn

                        *

ERROR at line 1:

ORA-28500: connection from ORACLE to a non-Oracle system returned this message:

[Cloudera][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function.

{HY000,NativeErr = 11560}

ORA-02063: preceding 2 lines from HIVEDSN

Para resolver este problema, temos que adicionar o caminho completo para a lib libodbcinst.so em /etc/cloudera.hiveodbc.ini:


  
 [Driver]
  ODBCInstLib=/usr/lib64/libodbcinst.so
  ErrorMessagesPath=/opt/cloudera/hiveodbc/ErrorMessages
  LogLevel=0
  LogPath=
  SwapFilePath=/tmp 
 
 

Também exporte a variável LD_PRELOAD:


  
[oracle@node1 ~]$ export  LD_PRELOAD=/usr/lib64/libodbcinst.so
 
 

[oracle@node1 ~]$ ldd /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so

        linux-vdso.so.1 =>  (0x00007fffad5ff000)

            libdl.so.2 => /lib64/libdl.so.2 (0x00007fd1af2b2000)

            libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fd1af094000)

            libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x00007fd1aee7a000)

librt.so.1 => /lib64/librt.so.1 (0x00007fd1aec72000)

            libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fd1aea57000)

            libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007fd1ae854000)

            libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fd1ae54e000)

libm.so.6 => /lib64/libm.so.6 (0x00007fd1ae2c9000)

            libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fd1ae0b3000)

            libc.so.6 => /lib64/libc.so.6 (0x00007fd1add20000)

            /lib64/ld-linux-x86-64.so.2 (0x00000033a1a00000)

            libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fd1adae8000)

            libfreebl3.so => /lib64/libfreebl3.so (0x00007fd1ad886000)

Faça o reload do listener e teste novamente:

[oracle@node1 admin]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jun 20 08:23:36 2016 Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> select * from customers@hivedsn; select * from customers@hivedsn                         * ERROR at line 1: ORA-28500: connection from ORACLE to a non-Oracle system returned this message: [Cloudera][ODBC] (11470) Transactions are not supported. {HYC00,NativeErr = 11470} ORA-02063: preceding 2 lines from HIVEDSN

Para resolver este problema, edite o arquivo initHIVEDSN.ora e adicione o parâmetro HS_TRANSACTION_MODEL:


  
 # This is a sample agent init file that contains the HS parameters that are
  # needed for the Database Gateway for ODBC
  HS_FDS_CONNECT_INFO = "HIVEDSN"
  HS_FDS_TRACE_LEVEL = 0
  HS_FDS_SHAREABLE_NAME =  /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
  HS_TRANSACTION_MODEL=READ_ONLY_AUTOCOMMIT
  #
  # ODBC specific environment variables
  #
  set ODBCINI=/home/oracle/odbc.ini
 
 

Agora teste novamente:

SQL> select * from customers@hivedsn

-------------------------------------------------------------------------------------------

1530 David        Smith          XXXXXXXXX        XXXXXXXXX                   

Configurando o driver ODBC Cloudera Impala

O diretório de instalação do driver está localizado em /opt/cloudera/impalaodbc. Em 

/opt/cloudera/impalaodbc/Setup encontraremos arquivos de configuração de exemplo que podem ser usados como template.

Como root, crie o diretório /usr/local/odbc e copie o arquivo odbcinst.ini:

[root@node1 ~]# mkdir -p /usr/local/odbc

[root@node1 Setup]# cp odbcinst.ini /usr/local/odbc

Copie o arquivo odbc.ini para /home/oracle:

[root@node1 Setup]# cp odbc.ini /home/oracle

Copie o arquivo cloudera.impalaodbc.inide /opt/cloudera/impalaodbc/lib/64 para o /etc:

[root@node1 64]# cp cloudera.impalaodbc.ini /etc

Agora vamos configurar as variáveis de ambiente para o driver ODBC Cloudera Impala.

Edite o arquivo /home/oracle/.bash_profile e adicione as seguintes variáveis: 


  
[oracle@node1 ~]$ vi  .bash_profile 
  # .bash_profile

  # Get the aliases and functions
  if [ -f ~/.bashrc ]; then
  .  ~/.bashrc
  fi

  # User specific environment and startup  programs
  export  ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
  export ORACLE_SID=ORCL
  export ODBCINI=~/odbc.ini
  export ODBCSYSINI=/usr/local/odbc
  export SIMBAINI=/etc/cloudera.impalaodbc.ini 
  PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin
  export  LD_LIBRARY_PATH=$LIBRARY_PATH:$ORACLE_HOME/lib
 
 

Após isso, carregue as variáveis de ambiente com o seguinte comando:

[oracle@node1 ~]$ source .bash_profile

Iremos alterar o nome do data source para ImplDSN e alterar apenas as variáveis host e port:

[ImplDSN]


  
 # Description: DSN Description.
  # This key is not necessary and is only to  give a description of the data source.
  Description=Cloudera ODBC Driver for Impala  (64-bit) DSN
  # Driver: The location where the ODBC  driver is installed to.
  Driver=/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
  HOST=192.168.1.88
  PORT=21050 
  Database=default
  AuthMech=0
 
 

Teste a conexão conforme exemplo abaixo:

[oracle@node1 ~]$ isql -v ImplDSN

+---------------------------------------+

| Connected!                   |

|                                       |

| sql-statement                |

| help [tablename]          |

| quit                               |

|                                       |

+---------------------------------------+

SQL>

Liste as tabelas e faça um count na tabela orders para ter certeza que tudo está ok:

SQL> show tables;

+------------------+

| tab_name |+------------------+

| categories || customers || departments || order_items || orders || products |+-------------------+

SQLRowCount returns -1

6 rows fetched

SQL> select count(*) from orders;

+---------------------+

| EXPR_1              |

+---------------------+

| 68883               |

+---------------------+

SQLRowCount returns -1

1 rows fetched

Configurando o Oracle Database Gateway para utilizar o driver ODBC Cloudera Impala

No diretório $ORACLE_HOME/hs/admin criaremos um arquivo de parâmetros chamado initIMPLDSN.ora (utilizaremos o arquivo dg4odbc.ini como template):


  
[oracle@node1 admin]$ vi initIMPLDSN.ora
  # This is a sample agent init file that  contains the HS parameters that are
  # needed for the Database Gateway for ODBC
  HS_FDS_CONNECT_INFO  = "HIVEDSN"
  HS_FDS_TRACE_LEVEL  = 0 
  HS_FDS_SHAREABLE_NAME  = /opt/cloudera/impalaodbc/lib/64/libclouderahiveodbc64.so
  # 
  # ODBC specific environment variables
  #
  set  ODBCINI=/home/oracle/odbc.ini 
 
 

Modifique o arquivo /etc/cloudera.impalaodbc.ini conforme abaixo:


  
[Driver]
  ODBCInstLib=/usr/lib64/libodbcinst.so 
  ErrorMessagesPath=/opt/cloudera/impalaodbc/ErrorMessages/
  LogLevel=0
  LogPath=
  SwapFilePath=/tmp 
 
 

Configuração do Listener

Abra o arquivo listener.ora e faça as seguintes alterações:


  
[oracle@node1 admin]$ vi  /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora 
  # listener.ora Network Configuration File:  /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora
  # Generated by Oracle configuration tools.

SID_LIST_LISTENER =
 (SID_LIST =
  (SID_DESC =
   (PROGRAM = dg4odbc)
   (ARGS =  ENVS=LD_LIBRARY_PATH=/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so)
   (SID_NAME = IMPLDSN)
   (ORACLE_HOME =  /u01/app/oracle/product/12.1.0/dbhome_1)
  )
 )
LISTENER =
 (DESCRIPTION_LIST =
  (DESCRIPTION =
   (ADDRESS = (PROTOCOL = TCP)(HOST =  node1)(PORT = 1521))
  )
  (DESCRIPTION =
   (ADDRESS = (PROTOCOL = IPC)(KEY =  EXTPROC1521))
  )
 )
 
 

ADR_BASE_LISTENER = /u01/app/oracle

Adicione a entrada IMPLDSN ao arquivo tnsnames.ora:


  
IMPLDSN =
 (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521))
   (CONNECT_DATA=(SID=IMPLDSN))
   (HS=OK)
  )
 
 

Faça o reload do listener:

[oracle@node1 admin]$ lsnrctl reload

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:05:57

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))

The command completed successfully

[oracle@node1 admin]$

Verifique o status do listener:

[oracle@node1 admin]$ lsnrctl status

LSNRCTL for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:06:32

Copyright (c) 1991, 2014, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=node1)(PORT=1521)))

STATUS of the LISTENER

------------------------------------------------------------------------------------------------------------------------

Alias                           LISTENER

Version                        TNSLSNR for Linux: Version 12.1.0.2.0 - Production

Start Date                   19-JUN-2016 08:04:43

Uptime                        1 days 0 hr. 1 min. 49 sec

Trace Level                 off

Security                       ON: Local OS Authentication

SNMP                         OFF

Listener Parameter File   /u01/app/oracle/product/12.1.0/dbhome_1/network/admin/listener.ora

Listener Log File             /u01/app/oracle/diag/tnslsnr/node1/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=node1)(PORT=5500))

(Security=(my_wallet_directory=/u01/app/oracle/product/12.1.0/dbhome_1/admin/ORCL/xdb_wallet))(Presentation=HTTP)(Session=RAW))

Services Summary...

Service "IMPLDSN" has 1 instance(s).

  Instance "IMPLDSN", status UNKNOWN, has 1 handler(s) for this service...

Service "ORCL" has 1 instance(s).

  Instance "ORCL", status READY, has 1 handler(s) for this service...

Service "ORCLXDB" has 1 instance(s).

  Instance "ORCL", status READY, has 1 handler(s) for this service...

The command completed successfully

Utilize o tnsping para verificar se a entrada IMPLDSN está funcionando corretamente:

[oracle@node1 admin]$ tnsping IMPLDSN

TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 20-JUN-2016 08:07:56

Copyright (c) 1997, 2014, Oracle.  All rights reserved.

Used parameter files:

/u01/app/oracle/product/12.1.0/dbhome_1/network/admin/sqlnet.ora

Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION= (ADDRESS=(PROTOCOL=tcp)(HOST=node1)

(PORT=1521)) (CONNECT_DATA=(SID=IMPLDSN)) (HS=OK))

OK (10 msec)

Crie um database link publico chamado impldsn:

[oracle@node1 admin]$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Mon Jun 20 08:09:35 2016

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

SQL> create public database link impldsn connect to cloudera identified by cloudera using 'IMPLDSN';

Database link created.

Agora, tente buscar os dados utilizando o database link criado:

SQL> select * from customers@impldsn

-----------------------------------------------------------------------------------

1530 David     Smith              XXXXXXXXX        XXXXXXXXX

Observações

Devido a limitações na sintaxe e na velocidade, sempre que possível, use o Impala ao invés do Hive. Para algumas queries complexas, é preferível criar uma view dentro do Hive ou Impala e acessar a view através do Oracle.

Por exemplo, se tentarmos executar a consulta abaixo, ocorrerá um erro de que o recurso não é suportado:


  
SQL> l
  1  select  c."category_name", count("order_item_quantity") as count
  2  from order_items@impaladsn oi
  3  inner join products@impaladsn p  on oi."order_item_product_id" = p."product_id"
  4  inner join categories@impaladsn  c on c."category_id" = p."product_category_id"
  5  group by  c."category_name"
  6*  order by count("order_item_quantity") desc
SQL> /
  select c."category_name",  count("order_item_quantity") as count
  *
  ERROR at line 1:
  ORA-02070: database IMPALADSN does not  support in this context
 
 

Para resolver esse problema, podemos criar uma view no Impala:


  
 [cloudera@quickstart Desktop]$ impala-shell
  Starting Impala Shell without Kerberos authentication
  Connected to quickstart.cloudera:21000
  Server version: impalad version 2.5.0-cdh5.7.0 RELEASE (build ad3f5adabedf56fe6bd9eea39147c067cc552703)
  ***********************************************************************************
  Welcome to the Impala shell. Copyright (c) 2015 Cloudera, Inc. All rights reserved.
  (Impala Shell v2.5.0-cdh5.7.0 (ad3f5ad) built on Wed Mar 23 11:33:33 PDT 2016)
  After running a query, type SUMMARY to see a summary of where time was spent.
  ***********************************************************************************
  [quickstart.cloudera:21000] >
  create view summaryvw as -- Most popular product categories
  > select c.category_name, count(order_item_quantity) as count
  > from order_items oi
  > inner join products p on oi.order_item_product_id = p.product_id
  > inner join categories c on c.category_id = p.product_category_id
  > group by c.category_name
  > order by count desc
  > limit 10; 
 
 

Agora podemos executar a query diretamente sobre a view sumaryvw:

  1* select * from summaryvw@impaladsn SQL> /


  
category_name                                                                                  count
  -------------------------------------------------------------------------------- ----------
  Cleats                                                                                                24551
  Men's Footwear                                                                                22246
  Women's Apparel                                                                              21035
  Indoor/Outdoor Games                                                                     19298
  Fishing                                                                                              17325
  Water Sports                                                                                    15540
  Camping & Hiking                                                                           13729
  Cardio Equipment                                                                            12487
  Shop By Sport                                                                                  10984
  Electronics                                                                                       3156 
 
 

10 rows selected.

Yenugula Venkata Ravikumar é um DBA com mais de 18 anos de experiencia com Oracle e em ambientes de alta disponibilidade (RAC, Data Guard, dentre outros), tuning e desempenho, migrações, backup e recover, Oracle Exadata X2 e X3, é Expert em sistemas operacionais tais como como AIX, HP-UX e Linux. Já participou como conferencista de Oracle pela India, onde mora atualmente. Obteve o título de "Oracle Certified Master (OCM 10g)" em 2009.

Adrian Neagu é um DBA com mais de 15 anos de experiência em diversas tecnologias de banco de dados. Ele é Oracle Certified Master 10g e 11g, Oracle Certified Professional 9i, 10g, e 11g, Cloudera Certified Administrator for Apache Hadoop, IBM DB2 Certified Administrator version 8.1.2 e 9, IBM DB2 9 Advanced Certified Administrator 9, e Sun Certified System Administrator Solaris 10.

Alex Zaballa, formado em Análise de Sistemas, é especialista em Banco de Dados Oracle com sólidos conhecimentos em Servidores de Aplicação e Sistemas Operacionais; trabalha com Oracle há 17 anos, é Oracle ACE Director, certificado OCM Database 12c/MAA/11G/Cloud e conta com mais de 235 outras certificações em produtos da Oracle. Alex também é membro do Groupo de Usuários Oracle do Brasil (GUOB), fundador do Grupo de Usuários Oracle de Angola (GUOA) e membro do time OraWorld.

Este artigo foi revisto pela equipe de produtos Oracle e está em conformidade com as normas e práticas para o uso de produtos Oracle.