Performance and Availability

Exploiting Parallel Operations with Real Application Clusters
by Kevin Conlon

How to use parallel operations within a RAC environment in order to exploit all the server hardware in your cluster architecture

Use of parallel operations with Oracle Database is well known, with the feature first having arrived in version 7.1. The ability to execute SQL statements in parallel on traditional UNIX-based symmetric multiprocessor (SMP) architectures greatly increased utilization of the server and increased the speed of large resource-intensive operations. In Real Application Clusters (RAC) architecture, the equivalent to parallel SMP deployment is the use of all the available servers in the cluster.

In this article, I'll outline the use of parallel operations with RAC to exploit scale-out cluster architecture.

Parallel Goals and Options

The goal of parallelism is to exploit all available resources of the database platform architecture to increase overall processing speed. These resources include memory, processor, and I/O.

The same parallel operations that can be performed in any scale-up or single-system SMP image environment can also be performed in the scale-out RAC cluster environment. These operations include:

  • Queries (based on full table scan)
  • Create Table As
  • Index Builds
  • DML Operations (insert, update, delete) on partitioned tables
  • Data Loads

Operations 1 through 4 from the list above can be performed with the use of standard SQL hints or by setting a parallel degree at the object level INIT.ORA parameters that can be employed to restrict parallel operations to specific nodes. Thus, for a larger RAC architecture (more than two servers), specific servers can be allocated to named groups to restrict or enable operation.

Usage Models

There are a number of usage models for parallel execution with RAC. These include:

  • Standard use of parallel query for large data sets. In this scenario, the degree of parallelism can be defined to utilize all of the available resources across the cluster.
  • Use of restricted parallel query. This scenario restricts the processing to specific nodes in the cluster. Thus nodes can be logically grouped for specific types of operations.
  • Parallel index builds/rebuilds. In cases where large index builds are required, parallelism can be exploited to maximize the use of cluster node resources. Similar to the query scenarios, these operations can be restricted by logically grouping servers.

INIT.ORA Parameters

Relative to parameter control, there are the standard parameters that configure the parallel processes at the server level. The only change for a RAC environment is that each server in the cluster supports a separate group of these processes. However, these parameters can be set globally across all instances or at the specific instance level.

The two general parallel parameters are outlined in Table 1.

Parameter Name Type Description
parallel_max_serversintegerMaximum number of parallel processes per node
parallel_min_servers integerMinimum number of server processes per node

Table 1: General Parallel INIT.ORA Parameters

There is one RAC-specific parameter that is outlined in Table 2.

Parameter NameTypeDescription
instance_groups stringDefines the logical groups for enabling processing of specific servers

Table 2: RAC-Specific Parallel INIT.ORA Parameters

Sample RAC Architecture

For the examples outlined in this document, a two-node RAC architecture running under Red Hat Advanced Server 2.1 will be utilized. Figure 1 contains a diagram of the test environment.

figure 1

V$ Tables for Viewing Parallel Statistics

There are a handful of tables to view parallel operation statistics. The two most useful are outlined in Table 3. The table names below are prefaced with the GV$ identifier. For RAC implementations, there is an additional group of global views that contain one additional column for the instance identifier. Other than this column, the remaining columns are identical to the V$ structures of the same name.

View NameDescription
GV$PQ_SYSSTATAll parallel related statistics for the entire RAC configuration
GV$PQ_SESSTATSession-specific parallel statistics by session ID

Table 3: Parallel Operation V$ Related Tables

Parallel Setup and Baseline

The base INIT.ORA parameters for the tests are defined below. Note that the "*" defines these as global across all of the RAC instances.

*.parallel_max_servers=5
*.parallel_min_servers=2

The listing below shows that the SALES1 and SALES2 instances each initiated two parallel background processes on database startup.

UID      PID     PPID  C STIME TTY      TIME CMD
oracle   28216   1     0 11:07 ?        00:00:00 ora_p000_SALES1
oracle   28218   1     0 11:07 ?        00:00:00 ora_p001_SALES1

oracle     620   1     0 11:09 ?        00:00:00 ora_p000_SALES2
oracle     622   1     0 11:09 ?        00:00:00 ora_p001_SALES2

A query of the GV$PQ_SYSTAT table shows the base status of the parallel processes within the Oracle kernel.

select inst_id,statistic,value
  from gv$pq_sysstat
 where value > 0
order by 1, 2;

INST_ID   STATISTIC                             VALUE
--------  --------------------------------  ---------
       1  Servers Busy                              1
          Servers Idle                              1  
          Servers Highwater                         1
          Server Sessions                           1

       2  Servers Busy                              1
          Servers Idle                              1
          Servers Highwater                         1
          Server Sessions                           1
          

Parallel Query Test Examples

In this section we'll examine the use of parallel query with the two-node RAC architecture outlined earlier in this paper. There will be two types of tests performed:

  1. An unbounded test where the query will run across both RAC nodes
  2. A bounded test where the query will be restricted to a single RAC node.

Test Case #1: Unbounded Test. In the unbounded test, a simple query is run utilizing standard SQL with parallel hints. As with any query, to utilize parallel operations a full table scan must be part of the statement.

select /*+ full(c_stock) parallel(c_stock,6) */ sum(s_quantity) odcnt
from c_stock
/

Based on the parallel hint requesting six parallel workers, three processes are initiated on each of the server nodes.

UID        PID  	   PPID  C STIME TTY     TIME CMD
oracle     13878     1     0 14:17 ?       00:00:03 ora_p000_SALES1
oracle     13880     1     0 14:17 ?       00:00:03 ora_p001_SALES1
oracle     13940     1     1 14:23 ?       00:00:02 ora_p002_SALES1

oracle     18601     1     0 14:20 ?       00:00:01 ora_p000_SALES2
oracle     18603     1     0 14:20 ?       00:00:01 ora_p001_SALES2
oracle     18652     1     3 14:28 ?       00:00:01 ora_p002_SALES2

The statistics from the GV$PQ_SYSTAT show that each of the instances started an additional server.

INST_ID    STATISTIC                           VALUE
---------- ------------------------------ ----------
1          DFO Trees                               5
           Distr Msgs Recv'd                      49
           Distr Msgs Sent                        49
           Local Msgs Recv'd                     110
           Local Msgs Sent                       101
           Queries Initiated                       5
           Server Sessions                         9
           Servers Busy                            1
           Servers Highwater                       3
           Servers Idle                            1
           Servers Shutdown                        1
           Servers Started                         1
           Sessions Active                         1

2          Distr Msgs Recv'd                      12
           Distr Msgs Sent                         6        
           Server Sessions                         6
           Servers Busy                            1
           Servers Highwater                       3
           Servers Idle                            1
           Servers Shutdown                        1
           Servers Started                         1

Test Case #2: Bounded Test. To restrict parallel processing to specific cluster nodes, instance groups are employed to create logical server groupings. Control is via the INIT.ORA with the parameter INSTANCE_GROUPS. INSTANCE_GROUPS is a RAC-related parameter that is specified only in parallel mode. Used in conjunction with the runtime parameter PARALLEL_INSTANCE_GROUP, it allows for the restriction of parallel query operations to a limited number of instances.

For the tests in this section, the INSTANCE_GROUPS identified below will be utilized.

# Init.ora Parameter Setting for Parallel Options
SALES1.INSTANCE_GROUPS='marketing'
SALES2.INSTANCE_GROUPS='finance'

The session below is altered prior to the execution of the query to be assigned to the FINANCE group. Even though the query is initiated on the SALES1 node, based on the INSTANCE_GROUP setting all of the processing will be executed on SALES2.

alter session set parallel_instance_group = 'finance';

select /*+ full(c_stock) parallel(c_stock,6) */ sum(s_quantity) odcnt
from c_stock
/

Note in the process listing that all of the parallel workers requested are indeed run only on the SALES2 node, as there is no CPU time being utilized by the processes on SALES1.

UID        PID     PPID  C STIME TTY   TIME CMD
oracle     29994   1     0 14:13 ?     00:00:00 ora_p000_SALES1
oracle     29996   1     0 14:13 ?     00:00:00 ora_p001_SALES1

oracle     2631    1     0 14:51 ?     00:00:01 ora_p000_SALES2
oracle     2633    1     0 14:51 ?     00:00:01 ora_p001_SALES2
oracle     2676    1     4 14:57 ?     00:00:01 ora_p002_SALES2
oracle     2678    1     3 14:57 ?     00:00:01 ora_p003_SALES2
oracle     2680    1     4 14:57 ?     00:00:01 ora_p004_SALES2

A query of the GV$PQ_SYSTAT table also shows that an additional three servers were started on the second SALES2 instance. Why only three servers and not four? Remember the setting of the INIT.ORA parameter parallel_max_servers. The value of the parameter is five, thus only an additional three are added to the initial two.

INST_ID    STATISTIC                           VALUE
---------- ------------------------------ ----------
1          DFO Trees                               3
           Distr Msgs Recv'd                      74
           Distr Msgs Sent                        74          
           Local Msgs Recv'd                       2
           Local Msgs Sent                         1   
           Queries Initiated                       3
           Server Sessions                         1
           Servers Busy                            1 
           Servers Highwater                       1
           Servers Idle                            1
           Sessions Active                         2

2          Distr Msgs Recv'd                      22   
           Distr Msgs Sent                        11
           Server Sessions                        11
           Servers Busy                            6
           Servers Highwater                       6
           Servers Started                         3

In the example above, the query was restricted to the SALES2 node by using the FINANCE instance group. The INIT.ORA example below allows for the FINANCE instance group to now run across both the SALES1 and SALES2 nodes. Note that the INIT.ORA parameter INSTANCE_GROUPS must be entered explicitly for each of the groups.

# Init.ora Parameter Setting for Parallel Options
SALES1.instance_groups='marketing'
SALES1.instance_groups='finance'
SALES2.instance_groups='finance' 

Create Table As

Utilizing the Create Table As (CTAS)feature within Oracle can be extremely useful for making copies of table objects. For large tables, the operation can be performed in parallel in the same manner as with the parallel query examples in the prior section. The SQL statement below is an example of the use of CTAS with the parallel option. Instance groups can also be used to restrict processing to specific nodes. Thus, based on the INSTANCE_GROUPS parameter, the execution of the query would be performed only on the SALES1 node.

alter session set parallel_instance_group = 'marketing';

create table c_district_backup parallel (degree 3)
as
select *
from c_district
/

Index Builds

Performing index creates or rebuilds for large tables is another resource-intensive operation where performance can be greatly improved with parallel operations. The index create statement requests a parallel degree of six for the operation. Similar to the previous examples, this operation can also utilize the INSTANCE_GROUPS parameter to restrict the operation to specific nodes.

alter session set parallel_instance_group = 'marketing';

create unique index C_STOCK_I1 on C_STOCK (s_i_id, s_w_id)
tablespace stock_indx
parallel (degree 6)
/ 

Performance Considerations

The downside of parallel operations is the exhaustion of server resources. The easiest server resource to monitor is CPU utilization. If normal CPU utilization were relatively high, deploying a large number of parallel processes would not be advisable. Exceeding the total number of CPUs would cause performance degradation as well.

Data layout is another immediate consideration. If I/O bottlenecks currently exist, use of parallel operations may exacerbate this condition. Ensure that data files for parallel target objects are spread across a reasonable number of disk spindles.

Conclusion

The use of parallel operations within the RAC environment provides for the flexibility to utilize all the server hardware that is part of the cluster architecture. Utilizing instance groups, database administrators can further control the allocation of these resources based on application requirements or service level agreements.


Kevin Conlon is World Wide Practice Manager for Intel Solution Services. He has more than 18 years of IT industry experience spanning application programming, system and database administration, data center architecture, and operations management.



Please rate this document:

Excellent Good Average Below Average Poor


Send us your comments

E-mail this page
Printer View Printer View
Oracle Is The Information Company About Oracle | Oracle RSS Feeds | Careers | Contact Us | Site Maps | Legal Notices | Terms of Use | Privacy