Creating Team Speed
By Sanjay Mishra
Use parallel execution to field a team for big processes.
When the primary mode of transport was horse-drawn carts, one horse was enough to draw the cart and a small load of cargo. When you needed to transport a bigger load, you used carts that could be pulled by two or more horses.
And when you have a huge database task to perform (a large query, for example), what do you do? If you can provide enough resources (CPU and memory) to Oracle Database, it can complete your task faster by employing more "horses," in the form of parallel execution.
What Is Parallel Execution?
When using parallel execution, Oracle Database breaks a larger task down into multiple smaller subtasks and executes them in parallel, thereby reducing the total elapsed time, compared to serial execution.
Parallel execution includes parallel query ( SELECT statements), parallel DDL ( CREATE TABLE, CREATE INDEX , and so on), parallel DML ( INSERT, UPDATE, DELETE, MERGE ), and parallel data loading capabilities.
Behind the Scenes
Before going into the details of how parallel execution works, let's first recap the serial execution of SQL statements. When a user executes a SQL statement from a client (which may be a SQL*Plus session or an application written in C, C++, Java, Perl, or some other language), Oracle Database starts a server process. The server process is responsible for executing the SQL statement and returning the results to the client (called the user process). Figure 1 illustrates this simple scenario for a query such as the following:
SELECT COUNT(*) FROM ORDERS;
The scenario changes for parallel execution. The server process has a different, higher-level role during parallel execution—that of a parallel execution coordinator. The server process doesn't do much of the actual work anymore; rather, it coordinates the work done by a set of other processes known as parallel execution servers. Figure 2 illustrates this scenario for a query such as the following:
SELECT /*+ PARALLEL (ORDERS, 4) */ COUNT(*) FROM ORDERS;
The task of executing the SELECT statement is split up into multiple smaller subtasks. The server process employs two or more parallel execution server processes to perform the subtasks during parallel execution, and each subtask is assigned to a parallel execution server process.
When to Use Parallel Execution
Parallel execution splits one larger task into multiple smaller subtasks and then executes them simultaneously and with separate parallel execution server processes. Therefore, to successfully implement parallel execution, you need CPU and memory resources and sufficient IO bandwidth to support the additional processes. Almost all modern-day database servers are multi-CPU boxes, and parallel execution can effectively utilize the power of multiple CPUs to yield quicker results from the database. But is parallel execution equally applicable to all types of applications?
Splitting a task into multiple smaller subtasks; spawning multiple parallel execution server processes; assigning each subtask to a parallel execution server process; coordinating all these processes; and, finally, combining results from the processes introduces some processing overhead to the overall execution of a task. Therefore, for parallel execution to be an advantage, the execution time of a task must justify this overhead. If a task takes only a few seconds with serial execution, it may not be beneficial to use parallel execution. On the other hand, if a task takes a significant amount of time, it makes sense to investigate whether parallel execution can help.
Most SQL statements in online transaction processing (OLTP) applications, for example, take only a few seconds with serial execution, so it may not be beneficial to use parallel execution for many OLTP application tasks.
Almost all OLTP systems, however, have some batch jobs, usually scheduled to run one at a time, that are executed during off-peak hours (usually nightly). Examples of such batch jobs are data extraction processes to transfer data to data warehouses, index or partition maintenance jobs, data archiving jobs, and optimizer statistics collection. Parallel execution can greatly benefit such resource-intensive batch jobs in the OLTP applications. Because you can (usually) dedicate all your CPU and memory resources to one job, you can afford to employ multiple parallel execution server processes to complete the job more quickly.
The typical workload of data warehousing applications—such as loading large volumes of data, generating summary and aggregate information, reporting, and querying—involves large quantities of data and very few concurrent users. Parallel execution can also fit very well in such applications and help improve database performance, by exploiting the multiple CPUs on the database server.
Before employing parallel execution, you should ascertain free CPU and memory resources on the database server during normal and peak load. All operating systems provide utilities for ascertaining CPU and memory utilization (for example, the top utility on most UNIX and Linux platforms and the Task Manager and Performance Monitor on the Microsoft Windows platform). Using these utilities, you can find out whether and when you have free resources that you can utilize for parallel execution.
If the CPU is heavily loaded, parallel execution is not recommended; if you try to use parallel execution when you don't have enough CPU resources, your task could end up taking longer than if you employed serial execution, because the multiple parallel execution server processes will compete for the scarce CPU resources and will wait in the OS scheduler queue.
The following types of operations can be parallelized in an Oracle database:
Parallel query. Parallelizing SELECT statements is the most common usage of parallel execution. SELECT statements that perform a full table scan, a full index scan, a partition range scan involving multiple partitions, or an index range scan involving multiple index partitions can be parallelized. You can influence the parallel execution of a SELECT statement at the individual statement level or at the table or index level.
To invoke parallel execution for a particular query, you can specify a PARALLEL hint:
SELECT /*+ PARALLEL (ORDERS, 4) */ COUNT(*) FROM ORDERS;
Alternatively, you can include a parallel declaration for the corresponding table, using the ALTER TABLE statement:
ALTER TABLE ORDERS PARALLEL (DEGREE 4);
After you define a table with a parallel attribute, any SELECT statement on that table involving a full table scan can be executed in parallel. The following now invokes parallel execution for the ORDERS table, even though there is no PARALLEL hint:
SELECT COUNT(*) FROM ORDERS;
Parallel query is enabled by default in each session; you don't have to do anything special to enable it. If the execution plan involves a full table or index scan or a partition-range scan and you have a PARALLEL hint or a parallel table or index declaration, Oracle Database will invoke parallel execution for your query. However, if you want to disable parallel query in a particular session, you can do so by using the following statement:
ALTER SESSION DISABLE PARALLEL QUERY;
Instead of disabling parallel query for the whole session, if you just want to access a table in serial in a query (when the table has a parallel declaration), you can use the NO_PARALLEL hint:
SELECT /*+ NO_PARALLEL (ORDERS) */ COUNT(*) FROM ORDERS;
Parallel DML. You can also parallelize INSERT, UPDATE, MERGE , and DELETE statements. This strategy, called parallel DML, is very useful in applications involving data load or refresh from staging tables, populating summary tables, and executing large batch jobs. INSERT INTO ... VALUES (...) statements insert a single row of data into a table; therefore, the opportunity for parallel execution doesn't arise. However, INSERT INTO ... SELECT ... FROM statements can take advantage of parallel execution.
Parallel DML is not enabled by default in a session; you need to enable it explicitly by executing the following command:
ALTER SESSION ENABLE PARALLEL DML;
Once you enable parallel DML for a session, you can either use a PARALLEL hint or make use of the parallel declaration of a table (or index) to invoke parallel DML. For example:
INSERT /*+ PARALLEL (TEMP_ORDERS, 4) */ INTO TEMP_ ORDERS SELECT * FROM ORDERS;
In the above example, the SELECT is performed in serial execution but the INSERT is performed in parallel execution. You can parallelize the SELECT as well. The following example illustrates how you can parallelize the SELECT and the INSERT at the same time:
INSERT /*+ PARALLEL (TEMP_ORDERS, 4) */ INTO TEMP_ORDERS SELECT /*+ PARALLEL (ORDERS, 4) */ * FROM ORDERS;
Note that while using parallel DML, you need to be aware of some limitations imposed by a parallel DML operation. Once a table has been modified by a parallel DML statement, that table cannot be accessed in a subsequent SQL statement (serial or parallel, DML or query) in the same transaction before the transaction is committed. Parallel DML operations are not permitted on tables with triggers, self-referential integrity constraints, DELETE CASCADE integrity constraints, or deferred integrity constraints, or on clustered tables. In Oracle Database 10g Release 1 and earlier releases, DML operations for a segment (a nonpartitioned table or a single partition of a partitioned table) cannot be parallelized when bitmapped indexes are present.
Parallel DDL. Decision support applications usually involve creation of summary tables (with aggregated or rolled-up data) and temporary tables. Parallel DDL can be very useful in such applications. You can create and rebuild indexes with parallel DDL as well. CREATE TABLE ... AS SELECT, CREATE INDEX ... , and ALTER INDEX ... REBUILD are some of the most common uses of parallel DDL. If you are working on a partitioned table, you can also invoke parallel execution to MOVE, SPLIT, COALESCE , or MERGE partitions. You can use parallel DDL to REBUILD or SPLIT a partitioned index. To invoke parallel DDL, you need to use the PARALLEL clause with these statements. The following example illustrates how to use parallel execution to split partition P1 of a range-partitioned table ORDERS_RP :
ALTER TABLE ORDERS_RP SPLIT PARTITION P1 AT (1000) PARALLEL (DEGREE 4);
Parallel statistics collection. Collecting optimizer statistics can take a significant amount of time, especially for a large table with several indexes. The DBMS_STATS package, however, provides the ability to parallelize statistics collection. The stored procedures in the DBMS_STATS package provide a DEGREE argument through which you can specify the degree of parallelism. The following example illustrates how to use a degree of parallelism while gathering optimizer statistics for a table:
EXECUTE DBMS_STATS .GATHER_TABLE_STATS( OWNNAME=>USER, TABNAME=>'TEMP_ORDER', DEGREE=>4);
Parallel refresh of materialized views. Refreshing a materialized view involves executing the underlying query. Using parallel execution can refresh the materialized view faster than a serial refresh. If you create a materialized view with a PARALLEL hint, you can refresh it by using parallel execution. The following example shows how you create the materialized view with the PARALLEL hint and then refresh it by using the DBMS_MVIEW.REFRESH procedure:
CREATE MATERIALIZED VIEW MV_ORDER_SUMMARY AS SELECT /*+ PARALLEL (ORDERS, 4) */ STATUS, COUNT(*) FROM ORDERS GROUP BY STATUS; EXECUTE DBMS_MVIEW.REFRESH(' MV_ORDER_SUMMARY');
Degree of Parallelism
The number of parallel execution servers employed for an operation is called its degree of parallelism, which can be specified at the statement level (with the PARALLEL or PARALLEL_INDEX hints) or at the table-definition level (with the PARALLEL clause of the corresponding CREATE or ALTER statement).
The following example specifies a degree of parallelism of 4 for the SELECT statement:
SELECT /*+ PARALLEL (ORDERS_RP, 4) */ COUNT(*) FROM ORDERS_RP;
Through the hint, the above example requests a degree of parallelism of 4. To verify that four parallel execution servers actually are being used for this task, you can query the V$PX_SESSION dynamic performance view while the above statement is being executed. Listing 1 shows a query against V$PX_SESSION and the output.
Listing 1 shows that the session from which you are running the parallel query ( SID=15, SERIAL#=240 ) is the query coordinator and that four parallel execution server processes have been enlisted to perform the task. The query coordinator for each of them—the session originating the query—is the same ( QCSID=15, QCSERIAL#=240 ). The degree of parallelism is output in the DEGREE column (actual degree of parallelism). If your database server runs on a UNIX or Linux platform, you can actually see ora_pxxx_<SID> processes using the ps -ef | grep ora_command (where xxx is an integer representing the parallel execution server number).
Code Listing 1: Looking at the V$PX_SESSION view
SELECT QCSID, QCSERIAL#, SID, SERIAL#, SERVER#, DEGREE FROM V$PX_SESSION; QCSID QCSERIAL# SID SERIAL# SERVER# DEGREE ------- ------------ ---- -------- --------- -------- 15 240 20 43 1 4 15 240 16 191 2 4 15 240 19 388 3 4 15 240 22 92 4 4 15 15 240
Each of the statements we have seen so far in this article consists of a single parallelizable operation (table scan, index scan, or partition scan). However, many statements consist of more than one operation. In these cases, two operations of the SQL statement can be executed simultaneously by use of two sets of parallel execution servers. For example, if a statement consists of a full table scan and a GROUP BY clause, the full table scan is performed by use of one set of parallel execution servers, and the GROUP BY is performed simultaneously by another set of parallel execution servers, as in the following:
SELECT /*+ PARALLEL (ORDERS_RP, 4) */ STATUS, COUNT(*) FROM ORDERS_RP GROUP BY STATUS;
The table scan operation is called the producer operation, and the GROUP BY operation is called the consumer operation. As soon as the producer operation starts generating rows, the consumer operation gets started with the grouping operation. Concurrent execution of multiple operations from one SQL statement in parallel is referred to as interoperational parallelism.
Figure 3 illustrates the above example: Eight parallel execution server processes are enlisted to perform the task required by the query, even if the degree of parallelism specified in the SQL statement is 4. Each operation executes with a degree of parallelism of 4, in which four parallel execution servers perform the table scan and four others perform the GROUP BY operation.
Some SQL statements may consist of three or more operations, but Oracle allows interoperational parallelism between two operations only. After the first operation is completed, the first set of parallel execution servers moves on to perform the third operation, and so on.
The Pool of Parallel-Execution Servers
When you issue a parallel SQL statement with degree of parallelism of four, the query coordinator recruits four parallel-execution servers to perform the task. However, the four parallel-execution servers are not necessarily started when you issue the query. It takes time to start a parallel-execution server process. Therefore, to avoid the overhead of starting parallel-execution servers as and when you issue a parallel operation, Oracle maintains a pool of parallel-execution server processes. When a parallel-execution operation starts, the parallel-execution servers are enlisted from this pool and remain assigned to the operation until it completes. After that, they return to the common pool.
As you execute more and more parallel operations, additional parallel-execution servers are created and assigned to these tasks, but only up to a certain limit. This limit is controlled by the initialization parameter PARALLEL_MAX_SERVERS , which specifies the maximum number of parallel-execution server processes allowed on that instance. Once the number of parallel-execution servers reaches this limit, no new ones are started.
What happens when you don't have enough parallel-execution servers available? For example, suppose you start a query with a requested degree of parallelism of four, but only two parallel-execution servers are available, and the number of parallel-execution servers has reached PARALLEL_MAX_SERVERS . In this case, the query will execute with a degree of parallelism of two—the two available parallel-execution servers will perform the task. Now, let's assume you execute another query with a requested degree of parallelism of four. There are no available parallel-execution servers, and new ones can't be started, because the count has reached the limit specified by the PARALLEL_MAX_SERVERS initialization parameter. In this case, your query will execute in serial.
You can control the behavior of parallel execution when there are not enough available parallel-execution servers by setting the PARALLEL_MIN_PERCENT initialization parameter for your session through the ALTER SESSION command. The default value for PARALLEL_MIN_PERCENT is zero. This means that when there are not enough available parallel-execution servers, and their count has reached the PARALLEL_MAX_SERVERS limit, your task will execute with either a reduced degree of parallelism or in serial, depending on how many parallel-execution servers are available. To identify how many parallel operations were executed with a reduced degree of parallelism or in serial, query the V$SYSSTAT view, as shown in Listing 2.
Code Listing 2: Looking at the V$SYSSTAT view
SELECT NAME, VALUE FROM V$SYSSTAT WHERE NAME LIKE 'Parallel%'; NAME VALUE --------------------------------------------------- -------- Parallel operations not downgraded 9 Parallel operations downgraded to serial 2 Parallel operations downgraded 75 to 99 pct 0 Parallel operations downgraded 50 to 75 pct 0 Parallel operations downgraded 25 to 50 pct 1 Parallel operations downgraded 1 to 25 pct 1
If you set the PARALLEL_MIN_PERCENT parameter to anything higher than zero, your SQL statement will be executed only when the minimum percentage of parallel-execution servers is available. For example, if you set PARALLEL_MIN_PERCENT to 50, and you execute a statement with a requested degree of parallelism of ten when only four parallel-execution servers are available, your query will fail with an error message instead of executing with a reduced degree of parallelism, as follows:
ALTER SESSION SET PARALLEL_MIN_PERCENT = 50; INSERT /*+ PARALLEL (TEMP_ORDERS, 10) */ INTO TEMP_ORDERS SELECT * FROM ORDERS; INSERT /*+ PARALLEL (TEMP_ORDERS, 10) */ INTO TEMP_ORDERS * ERROR at line 1: ORA-12827: insufficient parallel query slaves available
What happens to the parallel-execution server processes after the execution of the parallel operation is complete? They return to the pool. If a parallel-execution server remains idle too long, the database instance automatically terminates it. While terminating parallel-execution servers, Oracle Database ensures that the pool doesn't go below the minimum limit specified by PARALLEL_MIN_SERVERS.
Monitoring and Tuning Parallel Execution
Oracle Database provides several dynamic performance views to monitor the performance of parallel execution in the database. You can view the parallel-execution servers in the pool through the dynamic performance view V$PX_PROCESS . This view lists the parallel-execution servers in the pool, as shown in Listing 3.
Code Listing 3: Looking at V$PX_PROCESS
SELECT * FROM V$PX_PROCESS; SERV STATUS PID SPID SID SERIAL# ----- --------- ------------- ------------ ------------ ------------ P000 AVAILABLE 9 2835 P001 AVAILABLE 10 2837
The STATUS column V$PX_PROCESS indicates whether the corresponding parallel-execution server is IN USE (serving a current parallel operation) or AVAILABLE (free to be picked up by the next parallel operation). To see a list of parallel-execution sessions in progress, query the view V$PX_SESSION , as shown above.
After executing a task involving interoperational parallelism (for example, the query shown in the pertinent section above), execute the query in Listing 4 (in the same session in which you executed the parallel operation) to identify the producer-consumer relationship among the parallel-execution server processes.
Code Listing 4: Identify the producer-consumer relationship
SELECT DFO_NUMBER, TQ_ID, SERVER_TYPE, PROCESS, NUM_ROWS, BYTES FROM V$PQ_TQSTAT ORDER BY DFO_NUMBER, TQ_ID, SERVER_TYPE; DFO_NUMBER TQ_ID SERVER_TYP PROCESS NUM_ROWS BYTES -------------- ---------- -------------- ------------ ------------- ------------ 1 0 Consumer P003 1 96 1 0 Consumer P002 1 102 1 0 Consumer P001 1 97 1 0 Consumer P000 4 144 1 0 Producer P004 1 96 1 0 Producer P007 4 151 1 0 Producer P006 1 96 1 0 Producer P005 1 96 1 1 Consumer QC 4 130 1 1 Producer P003 1 29 1 1 Producer P000 1 32 1 1 Producer P001 1 32 1 1 Producer P002 1 37
The output in Listing 4 indicates that the parallel-execution servers P004, P005, P006, and P007 produce the rows by scanning the table. The rows produced are then consumed by the parallel-execution server processes P000, P001, P002, and P003 to perform the group by operation. For this second stage of the operation, the processes P000, P001, P002, and P003 become producers and the parallel-execution coordinator becomes the consumer.
Oracle provides several initialization parameters to control how parallel execution works. The most basic of these parameters are PARALLEL_MAX_SERVERS , PARALLEL_MIN_SERVERS , and PARALLEL_MIN_PERCENT . These parameters control the number of parallel-execution server processes in the pool and the number of such processes acquired by an operation when sufficient processes are not available in the pool. The most interesting of all the parallel execution-related initialization parameters is PARALLEL_ADAPTIVE_MULTI_USER , which determines the behavior of parallel execution when parallel operations are performed in a multiuser scenario.
When several concurrent users are executing parallel operations with a high degree of parallelism, it is easy to overload the CPUs in the system with too many parallel-execution server processes. When the parameter PARALLEL_ADAPTIVE_MULTI_USER is set to TRUE (the default), Oracle Database utilizes an adaptive algorithm to estimate the workload on the system, and then, if required, executes the tasks with a lesser degree of parallelism than requested. When this parameter is set to FALSE , each parallel operation acquires the requested number of parallel-execution server processes (provided, of course, there are enough such processes available in the pool).
Parallel execution enables effective utilization of multiple CPUs on the database server by breaking up larger tasks into multiple smaller subtasks and executing the smaller subtasks in parallel. Long-running database tasks that involve large amounts of data can often greatly benefit from parallel execution. Oracle Database enables parallel execution of a wide range of database operations, including queries, DML and DDL operations, gathering statistics, and refreshing materialized views.
Sanjay Mishra (firstname.lastname@example.org) has coauthored five Oracle books, including Oracle Utilities Pocket Reference and the recently published second edition of Mastering Oracle SQL —all published by O'Reilly Media.