Type Your Title Here

Welcome to the Parallel Query Dimension of Oracle8

Paper #200
Ian Abramson
Ian Abramson Systems Inc
Toronto, ON, Canada

Parallelisms, symmetry, equivalence—these words and concepts all have something in common. They demonstrate how the total exceeds the sum of its parts. Goalies may be able to stop every shot aimed at them, but without the rest of the team, a goalie is powerless to win without a great deal of luck. In the world of computers, this concept is applicable in terms of parallelism. Parallel processing allows great computer with a whole team of CPUs, to work together to create an even greater computational power. Parallel processing has existed for many years, but with Oracle8 the power of multiple processors and servers takes advantage of a computer topology that can further empower the database to increase application and server productivity.

The question is: What do I need to be able to take advantage of these parallelism features incorporated within Oracle8? First you need a computer, but will any computer do? Possibly, as long as the computer can support multiple CPUs and you have bought, configured and installed these CPUs, you have the opportunity to enter into the parallel query world. In today’s world this means that you would require a machine and operating system that supports two or more processors, and an operating system that can address and share processing amongst these processors. These machines share memory and disks, but spread their processing power amongst the numerous CPUs. The reason that parallel process provides us with so much of a performance boost is that we often run into processing bottlenecks, not I/O bottlenecks. By implementing solutions that use the Parallel Query feature, you will start to experience new problems. You may then find that I/O is a bottleneck, since the CPUs will now be hungry for more information at a rate that the system had not experienced during serial operation execution. In this presentation we will discuss how best to use the parallel query feature, how best to implement it, and what to do to ensure that you are using this feature at an optimal level.

Parallel Query Processing

The processing of information using multiple processors requires the Oracle8 Parallel Query feature. Parallel Query allows SQL statements to utilize share processing across these processors simultaneously. When processing SQL on a machine that has only a single processor, all SQL will execute within a single process. Parallel Query allows statements to be divided and utilize multiple processes, resulting in quicker completion of the statement.

This discussion is limited to the Parallel Query and it does not include the use and configuration of the Parallel Server Option. The Parallel Server Option relates to the linking of multiple database servers together, whereas the Parallel Query option can be implemented on a single server that contains multiple processors.

Parallel Query provides significant performance gains for databases that contain large amounts of data. The types of systems that gain the most from parallel operations include:

SMP (symmetric multiprocessor), MPP (massively parallel processor machines) and clusters
High I/O capacity machines (many disks and multiple disk controllers)
Underutilized CPUs (less the 25%)
Availability of memory capacity

If you have a machine that has all of these characteristics, then you are a candidate for implementing parallel processing within your Oracle database. The types of databases that provide the greatest benefit from parallelism are:

Very large databases (VLDBs)
Data warehouses
Financial applications

Generally, the parallel query feature is most useful when the queries require a great deal of time to complete, and when a large number of rows are processed. These database applications share a number of common characteristics. The most common characteristic is size. These applications are information intensive. The amount of storage required is very large, to maximize the retrieval of the information the parallel query feature spreads the SQL among the active CPUs. This load sharing is performed by efficiently splitting up a request over the many processors running on the system. As noted previously, by working as a team the goal of completing an operation can be reached sooner and require a smaller amount of effort.

At execution time Oracle, along with the server’s multiple processors, work together to distribute the database operation statement. Interestingly enough, the splitting of work by the parallel query engine is dynamic; if there is any changes to the server’s configuration, it will be adapted to by Oracle at the time of the statement request.

It is important to note that if your system is already heavily loaded that the gains expected by parallel query will be reduced. Ensure that your server has the available cycles for implementing a large information intensive database. Also verify that you have optimized your current CPU usage, disks and disk controllers before embarking on a parallelism initiative.

The SQL statements and database functions that benefit from parallelization with parallel query are the following:

select statements
not in statements
group by statements
create table ... as select ... statements
create index
nested subqueries in insert, update, and delete statements
update statements
delete statements
rebuild index
parallel data loading with SQL*Loader
parallel replication
alter table move partition
alter table split partition
star transformation

The above statements will take advantage of parallel query through the proper use of these statements. Together with the configuration of the database, it will all provide results in a more timely manner.

Oracle will parallelize operations in the following ways:

When DML and DDL parallelize by block ranges.
When operations access partitioned tables and indexes.
Parallel execution by parallel query server processes.

The database parallelizes the SQL statement at execution time. At this time it divides the table or index into ranges of database blocks and then executes on these ranges in parallel. This processing on the ranges are performed by ROWID and each statement will access the information in parallel for a high and low range of ROWIDs. In the case of partitioned data, the information is not accessed by this range, but by the defined partitions. The information is queried by a set of ROWIDs from within each partition and no scan can overlap two partitions.

Partitions are excellent candidates for parallel execution as these data sets can be easily divided into smaller working groups to allow for efficient information interrogation. Basically each partition becomes a candidate for assignment to an individual parallel query server process. In some cases, the number of parallel processes may be less than the number of partitions; this is due to system limits or table attributes. This is not a problem since a parallel query server process can access multiple partitions. If you access only a single partition during the statement execution, the optimizer will understand not to perform the statement in parallel and will perform the statement serially. Inserts are parallelized during execution, as they will be divided among the parallel query server processes.

Look and Feel of Parallel Query
The parallel query feature has a number of different server processes that manage and execute the processing of SQL statements. The first is the query coordinator (P000 process). This process decides how to distribute the SQL statement among one or many parallel query server processes. The other processes can be identified, as they will appear as Pxxx in the process list. The values of 000 will be the numerical identifier of the parallel query server process, starting with 1 and continuing to the maximum number set by the database configuration. Thus, a configuration with five parallel query server processes will show processes P000, P001, P002, P003, and P004. Configuration and installation of the parallel query feature requires no intervention, since it is part of the base product. By setting the appropriate parameters in your initialization parameter file, you can utilize the parallel features of Oracle8.

The Initialization Parameter File Parallel Style
The configuration of parallel query is a balance of optimizing SQL and configuring the database so that it maximizes the effectiveness of its parallel query server processes. In order to optimize the database configuration, we will investigate some initialization parameter file entries, and suggest how best to set these values to optimize parallel SQL execution.

When the database starts, and the appropriate entries are set in the parameter file, Oracle will create a number of parallel query server processes that may be addressed by the query coordinator. These parallel processes become available for use in order to perform parallel operations. These processes, once assigned to an operation will be retained by the operation until its completion. Once completed, the operation will release the parallel query server process to be available to the next operation. In order to maximize these processes we must look at preparing the database. Let’s look at these parameters.

PARALLEL_MIN_SERVERS
This parameter specifies the minimum number of parallel query server processes that will be initiated by the database at the time of instance startup. To optimize the parallel query server processes for normal database operations, the DBA should consider setting the number of PARALLEL_MIN_SERVERS to the formula shown in the next listing.

PARALLEL_MIN_SERVERS = the likely number of simultaneous parallel operations

By reviewing the information contained in the V$PQ_SYSSTAT data dictionary view, you can identify if the value you have set is too low or too high. The data you are interested in sits in the STATISTIC column with values shown in the next listing.

STATISTIC VALUE
------------------------------ ----------
Servers Busy 0
Servers Idle 0

You are looking for these values to indicate if we have over-committed or under-committed our parallel query server processes.

PARALLEL_MAX_SERVERS
This parameter specifies the maximum number of parallel query server processes that will be spawned when required. At times, when the volume of concurrent operations exceeds the number of current parallel query processes currently running, the query coordinator will start other parallel query server processes up to the number specified in this parameter.

To optimize the parallel query server processes for normal database operations, you should consider setting the number of PARALLEL_MAX_SERVERS to the formula shown in the next listing. The formula is expanded to show the value for a two CPU machine with 30 concurrent users.

PARALLEL_MAX_SERVERS = 2 * # of CPUs * # of concurrent users
= 2 * 4 * 30 = 240

When all parallel query server processes are in use and the maximum number of processes have been reached. The parallel query coordinator will react to the request for processes in excess of PARALLEL_MAX_SERVERS be either switching to serial processing or return an error if involved in a replication role.

PARALLEL_SERVER_IDLE_TIME
When the parallel query coordinator starts processes beyond the initial number started based on the PARALLEL_MIN_SERVERS parameter, the stopping of these additional processes will be based on this parameter. This period is defined as the time from which the number of query processes exceed that set by the minimum number of parallel query server processes and have not been utilized for a period in time by any current operations.

OPTIMIZER_PERCENT_PARALLEL
This parameter determines how aggressively the cost-based optimizer (CBO) will try to parallelize an operation. By default the value is set to 0, so the optimizer will not consider parallelization when determining the best execution plan. You will need to decide how aggressive you want the optimizer to be when it comes to determining the best balance between the execution and parallelization of an operation. The higher that this value is set, up to a maximum of 100, the harder CBO will work to optimize parallel execution. This will determine a plan to minimize execution time based on parallel processing. The optimal setting for this value is shown in the next listing.

OPTIMIZER_PERCENT_PARALLEL = 100/number of concurrent users

When determining if parallel execution is being performed, ensure that the value is set to 100, this will force the operation into a parallel plan unless a serial plan is faster. Remember that the lower the value is set, the optimizer will favor indexes, when the value is set higher the optimizer will favor full table scans.

PARALLEL_AUTOMATIC_TUNING
You can set this parameter to tru. Oracle will then determine the default values for parameters that control parallel execution. In addition to setting this parameter, you must specify the PARALLEL clause for the target tables in the system. Oracle then tunes all subsequent parallel operations automatically.

SHARED_POOL_SIZE
The shared pool size must be reviewed. The parallel processors to send messages back and forth to each other use the shared pool. The reason for the increase in the pool is because parallel operations require execution plans that require twice as much space as serial plans. To optimize your shared pool we recommend that you consider the following formula when determining your shared pool size; our example uses a buffer size of 2k on a machine with a current shared pool entry of 20000000, PARALLEL_MAX_SERVERS of 16, and 8 CPUs.

SHARED_POOL_SIZE =
current value + ((3 * message buffer size ) * (CPUs + 2) * PARALLEL_MAX_SERVERS)
= 20000000 + ((3 * 2048) * (8 + 2) * 16)
20000000 + 6144 * 10 * 16 = 20000000 + 983040 = 20983040 ALWAYS_ANTI_JOIN
This parameter is important when the SQL operation uses the not in operator. By setting this parameter you can tell your database to use parallel functionality when performing anti-joins. By default not in is evaluated as a (sequential) correlated subquery, when the parameter is set to NESTED_LOOPS. Instead Oracle will perform a hash-join that will execute in parallel. To tell the database to do parallel hashing set the parameter as follows:

ALWAYS_ANTI_JOIN = hash

The next SQL statement is an example of the familiar not in construct that the CBO with Oracle8 will react to with the ALWAYS_ANTI_JOIN entry set to HASH. Note the HASH_AJ hint.

select *
from individual
where lastname like 'KERZNER%'
and office_id is not null
and office_id not in
(select /*+ hash_aj */ office
from national_office
where office_id is not null
and role = 'SENATORS');

ROW_LOCKING
This parameter tells the database whether to acquire row level locks during an update operation. The parameter should be set as follows.

ROW_LOCKING = ALWAYS

By selecting ALWAYS or DEFAULT (the are the same), you tell then database to only to get row locks when the table is being updated. If your database is set to INTENT, then locks are acquired when you perform a select for update. This may appear to fine, but by setting the parameter to INTENT, all insert, updates and deletes will performed serially.

COMPATIBLE
This parameter tells the database to use the latest features available within Oracle8. We mention here to remind you to set the value to get the all the parallel features available. The parameter should be set as shown in the next listing.

COMPATIBLE = 8.0.0

Parallel Execution
The execution of all SQL statements goes through the same process, when the database has been configured with the parallel query.

The statement is parsed.
If the statement is a candidate for parallelization, the optimizer determines the execution plan of the statement.
The query coordinator determines the method the parallelization of the operation.

What determines the amount of parallelization is the existence of at least one full table scan or index scan in the execution plan. Oracle8 has now allowed us to utilize parallelism during operations that perform index range scans of a partitioned index and full index scans. To find out if your SQL is a candidate for parallelism, you may evaluate the SQL statement with explain plan.

CBO many times selects a full table scan for many SQL statements; full table scans are ideal candidates as they can greatly benefit from parallelization. Even when indexes are used, one can improve the performance by telling the coordinator to use parallel index processing. To tell the query coordinator to initiate parallel processing, you use hints as discussed in the next few sections.

PARALLEL Hint
The parallel hint (make sure you spell it right!) allows you to tell the query coordinator how many concurrent parallel query server processes should be used for the parallel operation. The four main operations, select, insert, update, and delete, can all take advantage of this hint. Remember that you must identify the table to be parallelized at a minimum. If you do not identify the degree of parallelism or the server instances, the optimizer will set them to the default in the database. If any values exceed those set in the database, the hint will be ignored. The next listing is an example of how this hint is used.

select /*+ parallel (product,4) */ date_range, product, unit_price
from product;

PARALLEL_INDEX Hint
To access information in a table via an index range scans of a partitioned index, the parallel_index hint should be used. This hint tells the query coordinator to spread the index scan across multiple parallel query server processes. The next listing show the format of this hint (for a table being read), where DOP stands for degree of parallelism.

/*+ parallel_index (tablename, DOP, parallel server instance split) */
select /*+ parallel_index (product,4) */ date_range, product, unit_price
from product;

The format of the parallel_index hint is (for an index being scanned) is shown next.

/*+ parallel_index (indexname, DOP, parallel server instance split) */
select /*+ parallel_index (date_range_index,4) */ date_range,
product, unit_price
from product;

NOPARALLEL Hint
When you do not want to use the parallel processing of a select statement, the hint noparallel is available. This hint disables any default parallel processing the query coordinator may attempt to initiate. This hint is the same as saying: /*+ parallel (tablename, 1,1) */. The format of this hint is shown in the next listing.

/*+ noparallel (tablename) */

select /*+ noparallel */ date_range, product, unit_price
from product;

Parallel SQL statements
The performance of information manipulation and retrieval has always been and will continue to be an issue to everyone who has had their computer tied up doing the endless query. You must remember the update that started Monday and then ended when the computer crashed three days later. Oracle7 and Oracle8 have provided us with new strategies to improve the performance of our SQL statements. Remember that SQL that is poorly formatted and configured will still run poorly, but if the code is written efficiently then parallelism will improve performance.

Manual parallel processing is a derivative of the parallel query feature. Looking back into our years of experience, well maybe it was last week at a major bank’s data warehouse project, parallelism helped achieve a goal. Start of "dream" sequence. We created six separate processes that addressed a key value range in a 10,000,000-row table, and started these six wonderfully written programs. This program then allowed the Oracle Server to distribute the processing among the multiple parallel query processes on the server. The moral of this dream—anyone can perform manual parallel processing. Was this the best solution? By spreading out the processes among the many processors, we were able to achieve the overall performance required for the task. The question then becomes what was the trade-off versus serial processing of the same task.

The greatest advantage to us was that we could get the transactions per second required to complete this task in our lifetime. Even with parallel hints, the programs would still not provide the performance required.

For every battle won there is a cost; based on the perceived and estimated cost, decisions need to be made. Let’s look at what influence decisions.

Will the separation of tasks provide significant performance increases?
Do we have enough processes to manage the multiple tasks that will be started?
If one process fails does this affect the other processes? Every process will need the ability to be restarted, and no other process will need to depend on another parallel process
Can we separate the information with a reasonable amount of work?
Is this requirement a candidate for manual parallelism?
If we look at all of the above considerations, what makes an operation a better candidate for parallel execution? Let’s weigh a few items that help you decide on parallel execution.
Performance increase is always a factor in determining the way to approach the problem. What performance increase can you expect by enabling parallelism? This question is one that we are often asked and a difficult one to define. If every increase of 2% is considered a major benefit to your organization, then we must do everything possible to improve performance. If your organization requires a 25% improvement, then the choices are not as clear. In this case parallelism this approach may satisfy both schools of thought and should be considered for all performance reviews.
Available processes are a factor of system capacity. If you have the required number of cycles, more parallel query server processes can be started and a better use of system capacity can be achieved. If you do not have the available capacity, but want to pursue this approach, then a hardware augmentation will need to be performed before you embark on further parallelism.
Interdependence issues try to find answers to questions such as "Is it possible to perform a part of the task in isolation without concern for other operations?" as well as "If tasks can be performed without regard for other operations then it would appear to be a better candidate for parallelism".

This leads to the question of what work is required to separate the information to a level where it can be easily manipulated. Oracle8 has almost removed this question from consideration. If you have been diligent and partitioned your data already, this becomes a non-issue. People out there who want to take advantage of partitions and the parallel nature of the information, ask themselves if the change worth the cost. The best answer we can give to you is that using the features that are now available in your database toolkit must always be considered. So should we implement this feature? If your machine supports parallelism, then this option must be considered.

What Is the Degree of Parallelism?
When the database parallelizes an SQL statement, it parallelizes this statement over a number of parallel query server processes. The number of parallel query server processes used by an operation is known as the degree of parallelism. Without the degree of parallelism, the operation will be performed in serial. If you had wanted that, you would not have purchased 12 new processors, more memory and told everyone it was going to help. By understanding the definition of the degree of parallelism, you can now empower everyone in your organization to use all that new hardware and the new and improved database schema when issuing SQL statements.

The degree of parallelism is defined at the statement level. This is done through the use of imbedded hints within the statement or by default within the database. At the time of table or index creation you can specify a default degree of parallelism for the object. By default this may either be the number of disks or CPUs.

There are two types of parallel operations that we must define at this point as they will affect the degree of parallelism that any operation can perform—intra-operations and inter-operations. These two operations can be performed at the same time during statement execution.

The Degree of Parallelism within Operations
The balance between too much parallelism and too little is always a concern to us. Luckily we have a database that is smarter than your average bear. The Oracle query coordinator is that bear. The coordinator determines the degree of parallelism at runtime, based on number of factors. First, the coordinator looks to see if they are dealing with a smarter than normal programmer who has added hints to their SQL statement. In the case that this has not been done, it will check to see if your DBA was awake when the table or index was created. The coordinator will then look for the object’s default degree of parallelism. Once determined, this degree of parallelism will be used by the operation. If you create a table remember to follow the following format, with the parallel-specific section bolded.

SQL> create table individual (
2 firstname varchar2(20) not null,
3 lastname varchar2(30) not null,
4 birthdate date not null,
5 gender varchar2(1) not null,
6 initial varchar2(1),
7 favorite_beatle varchar2(6))
8 parallel (degree 4);
Table created.
SQL>

The coordinator will only request the number of parallel query server processes defined by the degree of parallelism. The physical number of processes that the coordinator will be able to get will depend upon how many are available at the time of operation execution.

So now we know how and where the database gets its degree of parallelism, but at some point we will need to tell what degree of parallelism to use. Therefore it is necessary to understand how best to tell Oracle what the degree of parallelism is defined for a table, index or hint. By following these rules and selecting the ones that relate to your specific operation, you can define your degree of parallelism, just like the pros.

Number of CPUs
Number of parallel server operations
Number of disks used by a table or index
Number of partitions that will be accessed

Suppose you have 8 CPUs, but your information is stored on 12 disks, the limiting factor here is CPUs. The degree of parallelism that we should use would be 8.

Although you may request eight parallel query server processes to process your statement, you may is not receive this many. Due to limits placed on the database, there may not be a sufficient number of parallel query servers available. If you have defined 25 PARALLEL_MAX_SERVERS, then you can only have 25 parallel processes running. If you exceed this number Oracle will not spawn additional processes, resulting in some statements being executed in serial mode by the query coordinator. In the case when you have specified the PARALLEL_MIN_PERCENT and the minimum fraction of parallelism is not available, you will receive an error during execution.

By defining the degree of parallelism for an SQL operation you can profoundly affect the processing of your information. Specify it too high and you may not get processes that you require when you want them. If the value is too low, you may find that your parallel query server processes are not being maximized and result in lost performance.

When considering your degree of parallelism, you should by default define it such that it optimizes a majority of operations to maximize the utilization of your I/O and CPU. In some cases this may not be the best approach, and you may want to override the default degree of parallelism. For example if your operation requires large amounts of I/O you may want to increase the degree of parallelism for the operation. If your operation requires a large amount of memory, then you may want to decrease the degree of parallelism. To override the default degree of parallelism, you may decide to incorporate hints into your SQL.

Now that we can define our degree of parallelism, we need to be able to use our knowledge, but where do we use these facts. By using parallelism in your SQL statements, you can increase the performance of many SQL statements. Let’s move on to looking at some familiar SQL statements, and how they are parallelized.

Summary
To summarize parallelism, its features, how to leverage its power, and using it in real-life situations, inspect the following points as a summary of what we have covered in this presentation.

Ensure that your hardware will support parallel query feature. Single CPU machines will have performance degraded by using parallel features.
The best database candidates for parallel usage are: data warehouses, VLDB and large data intensive applications.
Set-up Parallel Query feature, by setting appropriate values in the initialization parameter file.
Set the initialization parameter file entry PARALLEL_MIN_SERVERS = the likely number of simultaneous parallel operations
Set the initialization parameter file entry PARALLEL_MAX_SERVERS = 2 * # of CPUs * # of concurrent users
Set the initialization parameter file entry OPTIMIZER_PERCENT_PARALLEL = 100/number of concurrent users, this is dependent upon the weight you want to give to parallel processing in the cost-based optimizer.
Generally, you will be required to increase the parameter SHARED_POOL_SIZE when using parallelism.
Use hints in your SQL operations to tell Oracle to parallelize your operation.
Parallel hints may access data and indexes. When data is accessed by a table scan use parallel hint. When an index scan is used, the PARALLEL_INDEX hint may be used.
Don’t exceed the maximum number of parallel query server processes available on your machine, or your operation will not execute in parallel.
The number of parallel query server processes used by an operation is known as the degree of parallelism.
Tables can be created with a default degree of parallelism.
Hints will take precedence over the default degree of parallelism that has been defined for the table.
When creating a table or inserting records based on data in another table use parallel hints in both the insert and select sections of the SQL command.
Create and rebuild indexes using parallel hints to reduce the time to perform this lengthy task.

For a more complete discussion of parallelism, please read Oracle8 Tuning by Abbey, Corey, Dechichio and Abramson, printed by the Oracle Press 1997.