Oracle Data Pump was introduced in Oracle Database 10g to enable very high-speed transfer of data and metadata between databases. A salient feature of Data Pump is that it can parallelize the export and import jobs for maximum performance.

However, prior to Oracle Database 11g release 2 (11.2), if a Data Pump job is executed in parallel on an instance of an Real Application Cluster (RAC) database, the parallelism is employed only on that instance without utilizing the potentially valuable idle resources of other instances.

From Oracle Database 11.2 onwards, parallelism is no longer confined to a single instance. Rather worker processes can be distributed across multiple Oracle RAC instances to better utilize Oracle RAC resources. The newly-introduced client parameter CLUSTER can be employed to specify whether you want to use the cluster resources. When CLUSTER=Y (default) is specified, Data Pump attempts to use all active Oracle RAC instances.

To exercise more control over the instances on which the job should run, the SERVICE_NAME parameter can be mentioned where SERVICE_NAME must be a valid known service of the database. In this case, the Data Pump job will run only on those instance where the service is available.

If any Oracle RAC instance where the job that is running dies or leaves the cluster, the job aborts and can be restarted at some future time.

In this article, I will demonstrate:

  • Case-I: Parallelization of an export job across all active Oracle RAC instances
  • Case-II: Parallelization of an export job across a subset of Oracle RAC instances

Current scenario:

  • Name of the cluster: cluster01
  • Number of nodes: 3 (host01, host02, host03)
  • RAC Database version: 11.2.0.3
  • Name of RAC database: orcl
  • Number of instances: 3
  • Names of instances: orcl1, orcl2, orcl3

In order that the Data Pump job is distributed across multiple instances of an RAC database, the following prerequisites should be met:

  • The directory object should point to shared storage that is accessible by any Oracle RAC instances designated to run Data Pump worker processes.
  • The PARALLEL parameter should be more than 1 and less than or equal to the number of files in the dump file set.
  • To specify enough dump files by specifying DUMPFILE parameter, the preferred approach is to use the wildcard option by using a substitution variable (%U) in the filename.
  • LOGFILE parameter should point to a location on the local file system. Log file is created on the node hosting the instance where master process is running.

Demonstration

Case-I: Export job distributed across all active Oracle RAC instances

  • First, let’s create a directory object which points to shared storage and is accessible by all the three instances of the database:
SQL>drop directory dp_shared_dir;

SQL>create directory DP_SHARED_DIR as '+DATA/orcl/';

SQL>grant read, write on directory dp_shared_dir to public;
  • Issue the command to export SH schema in parallel across all active Oracle RAC instances:
  • Degree of parallelism = 6
  • Dumpfile – Create in the newly created directory DP_SHARED_DIR with names expsh%U.dmp beginning with 01 for %U, then using 02, 03, and so on.
  • Logfile – Create in the location on local file system pointed to by the directory object DATA_PUMP_DIR
[oracle@host01 root]$ expdp system/oracle@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile='expsh%U.dmp' reuse_dumpfiles=y

Export: Release 11.2.0.3.0 - Production on Thu Dec 3 12:49:58 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Starting "SYSTEM"."SYS_EXPORT_SCHEMA_06": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 273.8 MB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA

. . exported "SH"."CUSTOMERS" 9.853 MB 55500 rows

. . exported "SH"."COSTS":"COSTS_Q2_1998" 79.52 KB 2397 rows

. . exported "SH"."COSTS":"COSTS_Q2_1999" 132.5 KB

….
  • Verify that the Data Pump job is distributed across 6 worker processes on all the three instances of the RAC database:
SQL> select inst_id, session_type from dba_datapump_sessions;

   INST_ID SESSION_TYPE

---------- --------------

         3 WORKER
         3 WORKER
         2 WORKER
         2 WORKER
         1 DBMS_DATAPUMP
         1 MASTER
         1 WORKER
         1 WORKER
  • While the export is running, let us abort the database instance on host01 to verify that the Data Pump job can be restarted in case any instance where the job is running aborts.
[root@host01 log]# ps -ef |grep pmon
grid 5597 1 0 08:25 ? 00:00:00 asm_pmon_+ASM1
oracle 27612 1 0 09:46 ? 00:00:00 ora_pmon_orcl1
root 30103 8495 0 11:39 pts/1 00:00:00 grep pmon

[root@host01 log]# kill -9 27612

As the instance orcl1 has crashed, the Data Pump job aborts.

……

. . exported "SH"."SALES":"SALES_Q2_1999" 1.754 MB 54233 rows

. . exported "SH"."COSTS":"COSTS_Q1_1999" 183.5 KB 5884 rows

. . exported "SH"."COSTS":"COSTS_Q1_1998" 139.5 KB 4411 rows

. . exported "SH"."SALES":"SALES_Q2_2000" 1.802 MB 55515 rows

UDE-03113: operation generated ORACLE error 3113
ORA-03113: end-of-file on communication channel
Process ID: 20597
Session ID: 51 Serial number: 233

UDE-03114: operation generated ORACLE error 3114
ORA-03114: not connected to ORACLE
  • Let us restart the job by ATTACHing to the job using the name of the job (as identified in expdp command output) as a parameter to the expdp command:
[oracle@host01 root]$ expdp system/oracle@orcl ATTACH=SYS_EXPORT_SCHEMA_06

Export: Release 11.2.0.3.0 - Production on Thu Dec 3 12:53:06 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

Job: SYS_EXPORT_SCHEMA_06
	Owner: SYSTEM
	Operation: EXPORT
	Creator Privs: TRUE
	GUID: 25FA37F1818A5075E053B7C909C0116F
	Start Time: Thursday, 03 December, 2015 12:53:11
	Mode: SCHEMA
	Instance: orcl3
	Max Parallelism: 6
	EXPORT Job Parameters:
	Parameter Name Parameter Value:
	CLIENT_COMMAND system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y 

	State: IDLING
	Bytes Processed: 27,664,048
	Percent Done: 23
	Current Parallelism: 6
	Job Error Count: 0
	Dump File: +DATA/orcl//expsh%u.dmp
	Dump File: +DATA/orcl//expsh01.dmp
		bytes written: 4,096
	Dump File: +DATA/orcl//expsh02.dmp
		bytes written: 27,385,856
	Dump File: +DATA/orcl//expsh03.dmp
		bytes written: 147,456
	Dump File: +DATA/orcl//expsh04.dmp
		bytes written: 192,512
	Dump File: +DATA/orcl//expsh05.dmp
		bytes written: 4,096
	Dump File: +DATA/orcl//expsh06.dmp
		bytes written: 4,096

Worker 1 Status:
	Process Name: DW00
	State: UNDEFINED
	Object Schema: SH
	Object Type: SCHEMA_EXPORT/DEFAULT_ROLE
	Completed Objects: 1
	Total Objects: 1
	Worker Parallelism: 1

Worker 2 Status:
	Process Name: DW00
	State: UNDEFINED
	Object Schema: SH
	Object Name: COSTS
	Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
	Completed Objects: 1
	Total Objects: 66
	Worker Parallelism: 1

Worker 3 Status:
	Process Name: DW00
	State: UNDEFINED

Worker 4 Status:
	Process Name: DW03
	State: UNDEFINED

Worker 5 Status:
	Process Name: DW01
	State: UNDEFINED
	Object Schema: SH
	Object Name: COSTS
	Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
	Completed Objects: 1
	Total Objects: 66
	Worker Parallelism: 1

Worker 6 Status:
	Process Name: DW01
	State: UNDEFINED

Export>
  • We will restart the job with a different degree of parallelism, say 4 (earlier it was 6):
Export> parallel=4

Export> START_JOB

It can be seen that four worker processes have been started on instances 2 and 3 as instance 1 has been aborted.

SQL> select inst_id, session_type from dba_datapump_sessions;

   INST_ID SESSION_TYPE
---------- --------------
         3 DBMS_DATAPUMP
         3 MASTER
         3 WORKER
         3 WORKER
         3 WORKER
         2 WORKER

6 rows selected.

To show progress again, let us issue the CONTINUE_CLIENT command:

Export> continue_client

Job SYS_EXPORT_SCHEMA_06 has been reopened at Thursday, 03 December, 2015 12:53
Restarting "SYSTEM"."SYS_EXPORT_SCHEMA_06": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y
. . exported "SH"."COSTS":"COSTS_Q1_2000" 120.6 KB 3772 rows
. . exported "SH"."COSTS":"COSTS_Q1_2001" 227.8 KB 7328 rows
. . exported "SH"."SALES":"SALES_Q3_2000" 1.909 MB 58950 rows
. . exported "SH"."SALES":"SALES_Q2_2001" 2.051 MB 63292 rows

. . .

. . .

. . exported "SH"."SALES":"SALES_Q4_2002" 0 KB 0 rows
. . exported "SH"."SALES":"SALES_Q4_2003" 0 KB 0 rows
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
. . exported "SH"."SALES":"SALES_Q3_1998" 1.633 MB 50515 rows
Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/BITMAP_INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/BITMAP_INDEX/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/INDEX/DOMAIN_INDEX/INDEX
Processing object type SCHEMA_EXPORT/MATERIALIZED_VIEW
Processing object type SCHEMA_EXPORT/DIMENSION
Master table "SYSTEM"."SYS_EXPORT_SCHEMA_06" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_06 is:
	+DATA/orcl//expsh01.dmp
	+DATA/orcl//expsh02.dmp
	+DATA/orcl//expsh03.dmp
	+DATA/orcl//expsh04.dmp
Job "SYSTEM"."SYS_EXPORT_SCHEMA_06" successfully completed at 12:56:29

Thus, a Data Pump job:

  • Can be parallelized across all the available instances of a RAC database.
  • Restarted in future if any instance, where the job is running, aborts.

Case-II: Parallelization of export job across a subset of Oracle RAC instances

In case we want to run the job against a subset of instances, we can specify the SERVICE_NAME parameter which points to a service of the database.

  • In order to demonstrate this, let us first create and start a service called orcls for the database with preferred instances orcl1, orcl2 and available instance orcl3. It can be observed that the newly-created service orcls is currently available on instances orcl1 and orcl2.
[oracle@host01 root]$ srvctl add service -s orcls -d orcl -r orcl1,orcl2 -a orcl3

[oracle@host01 root]$ srvctl status service -s orcls -d orcl

Service orcls is not running.

[oracle@host01 root]$ srvctl start service -s orcls -d orcl

[oracle@host01 root]$ srvctl status service -s orcls -d orcl

Service orcls is running on instance(s) orcl1,orcl2
  • Let us once again export the SH schema with a parallelism of 6 as earlier, but this time I am going to specify an additional parameter of SERVICE_NAME on the command line with points to service orcls, which is currently running on orcl1 and orcl2.
[oracle@host01 root]$ expdp system/oracle@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile='expsh%U.dmp' reuse_dumpfiles=y service_name=orcls

Export: Release 11.2.0.3.0 - Production on Fri Dec 4 14:48:22 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Starting "SYSTEM"."SYS_EXPORT_SCHEMA_09": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y service_name=orcls
Estimate in progress using BLOCKS method...
Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 273.8 MB
Processing object type SCHEMA_EXPORT/USER
Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
Processing object type SCHEMA_EXPORT/ROLE_GRANT
Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
. . exported "SH"."COSTS":"COSTS_Q1_1998" 139.5 KB 4411 rows
. . exported "SH"."COSTS":"COSTS_Q2_1998" 79.52 KB 2397 rows
. . exported "SH"."COSTS":"COSTS_Q2_1999" 132.5 KB 4179 rows

…..
  • While the job is running, it can be verified that this time parallelism has been constrained across instances orcl1 and orcl2 only where the service is currently available.
SQL> select inst_id, session_type from dba_datapump_sessions;

   INST_ID SESSION_TYPE
---------- --------------
         1 WORKER
         1 WORKER
         2 DBMS_DATAPUMP
         2 MASTER
         2 WORKER
         2 WORKER
         2 WORKER
         2 WORKER

8 rows selected.
  • Abort the RAC database instance on one of the nodes, say host01, where Data Pump export is currently running:
 [root@host01 ~]# ps -ef |grep pmon
grid      5598    1   0 13:07 ?         00:00:00 asm_pmon_+ASM1
oracle    6157    1   0 13:08 ?         00:00:00 ora_pmon_orcl1
root      7178 6735   0 13:14 pts/1     00:00:00 grep pmon

[root@host01 ~]# kill -9 6157
  • As the instance aborts, the export also gets terminated:
. . exported "SH"."COSTS":"COSTS_Q2_1999" 132.5 KB 4179 rows
. . exported "SH"."COSTS":"COSTS_Q2_2000" 119.0 KB 3715 rows
. . exported "SH"."CUSTOMERS" 9.853 MB 55500 rows
. . exported "SH"."COSTS":"COSTS_Q2_2001" 184.5 KB 5882 rows

UDE-03113: operation generated ORACLE error 3113
ORA-03113: end-of-file on communication channel
Process ID: 9617
Session ID: 52 Serial number: 5

UDE-03114: operation generated ORACLE error 3114
ORA-03114: not connected to ORACLE
    • Restart the export by ATTACHing to the job by specifying its name:

 

[oracle@host01 root]$ expdp system/oracle@orcl ATTACH=SYS_EXPORT_SCHEMA_09

Export: Release 11.2.0.3.0 - Production on Fri Dec 4 14:49:10 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

Job: SYS_EXPORT_SCHEMA_09
	Owner: SYSTEM
	Operation: EXPORT
	Creator Privs: TRUE
	GUID: 260EFD3967B32591E053B7C909C0C0D4
	Start Time: Friday, 04 December, 2015 14:49:19
	Mode: SCHEMA
	Instance: orcl1
	Max Parallelism: 6
	EXPORT Job Parameters:
	Parameter Name Parameter Value:
		CLIENT_COMMAND system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y service_name=orcls 
	State: IDLING
	Bytes Processed: 11,277,592
	Percent Done: 4
	Current Parallelism: 6
	Job Error Count: 0
	Dump File: +DATA/orcl//expsh%u.dmp
	Dump File: +DATA/orcl//expsh01.dmp
		bytes written: 4,096
	Dump File: +DATA/orcl//expsh02.dmp
		bytes written: 10,473,472
	Dump File: +DATA/orcl//expsh03.dmp
		bytes written: 827,392
	Dump File: +DATA/orcl//expsh04.dmp
		bytes written: 4,096
	Dump File: +DATA/orcl//expsh05.dmp
		bytes written: 4,096

Worker 1 Status:
	Process Name: DW00
	State: UNDEFINED
	Object Schema: SH
	Object Type: SCHEMA_EXPORT/DEFAULT_ROLE
	Completed Objects: 1
	Total Objects: 1
	Worker Parallelism: 1

Worker 2 Status:
	Process Name: DW00
	State: UNDEFINED
	Object Schema: SH
	Object Name: COSTS
	Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
	Completed Objects: 1
	Total Objects: 66
	Worker Parallelism: 1

Worker 3 Status:
	Process Name: DW02
	State: UNDEFINED

Worker 4 Status:
	Process Name: DW01
	State: UNDEFINED
	Object Schema: SH
	Object Name: COSTS
	Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
	Completed Objects: 1
	Total Objects: 66
	Worker Parallelism: 1

Worker 5 Status:
	Process Name: DW04
	State: UNDEFINED

Worker 6 Status:
	Process Name: DW02
	State: UNDEFINED
	Object Schema: SH
	Object Name: COSTS
	Object Type: SCHEMA_EXPORT/TABLE/TABLE_DATA
	Completed Objects: 1
	Total Objects: 66
	Worker Parallelism: 1

Export> start_job

Export> continue_client

Job SYS_EXPORT_SCHEMA_09 has been reopened at Friday, 04 December, 2015 14:49 
Restarting "SYSTEM"."SYS_EXPORT_SCHEMA_09": system/********@orcl schemas=sh directory=dp_shared_dir parallel=6 cluster=y logfile=data_pump_dir:expsh.log dumpfile=expsh%U.dmp reuse_dumpfiles=y service_name=orcls
Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
. . exported "SH"."COSTS":"COSTS_Q1_1999" 183.5 KB 5884 rows
. . exported "SH"."COSTS":"COSTS_Q4_1998" 144.7 KB 4577 rows
. . exported "SH"."COSTS":"COSTS_Q4_1999" 159.0 KB 5060 rows
. . exported "SH"."COSTS":"COSTS_Q4_2000" 160.2 KB 5088 rows
. . exported "SH"."COSTS":"COSTS_Q4_2001" 278.4 KB 9011 rows
……
….
….

Processing object type SCHEMA_EXPORT/TABLE/TABLE
Processing object type SCHEMA_EXPORT/TABLE/GRANT/OWNER_GRANT/OBJECT_GRANT
Processing object type SCHEMA_EXPORT/TABLE/COMMENT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/VIEW/VIEW
Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
Processing object type SCHEMA_EXPORT/TABLE/INDEX/BITMAP_INDEX/INDEX
Processing object type SCHEMA_EXPORT/TABLE/INDEX/STATISTICS/BITMAP_INDEX/INDEX_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/STATISTICS/TABLE_STATISTICS
Processing object type SCHEMA_EXPORT/TABLE/INDEX/DOMAIN_INDEX/INDEX
Processing object type SCHEMA_EXPORT/MATERIALIZED_VIEW
Processing object type SCHEMA_EXPORT/DIMENSION
Master table "SYSTEM"."SYS_EXPORT_SCHEMA_09" successfully loaded/unloaded
******************************************************************************
Dump file set for SYSTEM.SYS_EXPORT_SCHEMA_09 is:
	+DATA/orcl//expsh01.dmp
	+DATA/orcl//expsh02.dmp
	+DATA/orcl//expsh03.dmp
	+DATA/orcl//expsh04.dmp
	+DATA/orcl//expsh05.dmp
	+DATA/orcl/expsh06.dmp
Job "SYSTEM"."SYS_EXPORT_SCHEMA_09" successfully completed at 14:50:49

Thus, a Data Pump job:

  • Can be parallelized across a subset of the available instances of a RAC database by specifying the SERVICE_NAME parameter.
  • Restarted in future if any instance, where the job is running, aborts.

Note: Since there is some overhead involved with distributing Data Pump processes across an Oracle RAC environment, there will be cases where there is no performance benefit. In those cases, you can

specify CLUSTER=N to force all Data Pump processes to run only on the instance where the job is started.

Summary:

  • Prior to Oracle Database 11g release 2 (11.2), if a Data Pump job is executed in parallel on an instance of a RAC database, the parallelism is employed only on that instance without utilizing the potentially valuable idle resources of other instances.
  • From Oracle Database 11.2 onwards, CLUSTER parameter can be employed to distribute the worker processes across multiple Oracle RAC instances to better utilize Oracle RAC resources.
  • To exercise more control over the instances on which the job should run, SERVICE_NAME parameter can be specified so that the Data Pump job runs only on those instances where the service is available.
  • If any Oracle RAC instance where the job is running dies or leaves the cluster, the job aborts and can be restarted at some future time.
  • For more information, Oracle have a white paper on the subject.

Tags: ,