Rolling Forward a Physical Standby Database Using the RECOVER Command
A standby database is a transactionally-consistent copy of the production database. It enables production Oracle database to survive disasters and data corruption. If the production database becomes unavailable because of a planned or an unplanned outage, Data Guard can switch a standby database to the production role, minimizing the downtime associated with the outage. Moreover, performance of production database can be improved by offloading resource-intensive backup and reporting operations to standby systems. As you can see, it’s always desirable to have standby database synchronized with the primary database.
A standby database might lag behind the primary for various reasons like:
- Unavailability of or insufficient network bandwidth between primary and standby database
- Unavailability of Standby database
- Corruption / Accidental deletion of Archive Redo Data on primary
If standby database lags behind the primary database:
- Switchover will take more time.
- Failover will result in data loss.
- An attempt to issue Real Time Query against the standby database will result in error
ORA-03172 : STANDBY_MAX_DATA_DELAY of n seconds exceeded.
Synchronizing the standby and primary databases can be done by copying and applying the archived logs from the primary database but this process is quite time consuming as it will first apply both the COMMITED and the NON COMMITED transactions followed by rolling back uncommitted transactions. Employing incremental backups of the primary database containing changes since the standby database was last refreshed is a faster alternative which will recover the standby database much faster as it will apply only the COMMITED transactions on the standby database. Moreover, incremental backups are also useful in cases when there are missing archived logs on Primary which have not been applied to the standby database
Prior to 12c, in order to roll forward the standby database using incremental backups you would need to:
- Create a control file for the standby database on the primary database.
- Take an incremental backup on the primary starting from the SCN# of the standby database.
- Copy the incremental backup to the standby host and catalog it with RMAN.
- Mount the standby database with newly created standby control file.
- Cancel managed recovery of the standby database and apply incremental backup to the standby database.
- Start managed recovery of standby database.
In 12c, this procedure has been dramatically simplified. Now you can use the RECOVER … FROM SERVICE command to synchronize the physical standby database with the primary database. This command does the following:
- Creates an incremental backup containing the changes to the primary database. All changes to data files on the primary database, beginning with the SCN in the standby data file header, are included in the incremental backup.
- Transfers the incremental backup over the network to the physical standby database.
- Applies the incremental backup to the physical standby database.
This results in rolling forward the standby data files to the same point-in-time as the primary. However, the standby control file still contains old SCN values which are lower than the SCN values in the standby data files. Therefore, to complete the synchronization of the physical standby database, the standby control file needs to be refreshed to update the SCN#.
Now I will demonstrate the steps to refresh the physical standby database with changes made to the primary database using the RECOVER…FROM SERVICE command.
Overview:
- View current configuration and verify that primary database, far sync for primary and standby database are in sync
- Simulate loss of archived logs on the primary database
- Refresh the physical standby using the RECOVER…FROM SERVICE command
Setting up the example
View Current configuration
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
DGMGRL> show configuration; Configuration - drsolution Protection Mode: MaxPerformance Databases: boston - Primary database bostonFS - Far Sync london - Physical standby database london2 - Logical standby database (disabled) londonfs - Far Sync (inactive) Fast-Start Failover: DISABLED Configuration Status: SUCCESS |
It can be verified that primary database (boston), far sync for primary (bostonFS) and physical standby (london) are in sync.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
<span style="color: blue;">-- Primary (boston) --</span> BOSTON>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- <b>190</b> <span style="color: blue;">-- Far Sync for primary (bostonFS) --</span> BOSTONFS>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- <b>190</b> <span style="color: blue;">-- Physical standby (london) --</span> LONDON>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- <b>190</b> |
Simulate loss of archived logs on primary database (boston)
Let’s stop redo transport from primary (boston) and switch logs on primary so that far sync (bostonFS ) and physical standby (london) lag behind primary (boston).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
DGMGRL> show database boston Database - boston Role: PRIMARY Intended State: TRANSPORT-ON Instance(s): boston Database Status: SUCCESS DGMGRL> edit database boston set state='TRANSPORT-OFF'; Succeeded. BOSTON>alter system switch logfile; alter system switch logfile; alter system switch logfile; System altered. |
We can verify that logs are not being transported to far sync bostonFS and standby london. Although sequence# of the latest archived log is 194 on primary, archived logs up to sequence# 191 only have reached far sync and physical standby.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
BOSTON>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- 194 BOSTONFS>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- 191 LONDON>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- 191 |
Next we’ll find out the names of archived logs generated on primary which have not been transported to standby:
1 2 3 4 5 6 7 8 9 |
BOSTON>select sequence#, name from v$archived_log where sequence# >191; SEQUENCE# NAME ---------- ---------------------------------------------------------------- 192 /u01/app/oracle/fast_recovery_area/BOSTON/archivelog/2015_02_09/o1_mf_1_192_bfjgx50j_.arc 193 /u01/app/oracle/fast_recovery_area/BOSTON/archivelog/2015_02_09/o1_mf_1_193_bfjgx6tb_.arc 194 /u01/app/oracle/fast_recovery_area/BOSTON/archivelog/2015_02_09/o1_mf_1_194_bfjgx6yh_.arc |
In a real-time environment, archived logs on primary could be lost due to:
- their deletion if archivelog deletion policy has not configured properly
- their corruption
In our experimental setup, I’ll simulate loss of archived logs on primary by renaming them:
1 2 3 4 5 6 |
BOSTON>ho mv /u01/app/oracle/fast_recovery_area/BOSTON/archivelog/2015_02_09/o1_mf_1_192_bfjgx50j_.arc /home/oracle/arch_192.log BOSTON>ho mv /u01/app/oracle/fast_recovery_area/BOSTON/archivelog/2015_02_09/o1_mf_1_193_bfjgx6tb_.arc /home/oracle/arch_193.log BOSTON>ho mv /u01/app/oracle/fast_recovery_area/BOSTON/archivelog/2015_02_09/o1_mf_1_194_bfjgx6yh_.arc /home/oracle/arch_194.log |
Now even if we restart redo transport from primary, gap in redo logs on far_sync / standby cannot be resolved as some logs are missing on primary.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
DGMGRL> edit database boston set state=<b>'Transport-on'</b>; DGMGRL> show configuration; Configuration - drsolution Protection Mode: MaxPerformance Databases: boston - Primary database Error: ORA-16724: cannot resolve gap for one or more standby databases bostonfs - Far Sync london - Physical standby database london2 - Logical standby database (disabled) londonfs - Far Sync (inactive Fast-Start Failover: DISABLED Configuration Status: <span style="color: red; font-weight: bold;">ERROR</span> DGMGRL> show far_sync bostonFS Far Sync - bostonfs Transport Lag: 3 minutes 43 seconds (computed 1 second ago) Instance(s): bostonFS Far Sync Status: SUCCESS DGMGRL> show database london Database - london Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 3 minutes 53 seconds (computed 0 seconds ago) Apply Lag: 3 minutes 53 seconds (computed 0 seconds ago) Apply Rate: 759.00 KByte/s Real Time Query: OFF Instance(s): london Database Status: SUCCESS |
It can be verified that SCN# (3717618) of standby (london) is lagging behind that (3718999) of the primary (boston)
1 2 3 4 5 6 7 8 9 |
BOSTON>select current_scn from v$database; CURRENT_SCN ----------- 3718999 LONDON>select current_scn from v$database; CURRENT_SCN ---------- 3717618 |
Refresh the physical standby using the RECOVER…FROM SERVICE command
First of all let us identify the datafiles on standby database which are out of sync with respect to primary.
On checking checkpoint_change# in datafile headers on primary (boston) and standby (london), we note that whereas checkpoint_change# of datafiles 5,7,8,9,10,11 match on primary and standby, for rest of the datafiles (1,3,4,6) standby is lagging behind primary.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
BOSTON>select file#, checkpoint_change# from v$datafile; FILE# CHECKPOINT_CHANGE# ---------- ------------------ 1 3717658 3 3717658 4 3717658 <span style="color: blue;">5 1910129</span> 6 3717658 <span style="color: blue;">7 1910129</span> <span style="color: blue;">8 3690127</span> <span style="color: blue;">9 3690127</span> <span style="color: blue;">10 3690127</span> <span style="color: blue;">11 3690127</span> 10 rows selected. LONDON>select file#, checkpoint_change# from v$datafile; FILE# CHECKPOINT_CHANGE# ---------- ------------------ 1 3717619 3 3717619 4 3717619 <span style="color: blue;">5 1910129</span> 6 3717619 <span style="color: blue;">7 1910129</span> <span style="color: blue;">8 3690127</span> <span style="color: blue;"> 9 3690127</span> <span style="color: blue;"> 10 3690127</span> <span style="color: blue;"> 11 3690127</span> 10 rows selected. |
In order to synchronize the standby we will stop the managed recovery processes on the physical standby database and place the physical standby database in MOUNT mode.
1 2 3 4 5 |
LONDON>recover managed standby database cancel; shutdown immediate; startup mount; |
Start RMAN and connect as target to the physical standby database. Refresh the data files on the physical standby database by using an incremental backup of the data files on the primary database.
The following command creates a compressed multi-section incremental backup on the primary database that is then used to refresh the standby data files. boston is the net service name of the primary database that is used to refresh the standby database. The NOREDO clause specifies that the archived redo log files must not be applied during recovery.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
[oracle@host03 ~]$ . oraenv ORACLE_SID = [london] ? [oracle@host03 ~]$ rman target / RMAN> recover database from service boston noredo using compressed backupset section size 100m; Starting recover at 09-FEB-15 <span style="color: blue; font-weight: bold;">Starting implicit crosscheck backup</span> at 09-FEB-15 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=41 device type=DISK Crosschecked 9 objects Finished implicit crosscheck backup at 09-FEB-15 <span style="color: blue; font-weight: bold;">Starting implicit crosscheck copy</span> at 09-FEB-15 using channel ORA_DISK_1 Crosschecked 2 objects Finished implicit crosscheck copy at 09-FEB-15 |
— Catalogues all the existing backups and archivelogs available on standby —
1 2 3 4 5 6 7 8 9 |
searching for all files in the recovery area cataloging files... cataloging done List of Cataloged Files ======================= |
— Backups —
1 2 3 4 |
File Name: /u01/app/oracle/fast_recovery_area/LONDON/autobackup/2015_01_23/o1_mf_s_869760430_bd492tm3_.bkp File Name: /u01/app/oracle/fast_recovery_area/LONDON/autobackup/2015_01_23/o1_mf_s_869760892_bd4b06ws_.bkp ... File Name: /u01/app/oracle/fast_recovery_area/LONDON/autobackup/2015_01_22/o1_mf_s_869676552_bd1qongl_.bkp |
— Archived logs —
1 2 3 4 5 6 |
File Name: /u01/app/oracle/fast_recovery_area/LONDON/archivelog/2015_02_06/o1_mf_1_175_bf919zvx_.arc File Name: /u01/app/oracle/fast_recovery_area/LONDON/archivelog/2015_02_06/o1_mf_1_172_bf8pvrhn_.arc ... ... File Name: /u01/app/oracle/fast_recovery_area/LONDON/archivelog/2015_01_21/o1_mf_1_54_bcyn59o8_.arc File Name: /u01/app/oracle/fast_recovery_area/LONDON/archivelog/2015_01_21/o1_mf_1_60_bczt9p7o_.arc |
— Skips all the datafiles (5, 7, 8, 9, 10, 11) whose checkpoint_change# on standby match that of primary are skipped for restore operation —
1 2 3 4 5 6 7 |
using channel ORA_DISK_1 skipping <b>datafile 5</b>; already restored to SCN 1910129 skipping <b>datafile 7</b>; already restored to SCN 1910129 skipping <b>datafile 8</b>; already restored to SCN 3690127 skipping <b>datafile 9</b>; already restored to SCN 3690127 skipping <b>datafile 10</b>; already restored to SCN 3690127 skipping <b>datafile 11</b>; already restored to SCN 3690127 |
— Restores multi-section compressed backupsets with section size of 100m of all the data files (1, 3, 4, 6) whose checkpoint_change# on standby is behind primary over network. —
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 |
channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for <b>restore of datafile 00001</b>: /u01/app/oracle/oradata/london/system01.dbf channel ORA_DISK_1: <span style="color: blue; font-weight: bold;">restoring section 1</span> of 9 channel ORA_DISK_1: restore complete, elapsed time: 00:00:03 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for restore of datafile 00001: /u01/app/oracle/oradata/london/system01.dbf channel ORA_DISK_1: <span style="color: blue; font-weight: bold;">restoring section 2</span> of 9 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for restore of datafile 00001: /u01/app/oracle/oradata/london/system01.dbf ... ... channel ORA_DISK_1: <span style="color: blue; font-weight: bold;">restoring section 9</span> of 9 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for <span style="color: blue; font-weight: bold;">restore of datafile 00003</span>: /u01/app/oracle/oradata/london/sysaux01.dbf channel ORA_DISK_1: <b>restoring section 1</b> of 10 channel ORA_DISK_1: restore complete, elapsed time: 00:00:03 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for restore of datafile 00003: /u01/app/oracle/oradata/london/sysaux01.dbf ... ... channel ORA_DISK_1: <b>restoring section 10</b> of 10 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for <span style="color: blue; font-weight: bold;">restore of datafile 00004</span>: /u01/app/oracle/oradata/london/undotbs01.dbf channel ORA_DISK_1: <b>restoring section 1</b> of 3 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for restore of datafile 00004: /u01/app/oracle/oradata/london/undotbs01.dbf ... ... channel ORA_DISK_1: <b>restoring section 3</b> of 3 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 channel ORA_DISK_1: starting incremental datafile backup set restore channel ORA_DISK_1: using compressed network backup set from service boston destination for <span style="color: blue; font-weight: bold;">restore of datafile 00006</span>: /u01/app/oracle/oradata/london/users01.dbf channel ORA_DISK_1: <b>restoring section 1</b> of 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:01 Finished recover at 09-FEB-15 |
As soon as standby is recovered, redo transport to far sync (bostonFS) and standby (london) is resumed.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
BOSTON>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- <b>197</b> BOSTONFS>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- <b>197</b> LONDON>select max(sequence#) from v$archived_log; MAX(SEQUENCE#) -------------- <b>197</b> |
The RECOVER … FROM SERVICE command refreshes the standby data files and rolls them forward to the same point-in-time as the primary. However, the standby control file still contains old SCN values which are lower than the SCN values in the standby data files. Therefore, to complete the synchronization of the physical standby database, we must refresh the standby control file to update the SCN#.
Hence place the physical standby database in NOMOUNT mode and restore controlfile to standby by using the control file on the primary database.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
LONDON>SHUTDOWN IMMEDIATE; STARTUP NOMOUNT; RMAN> RESTORE STANDBY CONTROLFILE FROM SERVICE boston; Starting restore at 09-FEB-15 using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=22 device type=DISK channel ORA_DISK_1: starting datafile backup set restore channel ORA_DISK_1: using network backup set from service boston channel ORA_DISK_1: restoring control file channel ORA_DISK_1: restore complete, elapsed time: 00:00:07 output file name=/u01/app/oracle/oradata/london/control01.ctl output file name=/u01/app/oracle/fast_recovery_area/london/control02.ctl Finished restore at 09-FEB-15 |
Mount the standby database.
1 |
LONDON>ALTER DATABASE MOUNT; |
Now the standby database has been rolled forward successfully as can be verified using DGMGRL.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
DGMGRL> show configuration; Configuration - drsolution Protection Mode: MaxPerformance Databases: boston - Primary database bostonfs - Far Sync london - Physical standby database london2 - Logical standby database (disabled) londonfs - Far Sync (inactive) Fast-Start Failover: DISABLED Configuration Status: SUCCESS |
Note: If network resources are a constraint, then you can still use the Pre-12c BACKUP INCREMENTAL command to create incremental backups on the primary database, and then use the incremental backups to roll forward the physical standby database.
Conclusion:
In case standby database lags behind the primary, it can be rolled forward using incremental backups from primary database. It is a faster alternative to copying archived logs from primary and applying them on standby database. When an archive log gap is unrecoverable this procedure is a much faster alternative to rebuilding the standby database. This used to be a long manual process till 11g. In 12c, datafiles from primary database can be restored over the network as backupset and they can be optionally compressed and encrypted as well.
References:
https://docs.oracle.com/database/121/BRADV/rcmadvre.htm#BRADV685