Contents:
An overview
Note the following requirements for maintaining a standby database:
On the Primary database (it should be in ARCHIVE LOG MODE):
1. Ensure that your primary database is in archive log mode
2. ALTER SYSTEM ARCHIVE LOG CURRENT
(insures consistency in backup, standby controlfile, and logfiles)
3. Backup your primary database (backup only datafiles, not redo logfiles or control files. Cold backup is easier.
4. ALTER DATABASE CREATE STANDBY CONTROLFILE AS '$ORACLE_BASE/admin/$ORACLE_SID/pfile/ctrlfile.bin';
5. Modify the init.ora files on the Production Database to know
where to write the archived redo log files on both primary and the standby
server. Here is a brief example of a part of it:
# THIS PARAMETER IS NO LONGER USED
# log_archive_dest=/opt/dbms3/app/oracle/admin/dtmprod1/arch
# LOG_ARCHIVE_DEST_1 is a local filesystem location on the production system.
# The MANDATORY flag means that the archive to this destination MUST SUCCEED
(you can use OPTIONAL, specifically if standby is in read only mode))
log_arch_start = true
log_archive_format = arch_%t_%s.arc
log_archive_dest_1 = "location=/u01/oradata/TICPBT09/ARCH MANDATORY"
log_archive_dest_state_1 = enable
# LOG_ARCHIVE_DEST_2 is the service name of a remote database accessible
via Net8
# The REOPEN parameter specifies that if the standby is unreachable for
# some reason, the arch process will attempt to reconnect in 60 seconds.
log_archive_dest_2 = "service=stdby.world mandatory reopen=60"
log_archive_dest_state_2 = enable
# Tell Oracle both destinations MUST succeed in order to mark the online
redo ready for reuse.
log_archive_min_succeed_dest = 2
# Define the max number of ARCH processes available to archive inactive
logs.
log_archive_max_processes=10
5. Copy the datafiles and archived redo logs created in step 3, the standby control file created in step 4 from the primary to the standby host (be sure to place copies of this file with the appropriate name in the equivalent locations on the standby where the production control files would have existed, if you had copied them. Otherwise you will need to use the parameters in the standby database to change the directories). Make sure you do not copy the production control files and the production init.ora files (you will copy the *.ora files later).
6. Copy the primary init.ora file to the standby server and make
modifications for standby database. I recommend keeping the parameters the
same as configured on the primary database, with the following modifications
to the secondary server init.ora file:
# ensure that your standby database
# is pointing at the standby control
# file you created
# and copied to the standby server
control_files = '/ora01/oradata/BRDSTN/stbycf.ctl'
# location where archive redo logs are
# being written in standby environment
standby_archive_dest= /ora01/oradata/BRDSTN
# Enable archive gap management
fal_client=standby1
fal_server=primary1
fal_client and fal_server are new parameters that enable archive-gap management.
In this example, standby1 is the Oracle Net name of the standby database
and primary1 is the Oracle Net name of the primary database. The fetch archive
log (FAL) background processes reference these parameters to determine the
location of the physical-standby and primary databases.
7. Configuring the Network
In managed-recovery mode, the primary and standby databases need to be able
to communicate with each other via Oracle Net. There needs to be a listener
on both the primary and standby servers, and Oracle Net must be able to connect
from one database to the other.
Primary listener.ora file. The primary database needs a listener listening
for incoming requests. The following text describes the listener for the
primary listener.ora file:
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS =
(PROTOCOL = TCP)(HOST = primary_host)(PORT = 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY =
EXTPROC))
)))
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /ora01/app/oracle/product/9.2.0)
(PROGRAM = extproc)
)
(SID_DESC =
(SID_NAME = BRDSTN)
(ORACLE_HOME = /ora01/app/oracle/product/9.2.0))
Primary tnsnames.ora file. I recommend placing both the primary and
standby service locations in the tnsnames.ora file on both the primary and
standby servers. This makes troubleshooting easier and also makes failover
and switchover operations smoother. In this example, primary1 points to the
primary database on a host named primary_host, and standby1 points to the
standby database on a server named standby_host.
The following text describes the connection for the primary tnsnames.ora
file:
primary1 =
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp) (PORT=1521) (HOST=primary_host))
(CONNECT_DATA=(SERVICE_NAME=BRDSTN)))
standby1 =
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp) (PORT=1521) (HOST=standby_host))
(CONNECT_DATA=(SERVICE_NAME=BRDSTN)))
Standby listener.ora file. The standby database needs to be able to
service incoming Oracle Net communication from the primary database. The
following text describes the listener for the standby server listener.ora
file:
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = standby_host)(PORT
= 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY
= EXTPROC))
)))
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /ora01/app/oracle/product/9.2.0)
(PROGRAM = extproc)
)
(SID_DESC =
(SID_NAME = BRDSTN)
(ORACLE_HOME = /ora01/app/oracle/product/9.2.0))
Standby tnsnames.ora file. The following is the same connection information
entered into the primary server tnsnames.ora file:
primary1 =
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp) (PORT=1521) (HOST=primary_host))
(CONNECT_DATA=(SERVICE_NAME=BRDSTN)))
standby1 =
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp) (PORT=1521) (HOST=standby_host))
(CONNECT_DATA=(SERVICE_NAME=BRDSTN)))
When you operate Oracle Data Guard in managed-recovery mode, it is critical
to have Oracle Net connectivity between the primary and standby databases.
On the Standby database:
A very unsophisticated script to start sustained recovery might be:
#vi rcov.script
svrmgrl << END
connect internal
startup nomount pfile='/xx/yy/initSID.ora'
alter database mount standby database;
alter database recover managed standby database;
exit;
END
and this could be executed.
# nohup rcov.script 1>/tmp/rcov.log &
1. This was done previously.
Copy the init.ora file from the prod site to the standby place, and modify
the initSID.ora as requiered. Partial initSID.ora is shown
db_name
= TICPBT09 (as the prod site)
instance_name = TICPBT09
service_names = TICPBP02
# Control Files that will be used on the Standby
control_files = path/name of the file generated as crtl in the primary
database
# Setup location to store archived redo logs sent to this database from
production.
#The standby parameter should be the same as the log_arch_dest parameter
standby_archive_dest = /backup/TICPBP02
log_archive_format = arch_%t_%s.arc
# LOG_ARCHIVE_DEST_1 will be used to store archived logs generated by this
# standby database if it is activated (becomes the production database.
#The log_arch_dest parameter should be the same as the standby parameter
log_archive_dest_1 = "location=/backup/TICPBP02 MANDATORY"
log_archive_dest_state_1 = enable
# The destination MUST succeed in order to mark the online redo ready for
reuse.
log_archive_min_succeed_dest = 1
# Convert all datafile pathnames that contain the names of your databases.
As you can see this works ONLY if you have all your files in one place. If
you have them in different directories (as OFA recommends), then you don't
need use this parameter, you MUST rename each one of the files after mounting
the database with the command ALTER DATABASE RENAME FILE '/oldplace/filename'
TO '/newplace/filename' ;
db_file_name_convert = "/path/from/prod/db","/path/in/stdby/db"
log_file_name_convert = "/path/from/prod/db","/path/in/stdby/db"
# Define the max number of ARCH processes available to archive inactive
logs.
log_archive_max_processes=10
Suggestion: Increase DB_BLOCK_BUFFER and RECOVERY_PARALLELISM to speed
up the recovery process.
2. Start the standby database in recovery mode.
STARTUP NOMOUNT (PFILE=/full/path/to/init<SID>.ora
)
ALTER DATABASE MOUNT STANDBY DATABASE;
If necessay perform : ALTER DATABASE RENAME FILE '/oldplace/filename' TO '/newplace/filenam' ;
Now the DBA can put the database into sustained recovery mode:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE;
To use "manual" recovery:
ALTER DATABASE RECOVER STANDBY DATABASE;
The server process applies the archive logs as they are generated on the standby host. The server process wakes itself internally every 15 seconds to check for archival of the next required log.
At this point, you should be able to generate archive redo information
on your primary database and see it applied to the standby. As you deploy
your Oracle Data Guard environment and perform maintenance operations, pertinent
messages are often written to the primary and standby alert.log files. I've
found that concepts jell more quickly and issues are resolved more rapidly
when I simultaneously monitor the contents of both the standby and primary
alert.log files. In a UNIX environment, the tail command is particularly
helpful in this regard:
$ tail -f alert_BRDSTN.log
Most of the problems you'll encounter with your Oracle Data Guard physical-standby
database setup will be because either Oracle Net was not configured correctly
or you fat-fingered an initialization parameter. The alert logs are usually
pretty good at pointing out the source of the problem.
You can can also specify a timeout period (minutes) for sustained recovery.
If no new logs are generated within this period, the recovery session is cancelled.
SVRMGR> recover managed standby database timeout 10
In order to cancel a sustained recovery session, from another server manager
session issue the following command:
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
The database can be opened (read-only) using:
ALTER DATABASE OPEN READ ONLY;
Note: At this stage the standby database can be used for reporting type queries
But is NOT in recovery mode, so then you will need to apply the archive log
files.
3. Restart the Production Database.
4. Verify Production to Standby Transfer Operations.
Finally, validate that the production database is able to automatically transmit
logs to the standby database and that the standby is ready to receive them.
To make sure that the production database is transmitting the logs, execute
the following command against the production database:
ALTER SYSTEM SWITCH LOGFILE;
When sufficient time has passed for the production database to transport
the archived redo log, perform this query on the production database:
select * from v$archive_dest;
Find the row in the output that corresponds to your standby database, and
verify that the values for BINDING, REOPEN_SECS, and DESTINATION are correct,
based on your requirements. Make sure that the value for STATUS is VALID—if
not, check the FAIL_DATE, FAIL_SEQ, and ERROR columns for the error.
Next, verify that the directory specified in standby_ archive_dest init.ora
parameter on the standby database contains the appropriate archived
redo log that should have been copied from production. Make a note of the
log's sequence number.
Check the last entries of the standby's alert log for information about the
log received by the standby database. Verify that the log-sequence number
for which the standby was waiting matches the sequence of the actual archived
log in the standby_ archive_dest directory (the entry appears as "Media Recovery
Waiting for thread n seq# xyz").
Finally, make sure that that Media Recovery applied the log the entry appears
similar to "Media Recovery Log /pathname/logxyz_ n.arc in the alert.log.
Resolving Problems
In a managed-recovery configuration, if the primary log-transport mechanism
isn't able to ship the redo data from the primary database to the physical-standby
database, you have a gap between what archive logs have been generated by
the primary database and what have been applied to the physical-standby database.
This situation can arise if there is a network problem or if the standby
database is unavailable for any reason.
Detecting and resolving archive gaps is crucial to maintaining high availability.
An untimely gap-resolution mechanism compromises the availability of data
in the standby environment. For example, if you have an undetected gap and
disaster strikes, the result is a loss of data after you fail over to your
standby database.
In Oracle8i, you had the responsibility of manually monitoring gaps. After
you detected a gap, you then had to manually perform the following steps
on your standby database:
With Oracle9i Database, Data Guard has two mechanisms for automatically resolving
gaps. The first mechanism for gap resolution uses a periodic background communication
between the primary database and all of its standby databases. Data Guard
compares log files generated on the primary database with log files received
on the standby databases to compute the gaps. Once a gap is detected, the
missing log files are shipped to the appropriate standby databases. This
automatic method does not need any parameter configuration.
The second mechanism for automatic gap resolution uses two new Oracle9i Database
initialization parameters on your standby database:
SQL> alter system set FAL_CLIENT = standby1;
SQL> alter system set FAL_SERVER = primary1;
Once these parameters are set on the standby database, the managed-recovery
process in the physical-standby database will automatically check and resolve
gaps at the time redo is applied.
Data Guard SQL Apply
Prior to Oracle9i Database Release 2, you could implement only a physical-standby
database. One aspect of a physical-standby database is that it can be placed
in either recovery mode or read-only mode, but not both at the same time.
With the new Data Guard SQL Apply feature of Oracle9i Database Release 2,
you can now create a logical-standby database that is continuously available
for querying while simultaneously applying transactions from the primary.
This is ideal if your business requires a near-real-time copy of your production
database that is also accessible 24/7 for reporting.
A logical-standby database differs architecturally from a physical-standby
database, which is physically identical to the primary database block-for-block,
because the Oracle media-recovery mechanism is used on the redo information
received from the primary database to apply those redo changes to a physical
data-block address.
Unlike a physical-standby database, a logical-standby database transforms
the redo data received from the primary into SQL statements (using LogMiner
technology) and then applies the SQL statements. It is logically identical
to the primary database but not physically identical at the block level.
Just like a physical-standby database, a logical-standby database can assume
the role of the primary database in the event of a disaster. You can configure
your Data Guard environment to use a combination of both logical-standby
and physical-standby databases. If you use your logical-standby database
for reporting, you can further tune it by adding indexes and materialized
views independent of the primary.
There are many steps involved with setting up a logical-standby database.
The best sources for detailed instructions are MetaLink note 186150.1 and
the Oracle9i Database Release 2 Oracle Data Guard Concepts and Administration
manual
How to open the standby database
in Read only mode.
Disadvantage is that the (sustained) recovery has to be cancelled. If the
database is opened in read only mode and users (including System and Sys)
need sorting, a locally managed sort tablespace should be their default
sort tablespace. Steps to open the standby database in read only mode:
1- Create a locally managed temporary tablespace (if you created before
the backup go to step #3)
The new locally managed temporary tablespace is created on the primary
database and is propagated to the standby database by applying the
logs or can be created before the backup for the standby is made.
CREATE TEMPORARY TABLESPACE
temp_local TEMPFILE
'/unix1/app/oracle/oradata/v816/oradata/v816/temp_local01.dbf'
SIZE 100M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 10M;
2- Be sure to set all the users that are going to make sorts in the standby
DB (including sys and system) to have their temporary tablespace set to this
locally managed temporary tablespace (alter user .. temporary tablespace
temp_local;) , do this on the primary DB and have this propagated to
the standby by the archives. Check carefully the location of the directories
in both places. "Send" the archive log:
ALTER SYSTEM ARCHIVE LOG CURRENT
3- Cancel the sustained recovery
Alter database recover managed standby database cancel;
4- Open the database in readonly mode
Alter database open read only;
5- Add a temporary file at the standby database to the locally managed
temporary tablespace
V$datafile or sys.file$ on the primary database do not show the created datafile
belonging to the locally managed tablespace temp_local. The added tempfile
on the primary database doesn't change sys.file$. The redo is not generated
and not propagated to the standby database while the entry in sys.ts$ is,
but ther's no file. Issue on the standby database:
alter tablespace temp_local add tempfile 'path/temp_local01.dbf'
size 100M;
6- Afterward restart sustained or manual recovery when needed when no active
sessions are connected. If necessary open a new session as internal and issue
shutdown immediate;
TO ACTIVATE THE STANDBY DATABASE as PRODUCTION:
1. To activate the standby, first try to archive your current production
database logs
ALTER SYSTEM ARCHIVE LOG CURRENT
2. Then copy the most recent archive logs and current online redo log to the standby and apply them.
3. If your standby database has not timed out from your recovery, simply
open a new SQL session into the standby database, by using a DBA account,
and issue
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Locate the end of the standby database's alert.log, and identify the last
archived log that was applied. Manually apply any remaining logs to the standby
database:
ALTER DATABASE RECOVER [FROM 'pathname'] STANDBY DATABASE;
4. When you have applied the remaining logs to the standby database, stop
the recovery by issuing
ALTER DATABASE RECOVER CANCEL;
5. Convert the standby database to a production environment:
ALTER DATABASE ACTIVATE STANDBY DATABASE;
6. Then shutdown the standby to reset file headers and clear all buffers.
You can't copy online redo logs from the primary to the standby. The standby
is dismounted when activated. The standby bit in the controlfile is now set,
so you can never go back - the standby is now your primary database. Since
the redo log sequence was reset when the standby was opened, it is a good
idea to take a full backup at this point.
Common Errors and how to solve them.
starting the standby database:
TNS-01150: The address of the specified listener name is incorrect
-> make sure standby_listener and sid_list_standby_listener start
in the first column of the listener.ora and the # of brackets is even and
enough
alter user sys temporary tablespace temp_local;
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00372: file 1 cannot be modified at this time
ORA-01110: data file 1: '/ots2/sroo/test/system01.dbf'
-> you issued the statement while the db was in read only mode Issue this
statement on the primary database and have it propagated by means of the
archives.
stop the read only opened database:
shutdown immediate hangs,
or alter database close generates:
ORA-01093: ALTER DATABASE CLOSE only permitted with no sessions connected)
-> open a new session as internal and issue shutdown immediate;
when any user (including sys or system) is sorting after database is
opened read only:
ORA-01682, 00000, "read-only DB cannot allocate temporary space in tablespace
%s"
or
ORA-01647, 00000, "tablespace '%s' is read only, cannot allocate space
in it"
-> because database is opened read only, no (sort) segments can be allocated
in normal temporary tablespace , create a locally managed temporaray tablespace
beforehand and or make sure all the users (including sys
and system have this tablespace as their temporary tablespace
when any user (including sys or system) is sorting after database is
opened read only:
ORA-25153: Temporary Tablespace is Empty
-> add a temporary file at the standby database to the locally managed
temporary tablespace
when any user (including sys or system) is sorting after database is
opened read only:
ORA-01114: IO error writing block to file 201 (block # 13)
ORA-27063: skgfospo: number of bytes read/written is incorrect
SVR4 Error: 28: No space left on device
Additional information: -1
Additional information: 8192
ORA-01114: IO error writing block to file 201 (block # 13)
ORA-27063: skgfospo: number of bytes read/written is incorrect
SVR4 Error: 28: No space left on device
Additional information: -1
Additional information: 8192
-> Make sure there's enough space left when adding a tempfile to a locally
managed tablespace. BUG:1035595
when any user (including sys or system) is sorting after database is
opened read only:
ERROR at line 1:
ORA-01652: unable to extend temp segment by 1280 in tablespace TEMP_LOCAL
-> add another temporary file at the standby database to the locally managed
temporary tablespace, example:
alter tablespace temp_local add tempfile
'/ots2/sroo/test/temp_local02.dbf' size 200M;
Make sure there's enough space left when adding a tempfile
to a locally managed tablespace, BUG:1035595
when adding a temporary file at the standby database to the locally
managed
temporary tablespace (PART III step 3) after the standby database is opened
read only
alter tablespace temp_local add tempfile
'/ots2/sroo/test/temp_local01.dbf' size 10M;
alter tablespace temp_local add tempfile
*
ERROR at line 1:
ORA-03214: File Size specified is smaller than minimum required
-> make the size of file /ots2/sroo/test/temp_local01.dbf larger then it
will work
after stopping/mounting standby db:
ORA-3113
or in alert log of primary database:
ARC0: Beginning to archive log# 2 seq# 629
ARC0: RFS network connection lost at host 'sv816_nlhp2.world'
ARC0: Error 3113 creating standby archivelog file at host 'sv816_nlhp2.world'
ARC0: Error 3113 creating archivelog file 'sv816_nlhp2.world'
-> Copy the archives manual to the location of the standby_archive_destination
of the standby database and synchronize (as in step 4 of set up of standby
database) before setting the standby database in sustained managed recovery
mode
after stopping and starting standby database:
ARC0: RFS network connection lost at host 'sv817_nlsu22.world'
ARC0: Error 3113 creating standby archivelog file at host 'sv817_nlsu22.world'
ARC0: Error 3113 creating archivelog file 'sv817_nlsu22.world'
-> Check previous remark
ORA-01103:
after alter database mount standby database
ORA-01103: database name 'V816' in controlfile is not 'SV816'
-> do not change the parameters db_name = "v816", instance_name = v816
or service_names = v816 in the standby database init.ora they should be the
same as in the primary database's init.ora This error is not fatal, the database
is not mounted, just retry with correct settings
ORA-01990:
after alter database mount standby database
ORA-01990: error opening password file
-> set the init.ora parameter remote_login_passwordfile = exclusive of
the standby database to the value none
ORA-16032:
after recover automatic standby database;
ORA-16032: parameter LOG_ARCHIVE_DEST destination string cannot be translated
ORA-07286: sksagdi: cannot obtain device information.
HP-UX Error: 2: No such file or directory
-> check log_archive_dest and set to reachable path
after recover managed standby database;
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error
below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1:
'/unix1/app/oracle/oradata/v816/oradata/v816/testsroo/system01.dbf'
ORA-16037: user requested cancel of sustained recovery operation
-> Most likely the primary database was copied while open, make a new copy
as the standby database when the primary has been shutdown (immediate)
Also check [NOTE:119614.1] and [NOTE:119594.1]
after you tried to open the standby database read only
SQL> alter database open read only;
alter database open read only
*
ERROR at line 1:
ORA-16004: backup database requires recovery
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/ots2/sroo/test/system01.dbf'
-> Most likely you just mounted the standby database and you try to immediately
open the standby database read only: Just put the standby database in sustained
managed recovery mode
ORA-01102:
after alter database mount standby database;
ORA-01102: cannot mount database in EXCLUSIVE mode
-> if on the same node set init.ora lock_name_space =sv816 for the standby
database (the value is the same as the oracle_sid)
alert file shows messages
ARCH: Error connecting to standby host 'STBY Code '12514'
Mon Sep 18 13:23:33 2000
LGWR: prodding the archiver
ARCH: Error creating archivelog file 'STBY'
-> check log_archive_dest_2 = 'SERVICE=sv816_nlhp2.world OPTIONAL
REOPEN=60' for correct service, also check [NOTE:114085.1]. Be sure
to have s817_nlsu22.world starting on first column in tnsnames.ora
in 8.1.7 if the destination 2 is wrong you could try the correct destination
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2=
"SERVICE=s817_nlsu22.world OPTIONAL REOPEN=60";
ARC0: Error 16009 attaching RFS server to standby instance at host 's817_nlsu22'
ARC0: Error 16009 creating archivelog file 's817_nlsu22.world'
ORA-16009, 00000, "remote archivelog destination must be a STANDBY database"
-> this is a clear message, you probly mixed up your environment
alert file show messages
Tue Nov 28 16:26:26 2000
ARC0: Error 12154 connecting to standby host 'sv817_nlsu22.world'
ARC0: Error 12154 creating archivelog file 'sv817_nlsu22.world'
ARC0: Completed archiving log# 2 seq# 206
-> check if the standby instance is up, [NOTE:114085.1]
alert file shows messages
ARC0: Error 12154 connecting to standby host 'sv816_nlhp2.world'
ARC0: Error 12154 creating archivelog file 'sv816_nlhp2.world'
ARC0: Error 19504 creating archivelog file
'/unix1/app/oracle/oradata/v816/oradata/v816/backup/arch_1_384.arc'
ARC0: Archiving not possible: error count exceeded
ARC0: Failed to archive log# 1 seq# 384
ARCH: Archival stopped, error occurred. Will continue retrying
-> check your filesystem where your archive destination points to. The
filesystem is full, make space available