Oracle® Streams Replication Administrator's Guide
10g Release 2 (10.2)
Part Number B14228-04

15 Best Practices for Capture

This chapter describes the best practices for capturing changes with a capture process in a Streams replication environment. This chapter includes these topics:

  • Best Practices for Source Database Configuration

  • Best Practices for Capture Process Configuration

  • Best Practices for Capture Process Operation

See Also:

"Best Practices for RAC Databases in Streams Replication Environments"

Best Practices for Source Database Configuration

A source database is a database where changes captured by a capture process are generated in the redo log. A local capture process can capture these changes at the source database, or a downstream capture process can capture these changes at a remote database. The following best practices pertain to all source databases, including source databases that use local capture processes and source databases that use downstream capture processes:

Enable Archive Logging at Each Source Database in a Streams Environment

Verify that each source database is running in ARCHIVELOG mode. For downstream capture environments, the database where the source redo logs are created must be running in ARCHIVELOG mode.
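
For example, the following SQL*Plus sketch checks the current log mode and, if necessary, switches the database to ARCHIVELOG mode; schedule the shutdown and restart appropriately for your environment:

-- Check the current log mode of the database
SELECT log_mode FROM v$database;

-- If the result is NOARCHIVELOG, switch to ARCHIVELOG mode
-- (requires a clean shutdown and a restart in MOUNT state)
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;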

Add Supplemental Logging at Each Source Database in a Streams Environment

Confirm that the required supplemental logging has been added at each source database. Supplemental logging is required for a table if the table's changes will be captured by a capture process. In Oracle Database 10g Release 2 and higher, supplemental logging is automatically added for primary key, unique index, and foreign key columns in a table when the table is prepared for instantiation. The procedures in the DBMS_STREAMS_ADM package for maintaining database objects with Streams and for adding rules automatically prepare objects for instantiation. For downstream capture, supplemental logging must be added at the database where the source redo logs are generated.

All indexed columns at a destination database, including the primary key, unique index, and foreign key columns of each replicated table, must be supplementally logged at the source database. Primary key columns must be unconditionally logged, but unique index and foreign key columns can be conditionally logged. In some cases, supplemental logging is required for additional columns in a table. For example, any columns specified in rule-based transformations or used within DML handlers must be unconditionally logged at the source database. Supplemental logging for these columns must be configured explicitly by the database administrator.
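
For example, the following statements show one way to add and verify supplemental logging. The hr.employees table, the log group name, and the salary column are hypothetical and appear only for illustration:

-- Preparing a table for instantiation adds supplemental logging for
-- primary key, unique index, and foreign key columns automatically
BEGIN
  DBMS_CAPTURE_ADM.PREPARE_TABLE_INSTANTIATION(
    table_name           => 'hr.employees',
    supplemental_logging => 'keys');
END;
/

-- Explicitly add unconditional supplemental logging for a column
-- referenced in a rule-based transformation or DML handler
ALTER TABLE hr.employees
  ADD SUPPLEMENTAL LOG GROUP log_group_emp_sal (salary) ALWAYS;

-- Verify the supplemental log groups defined for the table
SELECT log_group_name, always
  FROM dba_log_groups
 WHERE owner = 'HR' AND table_name = 'EMPLOYEES';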

Configure a Heartbeat Table at Each Source Database in a Streams Environment

You can use a heartbeat table to ensure that changes are being replicated in a Streams replication environment. Specifically, you can check the APPLIED_SCN value in the DBA_CAPTURE data dictionary view at the capture database to make sure it is updated periodically. A heartbeat table is especially useful for databases that have a low activity rate because you can make sure the replication environment is working properly even if there are not many replicated changes.

A Streams capture process requests a checkpoint after every 10 MB of generated redo. During the checkpoint, the metadata for Streams is maintained if there are active transactions. Implementing a heartbeat table ensures that there are open transactions occurring regularly in the source database, thereby providing additional opportunities for the metadata to be updated frequently. Additionally, the heartbeat table provides quick feedback to the database administrator about the health of the Streams replication environment.
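
For example, the following query shows the applied SCN for each capture process at the capture database. If the value increases between checks while the heartbeat job is running, then captured changes are moving through the environment:

SELECT capture_name, applied_scn
  FROM dba_capture;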

To implement a heartbeat table, complete the following steps (a sample script follows the steps):

  1. Create a table at the source database that includes a date or timestamp column and the global name of the source database.

  2. Instantiate the table at the destination database. If there are multiple destination databases, then instantiate the heartbeat table at each destination database.

  3. Add a rule to the positive rule set for the capture process that captures changes to the source database. The rule instructs the capture process to capture changes to the heartbeat table.

  4. Add a rule to the positive rule set for the propagation that propagates changes from the source database to the destination database. The rule instructs the propagation to propagate LCRs for the heartbeat table. If there are multiple propagations, then add the rule to the rule set for each propagation. If your environment uses directed networks, then you might need to add rules to propagations at several databases.

  5. Add a rule to the positive rule set for the apply process that applies changes that originated at the source database. The rule instructs the apply process to apply changes to the heartbeat table. If there are multiple apply processes at multiple databases that apply the changes that originated at the source database, then add a rule to the rule set for each of these apply processes.

  6. Configure an automated job to update the heartbeat table at the source database periodically. For example, the table might be updated every minute.

  7. Monitor the Streams replication environment to verify that changes to the heartbeat table at the source database are being replicated to the destination database.
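
The following is a partial sketch of steps 1, 3, and 6 at the source database. It assumes a Streams administrator named strmadmin, a queue named strmadmin.streams_queue, a capture process named capture_hb, and a source global name of SRC.EXAMPLE.COM; adapt these names, add the propagation and apply rules at the appropriate databases, and instantiate the table at each destination as described in the steps:

-- Step 1: create and seed the heartbeat table at the source database
CREATE TABLE strmadmin.heartbeat (
  src_db_name  VARCHAR2(128),
  update_time  TIMESTAMP
);

INSERT INTO strmadmin.heartbeat (src_db_name, update_time)
  VALUES ('SRC.EXAMPLE.COM', SYSTIMESTAMP);
COMMIT;

-- Step 3: add a rule for the heartbeat table to the positive
-- rule set of the capture process
BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name   => 'strmadmin.heartbeat',
    streams_type => 'capture',
    streams_name => 'capture_hb',
    queue_name   => 'strmadmin.streams_queue',
    include_dml  => TRUE,
    include_ddl  => FALSE);
END;
/

-- Step 6: schedule a job that updates the heartbeat row every minute
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'update_heartbeat',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN
                          UPDATE strmadmin.heartbeat
                             SET update_time = SYSTIMESTAMP
                           WHERE src_db_name = ''SRC.EXAMPLE.COM'';
                          COMMIT;
                        END;',
    repeat_interval => 'FREQ=MINUTELY',
    enabled         => TRUE);
END;
/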

Best Practices for Capture Process Configuration

The following sections describe best practices for configuring capture processes:

Set Capture Process Parallelism

Set the parallelism of each capture process by specifying the parallelism parameter in the DBMS_CAPTURE_ADM.SET_PARAMETER procedure. The parallelism parameter controls the number of parallel execution servers that concurrently mine the redo log for changes.

The default setting for the parallelism capture process parameter is 1, which is appropriate for most capture process configurations. When you set the parallelism capture process parameter, make sure the PROCESSES and PARALLEL_MAX_SERVERS initialization parameters are set appropriately.
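
For example, the following call sets the parallelism of a hypothetical capture process named streams_capture to 3 and then verifies the parameter settings:

BEGIN
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => 'streams_capture',
    parameter    => 'parallelism',
    value        => '3');
END;
/

-- Verify the current parameter settings for the capture process
SELECT parameter, value, set_by_user
  FROM dba_capture_parameters
 WHERE capture_name = 'STREAMS_CAPTURE';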

Set the Checkpoint Retention Time

Set the checkpoint retention time for each capture process. Periodically, a capture process takes a checkpoint to facilitate quicker restart. These checkpoints are maintained in the SYSAUX tablespace by default. The checkpoint retention time for a capture process controls the amount of checkpoint data it retains. The checkpoint retention time specifies the number of days prior to the required checkpoint SCN to retain checkpoints. When a checkpoint is older than the specified time period, the capture process purges the checkpoint.

When checkpoints are purged, the first SCN for the capture process moves forward. The first SCN is the lowest possible SCN available for capturing changes. The checkpoint retention time is set when you create a capture process, and it can be changed when you alter a capture process. When the checkpoint retention time is exceeded, the first SCN moves forward, and the information in the Streams metadata tables prior to the new first SCN is purged. The space used by this information in the SYSAUX tablespace is reclaimed. To alter the checkpoint retention time for a capture process, use the ALTER_CAPTURE procedure in the DBMS_CAPTURE_ADM package and specify the new retention time with the checkpoint_retention_time parameter.

The default value for the checkpoint retention time is 60 days. If checkpoints are available for a time in the past, then the capture process can be used to recapture changes to recover a destination database. You should set the checkpoint retention time to an appropriate value for your environment. A typical setting is 7 days.
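
For example, the following call sets the checkpoint retention time of a hypothetical capture process named streams_capture to 7 days:

BEGIN
  DBMS_CAPTURE_ADM.ALTER_CAPTURE(
    capture_name              => 'streams_capture',
    checkpoint_retention_time => 7);
END;
/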

Best Practices for Capture Process Operation

The following sections describe best practices for operating existing capture processes:

Perform a Dictionary Build and Prepare Database Objects for Instantiation Periodically

Perform a data dictionary build in the redo log of the source database periodically. Run the DBMS_CAPTURE_ADM.BUILD procedure to build a current copy of the data dictionary in the redo log. Ideally, database objects should be prepared for instantiation after each build. Run one or more of the following procedures in the DBMS_CAPTURE_ADM package to prepare database objects for instantiation:

  • PREPARE_GLOBAL_INSTANTIATION

  • PREPARE_SCHEMA_INSTANTIATION

  • PREPARE_TABLE_INSTANTIATION

Each of the database objects for which a capture process captures changes should be prepared for instantiation periodically. Performing a build and preparing shared objects for instantiation periodically reduces the amount of redo data that must be processed if additional capture processes are created or if an existing capture process must be re-created.
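
For example, the following block performs a build, displays the first SCN that corresponds to the build, and prepares a hypothetical hr schema for instantiation. Record the returned SCN so that it can be specified if a capture process must be created or re-created later:

SET SERVEROUTPUT ON
DECLARE
  scn NUMBER;
BEGIN
  DBMS_CAPTURE_ADM.BUILD(first_scn => scn);
  DBMS_OUTPUT.PUT_LINE('First SCN corresponding to the build: ' || scn);
  DBMS_CAPTURE_ADM.PREPARE_SCHEMA_INSTANTIATION(
    schema_name => 'hr');
END;
/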

Minimize the Performance Impact of Batch Processing

For best performance, the commit point for a batch processing job should be kept low so that individual transactions remain small. Also, if a large batch processing job must be run at a source database, then consider running it at each Streams replication database independently. If this technique is used, then make sure the changes resulting from the batch processing job are not replicated. To accomplish this, run the DBMS_STREAMS.SET_TAG procedure in the session that runs the batch processing job, and set the session tag to a value that will not be captured by a capture process.
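
For example, assuming the capture process rules were created with the include_tagged_lcr parameter set to FALSE (the default for rules created by the DBMS_STREAMS_ADM package), setting any non-NULL tag in the batch session prevents those rules from capturing the batch changes. The tag value '1D' below is arbitrary:

-- Set a non-NULL tag before running the batch job in this session
BEGIN
  DBMS_STREAMS.SET_TAG(tag => HEXTORAW('1D'));
END;
/

-- Run the batch processing job in this session ...

-- Reset the tag to NULL so that subsequent changes are captured
BEGIN
  DBMS_STREAMS.SET_TAG(tag => NULL);
END;
/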