Oracle® Database High Availability Overview
10g Release 2 (10.2)

Part Number B14210-02

4 High Availability Architectures

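This chapter describes high availability architectures in an Oracle environment. It includes the following sections:

  • Oracle Database High Availability Architectures

  • Choosing the Correct High Availability Architecture

  • Assessing Other Architectures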

4.1 Oracle Database High Availability Architectures

Oracle Database 10g provides a full range of capabilities to protect from all causes of system downtime, both planned and unplanned. Table 4-1 shows the outage types and the Oracle database capabilities and features that most effectively prevent, tolerate, or repair each outage type.

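This section describes the following top database architectures that address various high availability business needs:

  • Oracle Database 10g

  • Oracle Database 10g with RAC

  • Oracle Database 10g with Data Guard

  • Oracle Database 10g with RAC and Data Guard - MAA

  • Oracle Database 10g with Streams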

Oracle provides a wide array of high availability architectural solutions. The Oracle Database 10g architecture contains many availability features and assets that are used by all other architectures, and it is the starting point for most customers. Oracle Database 10g with RAC, Oracle Database 10g with Data Guard, and Oracle Database 10g with Streams each add high availability capabilities on top of those in Oracle Database 10g. MAA combines the advantages of RAC and Data Guard and represents the architecture with maximum availability. Choosing an architecture with more availability features does not necessarily lead to higher costs. In fact, RAC technology and grid computing enable a more available and resilient architecture to be attained with a lower total cost of ownership than most legacy high availability features. Figure 4-1 illustrates the hierarchy of the different high availability architectures.

Figure 4-1 Hierarchy of High Availability Architectures


The following sections provide further details on the various Oracle database high availability architectures:

4.1.1 Oracle Database 10g

Oracle provides high availability features that can be used in any of the database architectures. These features make the standalone database on a single computer attractive and available:

4.1.2 Oracle Database 10g with RAC

Oracle Database 10g with RAC architecture uses Real Application Clusters and is an inherently high availability system. The clusters that are typical of RAC environments can provide continuous service during both planned and unplanned outages. RAC builds higher levels of availability on top of the standard Oracle features. All single-instance high availability features, such as flashback technologies and online reorganization, apply to RAC as well.

In addition to the standard Oracle features, RAC exploits the redundancy that is provided by clustering to deliver availability with n - 1 node failures in an n-node cluster. All users have access to all nodes as long as there is one available node in the cluster.
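The n - 1 rule can be illustrated with a toy model. The following is a minimal sketch, not Oracle code: the `Cluster` class and its method names are hypothetical, and it models only node membership, not instance recovery or connection failover.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Cluster:
    """Toy model of an n-node cluster (hypothetical; not an Oracle API)."""

    nodes: Set[str]
    failed: Set[str] = field(default_factory=set)

    def fail(self, node: str) -> None:
        """Mark a node as failed."""
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.failed.add(node)

    def surviving_nodes(self) -> Set[str]:
        """Return the nodes that have not failed."""
        return self.nodes - self.failed

    def is_available(self) -> bool:
        # Service continues while at least one node survives,
        # that is, with up to n - 1 failures in an n-node cluster.
        return len(self.surviving_nodes()) >= 1


cluster = Cluster(nodes={"node1", "node2", "node3"})
cluster.fail("node1")
cluster.fail("node2")
print(cluster.is_available())  # True: one of three nodes is still up
cluster.fail("node3")
print(cluster.is_available())  # False: no node is left
```

In this model, availability is lost only when the last node fails. Capacity, by contrast, shrinks with each failure.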

This architecture provides the following benefits:

  • Fast node failover (measured in minutes) and instance failover (measured in seconds)

  • Integrated and intelligent connection and service failover across various instances

  • Planned node, instance, and service switchover and switchback

  • Rolling patch upgrades

  • Rolling release upgrades of Oracle Clusterware

  • Multiple active instance availability and scalability across multiple nodes

  • Comprehensive manageability that integrates database and cluster features

  • Extensive cluster and application services that allow the database and application services to be restarted or relocated in case of failures

Figure 4-2 shows Oracle Database 10g with RAC architecture.

Figure 4-2 Oracle Database 10g with RAC Architecture


4.1.3 Oracle Database 10g with Data Guard

Oracle Data Guard ensures high availability, data protection, and disaster recovery for enterprise data. Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable Oracle databases to survive disasters and data corruption. Data Guard maintains these standby databases as transactionally consistent copies of the production database. If the production database becomes unavailable due to a planned or unplanned outage, Data Guard can switch any standby database to the production role, minimizing the downtime associated with the outage. Data Guard can be used with traditional backup, restoration, and cluster technologies to provide a high level of data protection and availability. With Data Guard, administrators can optionally improve production database performance by diverting resource-intensive backup and reporting operations to standby systems.

Using a backup copy of the primary database, it is possible to create up to nine standby databases and integrate them in a Data Guard configuration. Once created, Data Guard automatically maintains each standby database by transmitting redo data from the primary database and applying the redo to the standby database. Similar to a primary database, a standby database can be either an Oracle single-instance or RAC database.
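The redo-based maintenance described above can be sketched with a deliberately simplified model. This is not how Data Guard is implemented: `Primary`, `Standby`, and the (key, value) redo record format are hypothetical names introduced for illustration, whereas real redo describes changes to database blocks. Only the nine-standby limit is taken from the text.

```python
from typing import Dict, List, Tuple

MAX_STANDBYS = 9  # limit stated in the text above

RedoRecord = Tuple[str, str]  # hypothetical (key, new_value) change record


class Standby:
    """Toy standby database that applies redo in order."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[str, str] = {}
        self.applied = 0  # number of redo records applied so far

    def apply(self, redo: List[RedoRecord]) -> None:
        # Apply only the records that have not been applied yet.
        for key, value in redo[self.applied:]:
            self.data[key] = value
        self.applied = len(redo)


class Primary:
    """Toy primary database that ships redo to its standbys."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.redo: List[RedoRecord] = []
        self.standbys: List[Standby] = []

    def add_standby(self, standby: Standby) -> None:
        if len(self.standbys) >= MAX_STANDBYS:
            raise ValueError("configuration already has nine standby databases")
        # Simplification: instead of restoring a backup copy, the new
        # standby catches up by replaying all redo generated so far.
        standby.apply(self.redo)
        self.standbys.append(standby)

    def commit(self, key: str, value: str) -> None:
        self.data[key] = value
        self.redo.append((key, value))
        for standby in self.standbys:  # transmit redo to every standby
            standby.apply(self.redo)


primary = Primary()
primary.commit("order:1", "NEW")
standby = Standby("stby1")
primary.add_standby(standby)
primary.commit("order:1", "SHIPPED")
print(standby.data == primary.data)  # True
```

The sketch applies redo synchronously inside `commit` for brevity. It does not model protection modes, transport lag, or apply lag.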

A standby database can be either a physical standby database or a logical standby database. A physical standby database provides a physically identical copy of the primary database, with on-disk database structures that are identical to the primary database on a block-for-block basis. A physical standby database is synchronized with the primary database through Redo Apply, which recovers the redo data received from the primary database and applies it to the physical standby database. A physical standby database can be used for business purposes other than disaster recovery on a limited basis.

Physical standby databases provide these advantages:

  • Protection from user errors and logical corruption

  • Protection from disasters and site failures if located remotely

  • Fast site and database failover (less than one minute to five minutes)

  • Fast-start failover provides the ability to automatically, quickly, and reliably fail over to a designated, synchronized standby database in the event of primary database failure

  • Fast site and database planned switchovers for maintenance

  • Using Flashback Database, a Redo Apply standby database can diverge for reporting or testing purposes and resynchronize with its primary database once complete

  • Backups can be taken from the physical standby database instead of the production database, relieving the load on the production database

  • Read-only capability, resulting in better use of system resources

  • Greater support for fast application notification and application callouts resulting in better full-stack application failover

A logical standby database can be used for other business purposes in addition to disaster recovery. Users can access a logical standby database for queries and reporting purposes. Using a logical standby database, it is possible to upgrade Oracle database software and patch sets with minimal downtime. Therefore, a logical standby database can be concurrently used for data protection, reporting, and database upgrade purposes.

In addition to disaster recovery and data protection, logical standby databases provide the following benefits:

  • Enable the standby database to be open for normal operations with both read-only and read/write accessibility

  • Enable additional objects to be built and maintained

  • Enable rolling database upgrades of the production database

A recommended configuration for many cases includes both physical and logical standby databases. They can reside on the same database computer or cluster, but they should be remote from the production database. The physical standby database can be reserved for failovers in case of disaster, and the logical standby database can continue to be used for reporting. The physical standby database provides a faster apply technology because redo logs do not have to be converted to SQL.

Figure 4-3 shows the production database at the primary site and the standby databases at the secondary site.

Figure 4-3 Oracle Database 10g with Data Guard Architecture on Primary and Secondary Sites


4.1.4 Oracle Database 10g with RAC and Data Guard - MAA

RAC and Data Guard provide the basis of Oracle Database 10g - MAA. Maximum Availability Architecture (MAA) provides the most comprehensive architecture for reducing downtime for scheduled outages and preventing, detecting, and recovering from unscheduled outages. The recommended MAA has two identical sites. The primary site contains the RAC primary database, and the secondary site contains a RAC standby database.

Identical site configuration is recommended to ensure that performance is not sacrificed after a failover or switchover. Symmetric sites also enable processes and procedures to be kept the same between sites, making operational tasks easier to maintain and execute.

MAA encompasses RAC, Data Guard, and a set of recommended best practices for configuring and managing the architecture as well as recovering from various outages. For more information, visit the MAA Web site at:

http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm

Figure 4-4 provides an overview of Oracle Database 10g with RAC and Data Guard - MAA.

Figure 4-4 Oracle Database 10g with RAC and Data Guard - MAA


4.1.5 Oracle Database 10g with Streams

Oracle Streams is meant for information sharing and distribution. It can also provide an efficient and highly available architecture.

Oracle Database 10g with Streams provides granularity and control over what is replicated and how it is replicated. It supports bidirectional replication, data transformations, subsetting, custom apply functions, and heterogeneous platforms. It also gives users complete control over the routing of change records from the primary database to a replica database. The capture of data changes can be performed at the primary database or downstream at a replica database. This enables users to build hub and spoke network configurations that can support hundreds of replica databases.
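The hub-and-spoke idea, with subsetting and transformations, can be sketched as follows. This is an illustrative model only and does not use the Oracle Streams API: `Hub`, `Replica`, and the dictionary-based change record are hypothetical names, and the sketch omits queues, propagation schedules, and conflict handling.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Change = Dict[str, str]  # hypothetical change record: table, key, value


class Replica:
    """A spoke database that applies the changes it subscribes to."""

    def __init__(self, name: str, tables: Iterable[str],
                 transform: Optional[Callable[[Change], Change]] = None) -> None:
        self.name = name
        self.tables = set(tables)      # subsetting: tables replicated here
        self.transform = transform     # optional data transformation
        self.rows: Dict[Tuple[str, str], str] = {}

    def apply(self, change: Change) -> None:
        if change["table"] not in self.tables:
            return                     # outside this replica's subset
        if self.transform is not None:
            change = self.transform(change)
        self.rows[(change["table"], change["key"])] = change["value"]


class Hub:
    """Captures changes at the primary and routes them to every spoke."""

    def __init__(self) -> None:
        self.spokes: List[Replica] = []

    def add_spoke(self, replica: Replica) -> None:
        self.spokes.append(replica)

    def capture(self, table: str, key: str, value: str) -> None:
        for spoke in self.spokes:
            # Each spoke gets its own copy of the change record.
            spoke.apply({"table": table, "key": key, "value": value})


hub = Hub()
reporting = Replica("reporting", ["orders"],
                    transform=lambda c: {**c, "value": c["value"].upper()})
crm = Replica("crm", ["customers"])
hub.add_spoke(reporting)
hub.add_spoke(crm)
hub.capture("orders", "1001", "shipped")
print(reporting.rows)  # {('orders', '1001'): 'SHIPPED'}
print(crm.rows)        # {}
```

Each replica decides what it accepts and how it reshapes a change, which is the kind of fine-grained control that distinguishes this approach from a full standby copy.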

Oracle Database 10g with Streams should be evaluated if one or more of the following conditions are true:

  • A full active/active site configuration is required, including bidirectional changes

  • Site configurations are on heterogeneous platforms

  • Different character sets are required between the primary database and its replicas

  • Fine control of information and data sharing is required

  • More investment and expertise to build and maintain an integrated high availability solution are available

For disaster recovery, Data Guard is Oracle's recommended solution.

Figure 4-5 shows Oracle Database 10g with Streams with local capture running at the primary database.

Figure 4-5 Oracle Database 10g with Streams


4.2 Choosing the Correct High Availability Architecture

This section summarizes the advantages of the high availability architectures discussed in this chapter and provides guidelines for you to choose the correct high availability architecture for your business.

Oracle Database 10g with RAC and Oracle Database 10g with Data Guard are the most common Oracle high availability architectures, and each provides very significant high availability advantages. MAA provides the most redundant and robust high availability architecture. It prevents, detects, and recovers from different outages to meet stringent RTO and RPO requirements, as well as preventing or minimizing downtime for maintenance. Oracle Database 10g with Streams is an alternative high availability solution, but it requires more customization and administrative effort, and may not be as transparent to the application.

The baseline high availability architecture is Oracle Database 10g. Consider using:

Table 4-2 identifies the additional capabilities provided by the architectures that build upon Oracle Database 10g.

Table 4-2 Additional Capabilities of High Level Oracle High Availability Architectures

Each Oracle high availability architecture is listed with its key characteristics and additional capabilities.

Oracle Database 10g with RAC

  • Transparent to application

  • Fast repair for human error

  • Fast failover for computer failure and storage failure

  • Scalability beyond a single system

  • Reduced downtime for computer maintenance

Oracle Database 10g with Data Guard

  • Transparent to application

  • Fast repair for human error

  • Fast failover for computer failure, storage failure, and data corruption

  • Protection from site failure

  • Reduced downtime for computer or site maintenance

Oracle Database 10g with RAC and Data Guard - MAA

  • Transparent to application

  • Fast repair for human error

  • Fast failover for computer failure, storage failure, and data corruption

  • Protection from site failure

  • Scalability beyond a single system

  • Reduced downtime for computer or site maintenance

Oracle Database 10g with Streams (see Footnote 1)

  • Fast repair for human error

  • Replica database(s) available for read/write use

  • Provides heterogeneous platform support

  • Fast failover for computer failure and storage failure

  • Protection from site failure

  • Reduced downtime for computer or site maintenance

Footnote 1 Requires planning and overhead to make solution robust
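One way to use Table 4-2 is to treat each architecture as a set of capabilities and keep those that cover a list of requirements. The sketch below is an illustration, not an Oracle tool. The capability strings are paraphrased from the table, combined entries such as "computer or site maintenance" are split into separate items as an assumption, and `candidates` is a hypothetical helper name.

```python
from typing import Dict, List, Set

COMMON_DG = {
    "transparent to application",
    "fast repair for human error",
    "fast failover for computer failure",
    "fast failover for storage failure",
    "fast failover for data corruption",
    "protection from site failure",
    "reduced downtime for computer maintenance",
    "reduced downtime for site maintenance",
}

# Capabilities per architecture, paraphrased from Table 4-2 (table order).
CAPABILITIES: Dict[str, Set[str]] = {
    "Oracle Database 10g with RAC": {
        "transparent to application",
        "fast repair for human error",
        "fast failover for computer failure",
        "fast failover for storage failure",
        "scalability beyond a single system",
        "reduced downtime for computer maintenance",
    },
    "Oracle Database 10g with Data Guard": set(COMMON_DG),
    "Oracle Database 10g with RAC and Data Guard - MAA":
        COMMON_DG | {"scalability beyond a single system"},
    "Oracle Database 10g with Streams": {
        "fast repair for human error",
        "replica available for read/write use",
        "heterogeneous platform support",
        "fast failover for computer failure",
        "fast failover for storage failure",
        "protection from site failure",
        "reduced downtime for computer maintenance",
        "reduced downtime for site maintenance",
    },
}


def candidates(required: Set[str]) -> List[str]:
    """Return architectures whose capabilities cover every requirement."""
    known = set().union(*CAPABILITIES.values())
    unknown = required - known
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    return [name for name, caps in CAPABILITIES.items() if required <= caps]


print(candidates({"protection from site failure",
                  "scalability beyond a single system"}))
# ['Oracle Database 10g with RAC and Data Guard - MAA']
```

A capability match is only a first filter. The footnote on Streams and the recovery times in the following tables still need to be weighed against the business requirements.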

Table 4-3 shows the attainable recovery times for all types of unplanned downtime for each Oracle high availability architecture.

Table 4-3 Attainable Recovery Times for Unplanned Outages

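For each outage type, the attainable recovery time is listed by architecture.

Computer failure

  • Oracle Database 10g: Minutes to hours (Footnote 1)

  • RAC: No downtime (Footnote 2)

  • Data Guard: Seconds to five minutes

  • MAA: No downtime

  • Streams: No downtime

Storage failure

  • Oracle Database 10g: No downtime (Footnote 3)

  • RAC: No downtime (Footnote 3)

  • Data Guard: No downtime (Footnote 3)

  • MAA: No downtime (Footnote 3)

  • Streams: No downtime (Footnote 3)

Human error

  • Oracle Database 10g: < 30 minutes (Footnote 4)

  • RAC: < 30 minutes (Footnote 4)

  • Data Guard: < 30 minutes (Footnote 4)

  • MAA: < 30 minutes (Footnote 4)

  • Streams: < 30 minutes (Footnote 4)

Data corruption

  • Oracle Database 10g: HARD prevents data corruption (Footnote 5); potentially hours (Footnote 6)

  • RAC: HARD prevents data corruption (Footnote 5); potentially hours (Footnote 6)

  • Data Guard: HARD prevents data corruption (Footnote 5); seconds to five minutes

  • MAA: HARD prevents data corruption (Footnote 5); seconds to five minutes

  • Streams: HARD prevents data corruption (Footnote 5); seconds to five minutes

Site failure

  • Oracle Database 10g: Hours to days

  • RAC: Hours to days

  • Data Guard: Seconds to five minutes (Footnote 7)

  • MAA: Seconds to five minutes (Footnote 8)

  • Streams: No downtime (Footnote 7)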


Footnote 1 Recovery time consists largely of the time it takes to restore the failed system.

Footnote 2 Database is still available, but portion of application connected to failed system is temporarily affected.

Footnote 3 Storage failures are prevented by using ASM with mirroring and its automatic rebalance capability.

Footnote 4 Recovery time for human errors depends primarily on detection time. If it takes seconds to detect a malicious DML or DDL transaction, it typically requires only seconds to flash back the appropriate transactions. A longer detection time usually leads to a longer recovery time to repair the appropriate transactions. An exception is undropping a table, which is virtually instantaneous regardless of detection time.

Footnote 5 Not all types of data corruption are prevented. For the most recent information about the HARD initiative, refer to http://www.oracle.com/technology/deploy/availability/htdocs/HARD.html.

Footnote 6 Recovery time depends on the age of the backup used for recovery and the number of log changes scanned to make the corrupt data consistent with the database.

Footnote 7 Recovery time indicated applies to database and existing connection failover. Network connection changes and other site-specific failover activities may lengthen overall recovery time.

Footnote 8 The portion of any application connected to the failed system is temporarily affected.
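The recovery times in Table 4-3 can be compared against a recovery time objective (RTO). The sketch below covers only the computer failure and site failure rows. The numeric upper bounds are assumptions made for illustration: "Seconds to five minutes" is treated as at most five minutes, and open-ended entries such as "Minutes to hours" are treated as having no fixed bound. `meets_rto` is a hypothetical helper, and the footnotes that qualify "No downtime" are not modeled.

```python
from typing import Dict, List, Optional

ARCHITECTURES = ["Oracle Database 10g", "RAC", "Data Guard", "MAA", "Streams"]

# Two rows of Table 4-3, in the column order of ARCHITECTURES.
RECOVERY: Dict[str, List[str]] = {
    "computer failure": ["Minutes to hours", "No downtime",
                         "Seconds to five minutes", "No downtime",
                         "No downtime"],
    "site failure": ["Hours to days", "Hours to days",
                     "Seconds to five minutes", "Seconds to five minutes",
                     "No downtime"],
}

# Assumed upper bounds in minutes; None means no fixed bound is given.
UPPER_BOUND_MINUTES: Dict[str, Optional[float]] = {
    "No downtime": 0,
    "Seconds to five minutes": 5,
    "Minutes to hours": None,
    "Hours to days": None,
}


def meets_rto(outage: str, rto_minutes: float) -> List[str]:
    """Return architectures whose assumed bound is within the RTO."""
    result = []
    for arch, cell in zip(ARCHITECTURES, RECOVERY[outage]):
        bound = UPPER_BOUND_MINUTES[cell]
        if bound is not None and bound <= rto_minutes:
            result.append(arch)
    return result


print(meets_rto("site failure", 5))
# ['Data Guard', 'MAA', 'Streams']
print(meets_rto("computer failure", 1))
# ['RAC', 'MAA', 'Streams']
```

Because the table gives ranges rather than guarantees, a result from this kind of lookup is a starting point for testing, not a commitment.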

Table 4-4 shows the attainable recovery times for all types of planned downtime for each Oracle high availability architecture.

Table 4-4 Attainable Recovery Times for Planned Outages

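For each outage type, the attainable recovery time is listed by architecture.

System changes - Dynamic Resource Provisioning

  • Oracle Database 10g: No downtime

  • RAC: No downtime

  • Data Guard: No downtime

  • MAA: No downtime

  • Streams: No downtime

System changes - Rolling Upgrades: System level upgrade

  • Oracle Database 10g: Minutes to hours

  • RAC: No downtime

  • Data Guard: Seconds to five minutes

  • MAA: No downtime

  • Streams: No downtime

System changes - Rolling Upgrades: Cluster or site wide upgrade

  • Oracle Database 10g: Minutes to hours

  • RAC: Minutes to hours

  • Data Guard: Seconds to five minutes

  • MAA: Seconds to five minutes

  • Streams: No downtime (Footnote 1)

Storage migration

  • Oracle Database 10g: No downtime (Footnote 2)

  • RAC: No downtime (Footnote 2)

  • Data Guard: No downtime (Footnote 2)

  • MAA: No downtime (Footnote 2)

  • Streams: No downtime (Footnote 2)

Database one-off patch

  • Oracle Database 10g: Minutes to hours

  • RAC: No downtime (Footnote 3)

  • Data Guard: Seconds to five minutes

  • MAA: No downtime (Footnote 3)

  • Streams: No downtime

Database patchset and version upgrade

  • Oracle Database 10g: Minutes to hours

  • RAC: Minutes to hours

  • Data Guard: Seconds to five minutes

  • MAA: Seconds to five minutes

  • Streams: No downtime (Footnote 1)

Platform migration

  • Oracle Database 10g: Minutes to hours

  • RAC: Minutes to hours

  • Data Guard: Minutes to hours

  • MAA: Minutes to hours

  • Streams: No downtime (Footnote 1)

Data changes - Online Reorganization and Redefinition

  • Oracle Database 10g: No downtime

  • RAC: No downtime

  • Data Guard: No downtime (Footnote 4)

  • MAA: No downtime (Footnote 4)

  • Streams: No downtime (Footnote 4)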


Footnote 1 The portion of any application connected to the failed system is temporarily affected.

Footnote 2 ASM automatically rebalances stored data when disks are added or removed while the database remains online. For storage migration, both storage arrays must temporarily be used by ASM.

Footnote 3 For qualified one-off patches only

Footnote 4 Tables can be reorganized online using the DBMS_REDEFINITION package. However, the online changes are not supported by SQL Apply or data capture, and therefore the effects of this subprogram are not visible on the logical standby or replica database. For more information, see Oracle Data Guard Concepts and Administration or Oracle Streams Replication Administrator's Guide.

4.3 Assessing Other Architectures

There are other Oracle and non-Oracle high availability and enterprise computing architectures. This section focuses on the most common variants.

Table 4-5 describes common alternative high availability architectures, their disadvantages, and the recommended Oracle high availability architectures.

Table 4-5 Comparison of High Availability Architectures

Each alternative architecture is listed with its disadvantages and the recommended Oracle architectures.

Alternative architecture: Single instance database on hardware cluster

Disadvantages:

  • Node failure requires restart, storage remastering, and reconnection of all application connections

  • No protection from disaster and site failure

  • No protection against data corruption beyond Oracle Database 10g capabilities

  • Limited ability to reduce downtime for system rolling upgrades

  • Inability to reduce downtime for Oracle upgrades

  • Under-utilized hardware resources

  • No database server scalability beyond one node

  • Inability to offload database activities such as backup or reporting

Recommended Oracle architectures:

  1. Oracle Database 10g with RAC

  2. Oracle Database 10g with Data Guard

  3. Oracle Database 10g with RAC and Data Guard - MAA

Alternative architecture: Remote mirrored single instance database

Disadvantages:

  • High network utilization

  • No protection against data corruption beyond Oracle Database 10g capabilities

  • Site failure requires instance restart, storage remastering, and reconnection of all application connections

  • No database server scalability beyond primary node(s)

  • Inability to offload database activities such as backup or reporting

  • Inability to reduce downtime for rolling upgrades

  • Customization required

Recommended Oracle architectures:

  1. Oracle Database 10g with Data Guard

  2. Oracle Database 10g with RAC and Data Guard - MAA

Alternative architecture: RAC database in a stretch cluster configuration

Disadvantages:

  • No protection against data corruption beyond Oracle Database 10g capabilities

  • Limited protection against site failures with regional impact

  • Limited ability to reduce downtime for rolling upgrades

  • High network utilization

  • Limited by distance between nodes in cluster

  • As network latency increases, application performance can be impacted significantly

Recommended Oracle architectures:

  1. Oracle Database 10g with Data Guard

  2. Oracle Database 10g with RAC and Data Guard - MAA

Alternative architecture: RAC database with standby database on same site

Disadvantages:

  • No protection from site failures

Recommended Oracle architecture: Oracle Database 10g with RAC and Data Guard - MAA

Alternative architecture: Single instance database with standby database on same site

Disadvantages:

  • No protection from site failures

  • Limited ability to reduce downtime for rolling upgrades

Recommended Oracle architecture: Oracle Database 10g with RAC and Data Guard - MAA


Table 4-6 describes common traditional enterprise computing architectures, their disadvantages, and the recommended Oracle enterprise computing architectures.

Table 4-6 Comparison of Enterprise Computing Architectures

Each traditional architecture is listed with its disadvantages and the recommended architecture.

Traditional architecture: Monolithic database server

Disadvantages:

  • High cost

  • Fixed scalability

  • Not flexible to changing capacity and resource demands

Recommended architecture: Database Server Grid

Traditional architecture: Monolithic storage array

Disadvantages:

  • High cost

  • Fixed scalability

  • Not flexible to changing capacity and resource demands

Recommended architecture: Database Storage Grid