13 Monitoring Performance

This chapter describes how to monitor Oracle Real Application Clusters (Oracle RAC) performance. This chapter explains how to verify the interconnect settings, how to use the performance views for Oracle RAC, and descriptions of Oracle RAC-specific statistics and how to monitor them. This chapter describes Automatic Workload Repository (AWR), how to analyze Oracle RAC statistics, and how to analyze Oracle RAC wait events. This chapter also explains how to monitor performance in Oracle RAC databases with Enterprise Manager. This chapter contains these topics:

Overview of Monitoring Oracle Real Application Clusters Databases
Verifying the Interconnect Settings for Oracle Real Application Clusters
Performance Views in Oracle Real Application Clusters
Oracle Real Application Clusters Performance Statistics
Automatic Workload Repository in Oracle Real Application Clusters Environments
Monitoring Oracle Real Application Clusters Statistics and Events
Monitoring Performance with Oracle Enterprise Manager

Overview of Monitoring Oracle Real Application Clusters Databases

All single-instance tuning practices for Oracle Database apply to applications running on Oracle RAC databases. Therefore, implement the single-instance tuning methodologies described in Oracle Database Performance Tuning Guide.

Verifying the Interconnect Settings for Oracle Real Application Clusters

The interconnect and internode communication protocols can affect Cache Fusion performance. In addition, the interconnect bandwidth, its latency, and the efficiency of the IPC protocol determine the speed with which Cache Fusion processes block transfers.

Influencing Interconnect Processing

Once your interconnect is operative, you cannot significantly influence its performance. However, you can influence an interconnect protocol's efficiency by adjusting the IPC buffer sizes.

The Oracle Cluster Registry (OCR) stores your system's interconnect information. Use the oifcfg command or OCRDUMP utility to identify the interconnect that you are using. You can then change the interconnect that you are using by running an oifcfg command.

See Also:

Your vendor-specific interconnect documentation for more information about adjusting IPC buffer sizes

Although you rarely need to set the CLUSTER_INTERCONNECTS parameter, you can use it to assign a private network IP address or NIC as in the following example:

CLUSTER_INTERCONNECTS=10.0.0.1

If you are using an operating system-specific vendor IPC protocol, then the trace information may not reveal the IP address.

Note:

You can also use the oifcfg command as described in "Administering System and Network Interfaces with OIFCFG" to assign private network or private IP addresses.

Performance Views in Oracle Real Application Clusters

Each instance has a set of instance-specific views. You can also query global dynamic performance views to retrieve performance information from all of the qualified instances. Global dynamic performance view names are prefixed with GV$. A global view contains all columns from its respective instance-specific view as well as the INST_ID column. The instance number is also appended to the names of the archived redo log threads to create a unique identifier for each instance's archived redo logs.

See Also:

Oracle Database Reference for restrictions on GV$ views and complete descriptions of related parameters and views

Creating Oracle Real Application Clusters Data Dictionary Views with CATCLUST.SQL

If you did not create your Oracle RAC database with the Database Configuration Assistant (DBCA), then you must run the CATCLUST.SQL script to create Oracle RAC-related views and tables. You must have SYSDBA privileges to run this script.

See Also:

our Oracle Real Application Clusters installation and configuration guide for more information about creating your Oracle RAC database

Oracle Real Application Clusters Performance Statistics

This section provides an overview of the V$ and GV$ views that provide statistics that you can use evaluate block transfers in your cluster. Use these statistics to analyze interconnect block transfer rates as well as the overall performance of your Oracle RAC database.

The Content of Oracle Real Application Clusters Statistics

Oracle RAC-specific statistics appear as message request counters or as timed statistics. Message request counters include statistics showing the number of certain types of block mode conversions. Timed statistics reveal the total or average time waited for read and write I/O for particular types of operations.

Automatic Workload Repository in Oracle Real Application Clusters Environments

In Oracle RAC environments, each Automatic Workload Repository (AWR) snapshot captures data from all active instances within the cluster. The data for each snapshot set that is captured for all active instances is from the same point in time. In addition, the data for each instance is stored separately and is identified with an instance identifier. For example, the buffer_busy_wait statistic shows the number of buffer waits on each instance. AWR does not store data that is aggregated from across the entire cluster. In other words, the data is stored for each individual instance.

Monitoring Oracle Real Application Clusters Statistics and Events

This section explains wait events and statistics specific to Oracle RAC and how to interpret them when assessing performance data generated by the Automatic Workload Repository, Statspack, or by ad-hoc queries of the dynamic performance views.

See Also:

Oracle Database Performance Tuning Guide for more information about wait event analysis and the spdoc.txt file for details about the Statspack utility

Oracle RAC Statistics and Events in AWR and Statspack Reports

The statistics snapshots generated by AWR and Statspack can be evaluated by producing reports displaying summary data such as load and cluster profiles based on regular statistics and wait events gathered on each instance.

Most of the relevant data is summarized on the Oracle RAC Statistics Page. This information includes:

Global cache load profile
Global cache efficiency percentages: workload characteristics
Global cache and Enqueue Service (GES): messaging statistics

Additional Oracle RAC-related sections appear later in the report:

Global enqueue statistics
Global CR statistics
Global CURRENT served statistics
Global cache transfer statistics.

Oracle Real Application Clusters Wait Events

Analyzing and interpreting what sessions are waiting for is an important method to determine where time is spent. In Oracle RAC, the wait time is attributed to an event which reflects the exact outcome of a request. For example, when a session on an instance is looking for a block in the global cache, it does not know whether it will receive the data cached by another instance or whether it will receive a message to read from disk. The wait events for the global cache now convey precise information and waiting for global cache blocks or messages is:

Summarized in a broader category called Cluster Wait Class
Temporarily represented by a placeholder event which is active while waiting for a block, for example:
- gc current block request
- gc cr block request
Attributed to precise events when the outcome of the request is known, for example:
- gc current block 3-way
- gc current block busy
- gc cr block grant 2-way

In summary, the wait events for Oracle RAC convey information valuable for performance analysis. They are used in Automatic Database Diagnostic Monitor (ADDM) to enable precise diagnostics of the impact of cache fusion.

Monitoring Performance by Analyzing GCS and GES Statistics

In order to determine the amount of work and cost related to inter-instance messaging and contention, examine block transfer rates, remote requests made by each transaction, the number and time waited for global cache events as described under the following headings:

Analyzing Cache Fusion Impact in Oracle Real Application Clusters
Analyzing Performance Using GCS and GES Statistics

Analyzing Cache Fusion Impact in Oracle Real Application Clusters

The effect of accessing blocks in the global cache and maintaining coherency is represented by

The Global Cache Service statistics for current and cr blocks, for example, gc current blocks received, gc cr blocks received, and so on)
The Global Cache Service wait events, for gc current block 3-way, gc cr grant 2-way, and so on.

The response time for cache fusion transfers is determined by the messaging and processing times imposed by the physical interconnect components, the IPC protocol and the GCS protocol. It is not affected by disk I/O factors other than occasional log writes. The cache fusion protocol does not require I/O to data files in order to guarantee cache coherency and Oracle RAC inherently does not cause any more I/O to disk than a non-clustered instance.

Analyzing Performance Using GCS and GES Statistics

This section describes how to monitor Global Cache Service performance by identifying data blocks and objects which are frequently used ("hot") by all instances. High concurrency on certain blocks may be identified by Global Cache Service wait events and times.

The gc current block busy wait event indicates that the access to cached data blocks was delayed because they were busy either in the remote or the local cache. This means that the blocks were pinned or held up by sessions or delayed by a log write on a remote instance or that a session on the same instance is already accessing a block which is in transition between instances and the current session needs to wait behind it (for example, gc current block busy).

The V$SESSION_WAIT view to identify objects and data blocks with contention. The gc wait events contain the file and block number for a block request in p1 and p2, respectively.

An additional segment statistic, gc buffer busy, has been added to quickly determine the "busy" objects without recourse to the query on V$SESSION_WAIT mentioned earlier.

The AWR infrastructure provides a view of active session history which can also be used to trace recent wait events and their arguments. It is therefore useful for hot block analysis. Most of the reporting facilities used by AWR and Statspack contain the object statistics and cluster wait class category, so that sampling of the views mentioned earlier is largely unnecessary.

It is advisable to run ADDM on the snapshot data collected by the AWR infrastructure to obtain an overall evaluation of the impact of the global cache. The advisory will also identify the busy objects and SQL highest cluster wait time.

Analyzing Cache Fusion Transfer Impact Using GCS Statistics

This section describes how to monitor Global Cache Service performance by identifying objects read and modified frequently and the service times imposed by the remote access. Waiting for blocks to arrive may constitute a significant portion of the response time, in the same way that reading from disk could increase the block access delays, only that cache fusion transfers in most cases are faster than disk access latencies.

The following wait events indicate that the remotely cached blocks were shipped to the local instance without having been busy, pinned or requiring a log flush:

gc current block 2-way
gc current block 3-way
gc cr block 2-way
gc cr block 3-way

The object statistics for gc current blocks received and gc cr blocks received enable quick identification of the indexes and tables which are shared by the active instances. As mentioned earlier, creating an ADDM analysis will, in most cases, point you to the SQL statements and database objects that could be impacted by inter-instance contention.

Note:

You must run Statspack at level 7 to collect statistics related to block contention and segment block waits.

Any increases in the average wait times for the events mentioned earlier could be caused by the following occurrences:

High load: CPU shortages, long run queues, scheduling delays
Misconfiguration: using public instead of private interconnect for message and block traffic

If the average wait times are acceptable and no interconnect or load issues can be diagnosed, then the accumulated time waited can usually be attributed to a few SQL statements which need to be tuned to minimize the number of blocks accessed.

The column CLUSTER_WAIT_TIME in V$SQLAREA represents the wait time incurred by individual SQL statements for global cache events and will identify the SQL which may need to be tuned.

Analyzing Response Times Based on Wait Events

Most global cache wait events that show a high total time as reported in the AWR and Statspack reports or in the dynamic performance views are normal and may present themselves as the top database time consumers without actually indicating a problem. This section describes the most important and frequent wait events that you should be aware of when interpreting performance data.

If user response times increases and a high proportion of time waited is for global cache (gc), then the cause should be determined. Most reports include a breakdown of events sorted by percentage of the total time.

It is useful to start with an ADDM report, which would analyze the routinely collected performance statistics with respect to their impact and point to the objects and SQL contributing most to the time waited, and then move on to the more detailed reports produced by AWR and Statspack. The most important wait events for Oracle RAC include various categories, such as:

Block-oriented
- gc current block 2-way
- gc current block 3-way
- gc cr block 2-way
- gc cr block 3-way
Message-oriented
- gc current grant 2-way
- gc cr grant 2-way
Contention-oriented
- gc current block busy
- gc cr block busy
- gc buffer busy
Load-oriented
- gc current block congested
- gc cr block congested

The block-oriented wait event statistics indicate that a block was received as either the result of a 2-way or a 3-way message, that is, the block was sent from either the resource master requiring 1 message and 1 transfer, or was forwarded to a third node from which it was sent, requiring 2 messages and 1 block transfer.

The gc current block busy and gc cr block busy wait events indicate that the remote instance received the block after a remote instance processing delay. In most cases, this is due to a log flush. The existence of these wait events does not necessarily characterize high concurrency for the blocks. High concurrency is instead evidenced by the existence of the gc buffer busy event. This event indicates that the block was pinned or held up by a session on a remote instance. It can also indicate that a session on the same instance has already requested the block, which in either case is in transition between instances and the current session needs to wait behind it.

These events are usually the most frequent in the absence of block contention and the length of the wait is determined by the time it takes on the physical network, the time to process the request in the serving instances and the time it takes for the requesting process to wake up after the block arrives.

The average wait time and the total wait time should be considered when being alerted to performance issues where these particular waits have a high impact. Usually, either interconnect or load issues or SQL execution against a large shared working set can be found to be the root cause.

The message-oriented wait event statistics indicate that no block was received because it was not cached in any instance. Instead a global grant was given, enabling the requesting instance to read the block from disk or modify it.

If the time consumed by these events is high, then it may be assumed that the frequently used SQL causes a lot of disk I/O (in the event of the cr grant) or that the workload inserts a lot of data and needs to find and format new blocks frequently (in the event of the current grant).

The contention-oriented wait event statistics indicate that a block was received which was pinned by a session on another node, was deferred because a change had not yet been flushed to disk or because of high concurrency, and therefore could not be shipped immediately. A buffer may also be busy locally when a session has already initiated a cache fusion operation and is waiting for its completion when another session on the same node is trying to read or modify the same data. High service times for blocks exchanged in the global cache may exacerbate the contention, which can be caused by frequent concurrent read and write accesses to the same data.

The load-oriented wait events indicate that a delay in processing has occurred in the GCS, which is usually caused by high load, CPU saturation and would have to be solved by additional CPUs, load-balancing, off loading processing to different times or a new cluster node.For the events mentioned, the wait time encompasses the entire round trip from the time a session starts to wait after initiating a block request until the block arrives.

Monitoring Performance with Oracle Enterprise Manager

Oracle Enterprise Manager is an Oracle Web-based integrated management solution for monitoring and administering your computing environment. From any location where you can access a Web browser, you can manage Oracle RAC databases, application servers, host computers, and Web applications, as well as related hardware and software. For example, you can monitor your Oracle RAC database performance from your office, home, or a remote site, as long as you have access to a Web browser.

See Also:

Oracle Enterprise Manager Concepts and the Enterprise Manager Online Help for more information about Enterprise Manager

Overview of Enterprise Manager Monitoring

You can install, configure, and monitor an Oracle RAC database from a single location using either Oracle Enterprise Manager Database Control or Oracle Enterprise Manager Grid Control as follows:

Oracle Enterprise Manager Database Control is designed to manage a single Oracle RAC10g Database with its instance, Listener, host, cluster, and Automatic Storage Management (ASM) targets. The Database Configuration Assistant (DBCA) automatically configures Database Control when you create a database.
Oracle Enterprise Manager Grid Control enables you to centrally manage your entire computing enterprise. You can install Grid Control separately onto servers either inside or outside of your cluster to manage multiple Oracle RAC databases and multiple cluster targets. Grid Control consists of the Grid Control console and one or more Oracle Management Services, the Oracle Management Repository, and remote Oracle Management Agents installed on each monitored host.

Both Database Control and Grid Control are cluster-aware and provide a central console to manage your cluster database. From the Cluster Database pages, you can:

View the overall system status, such as the number of nodes in the cluster database and their current statuses. This high-level view capability means that you do not have to access each individual database instance for details if you just want to see inclusive, aggregated information.
View alerts aggregated across all of the instances with drilldowns to the source of each alert. An alert is an indicator that signifies that a particular metric condition has been encountered. A metric is a unit of measurement used to report the system's conditions.
Set thresholds for alert generation on a cluster database-wide basis. Thresholds are values that indicate whether a warning or critical alert should be initiated.
Monitor performance metrics aggregated across all of the instances or displayed side by side so you can readily compare instances.
Monitor cluster cache coherency statistics to help you identify processing trends and optimize performance for your Oracle RAC environment.
Perform cluster database-wide operations, including initiating back up and recovery operations, starting and stopping instances, and so on.

Enterprise Manager accumulates data over specified periods of time, called collection-based data. Enterprise Manager also provides current data, known as real-time data. The following sections explain how to monitor both types of data.

Collection-Based Monitoring

Enterprise Manager responds to metrics from an Oracle RAC database and publishes alerts when thresholds are exceeded. The following sections explain metrics, alerts, and thresholds.

Metrics

Metrics are units of measurement used to determine the health of a target. It is through the use of metrics and associated thresholds that Enterprise Manager sends out alerts notifying you of problems with the target. A complete list of all the metrics for a target are available by clicking All Metrics on the target's home page. From the displayed All Metrics page, you can view all of the metrics for the target and then drill down to any particular metric for additional information. Examples of Oracle RAC database metric categories are Tablespaces Full, Data Guard Status, and Database Cardinality. Examples of Oracle RAC database instance metric categories are Global Cache Statistics, Instance Throughput, and Alert Log.

Enterprise Manager monitors metrics at the database and instance levels, and any available node can monitor database metrics. However, only one node at a time monitors the entire database while each node monitors metrics for its local instance.

Alerts

An alert indicates a potential problem. Either a warning or critical threshold for a monitored metric has been crossed, or the target is no longer available. You can receive critical alerts when a monitored metric has crossed its critical threshold, or warning alerts when a monitored metric has crossed its warning threshold. Also, you can receive alerts on various target availability states, such as when the:

Target is down
Target is blacked out
Oracle Management Agent that is monitoring the target is unreachable

An alert can also signal that a warning or critical threshold has been crossed. You can define a response action to be executed if a metric reaches the level for warnings or critical severity alerts once a threshold is crossed.

When you receive an alert, you can use Enterprise Manager to obtain more information to help you decide how to address the issue. For example, using the Warning Alerts page, a message for a target host could state that the CPU Utilization is 93.925667%. From this description, you can decide how to best redistribute the CPU load.

Thresholds

Thresholds are boundary values against which monitored metric values are compared. You can specify a warning threshold so that when a monitored metric value crosses that threshold, a warning alert is generated. This alert then notifies you about impending problems that you can address in a timely manner.

If a threshold is exceeded for a metric, then an alert appears on the Cluster Database Home page. You can also copy thresholds from one instance to other instances in Database Control and Grid Control, or you can copy thresholds between different Oracle RAC databases in Grid Control. Database Control does not manage multiple Oracle RAC databases.

Real-Time Performance Monitoring

Enterprise Manager provides performance pages to monitor all levels of a cluster environment, including the cluster, the cluster database, and the cluster database instances. Enterprise Manager uses data from the Automatic Workload Repository (AWR) to display performance information and initiate database alerts. Statistics collected by AWR are aggregated for all instances in an Oracle RAC database and displayed on a summary Enterprise Manager Performance page.

Oracle Enterprise Manager provides several real-time performance charts and drilldowns for the targets that you manage. Enterprise Manager displays both aggregate and instance-specific performance statistics using color-coded charts for easier viewing. To help you identify the source of a problem and resolve it, you can click a legend link next to a chart to display a detail page that provides comprehensive information.

The Cluster Database Home page is your entry point for monitoring Enterprise Manager. From that page, you can access the following Oracle RAC-specific performance pages that show performance trends across one or more cluster databases:

Cluster Database Performance Page
Cluster Database Instance Performance Page
Cluster Performance Page

The following sections describe each of these pages as well as other related pages.

Using the Cluster Database Performance Page

The Cluster Database Performance page provides a quick glimpse of the performance statistics for a database. Statistics are rolled up across all of the instances in the cluster database. Using links that are next to the charts at the bottom of the page, you can drill down to:

Identify the causes of performance issues
Decide whether resources need to be added or redistributed
Tune your SQL plan and schema for better optimization
Resolve performance issues

Figure 13-1 shows the Cluster Database Performance page that you can access by clicking the Performance tab from the Cluster Database Home page.

Figure 13-1 Cluster Database Performance Page

Description of "Figure 13-1 Cluster Database Performance Page"

The following sections provide a description for several important charts, performance monitoring pages, and diagnostic pages.

Cluster Host Load Average Chart

The Cluster Host Load Average chart in the Cluster Database Performance page shows potential problems that are outside of the database. The chart shows maximum, average, and minimum values for available hosts for the previous hour. These values appear at the same level for a cluster that has an evenly balanced load. For Windows-based systems, CPU usage is displayed.

If the load average is higher than the average of the total number of CPUs across all of the hosts in the cluster, then too many processes are waiting for CPU resources. SQL statements that are not tuned often cause high CPU usage. Compare the load average values with the values displayed for CPU Used in the Average Active Sessions chart. If the sessions value is low and the load average value is high, then this indicates that something else on the host, other than your database, is consuming the CPU.

Global Cache Block Access Latency Chart

The Global Cache Block Access Latency summary chart in the Cluster Database Performance page shows the end-to-end elapsed time or latency for a block request.

If accessing a database block of any class does not locate a buffered copy in the local cache, a global cache operation is initiated. Before reading a block from disk, an attempt is made to find the block in the buffer cache of another instance. If the block is in another instance, a version of the block may be shipped. Two different types of blocks are distinguished: current and consistent read blocks.

Long latencies can be caused by:

A high number of requests caused by SQL statements that are not tuned.
A large number of processes in the run queue waiting for CPU or scheduling delays.
Platform-specific operating system parameter settings that affect IPC buffering or process scheduling.
Slow, busy, or faulty interconnects. In these cases, look for dropped packets, retransmits, or cyclic redundancy check (CRC) errors. Ensure that the network is private and that inter-instance traffic is not routed through a public network. You can access the Cluster Interconnects page from the Cluster Cache Coherency page to monitor the transfer and related statistics for interconnects.

Concurrent read and write activity on shared data in a cluster is a frequently occurring activity. Depending on the service requirements, this activity does not normally cause performance problems. However, when global cache requests cause a performance problem, optimizing SQL plans and the schema to achieve effective local cache hit efficiency and minimize I/O is a successful strategy for performance tuning. If the value for CR and current block request time reaches 10 ms, then your first step in resolving the problem should be to drill down to the Cluster Cache Coherency page for more detailed information.

To access the Cluster Cache Coherency page, click Cluster Cache Coherency in the Additional Monitoring Links section of the Performance page. You can alternatively click either of the legends to the right of the Global Cache Block Access Latency chart. The Cluster Cache Coherency page appears, as shown in Figure 13-2, which contains summary charts for cache coherency metrics for the cluster.

Figure 13-2 Cluster Cache Coherency Page

Description of "Figure 13-2 Cluster Cache Coherency Page"

Table 13-1 provides a description of the Cluster Cache Coherency charts and their drilldowns for accessing more comprehensive information for problem resolution.

For additional information about interpreting these charts and metrics, click Help in the Cluster Cache Coherency page.

Table 13-1 Cluster Cache Coherency Charts

Name	Description
Global Cache Block Access Latency	Shows the end-to-end elapsed time or latency for a block request. Click one of the legends to the right of the chart to go to the Global Cache Block Receive Time by Instance page. From there, you can click an instance legend under the chart to go to the Block Transfer for Local Instance page, where you can identify block classes that are subject to intense global cache activity. You can also view the types of blocks that are being transferred, as well as which instances are transferring most of the blocks. Cache transfer indicates how many current and CR blocks per block class were received from remote instances, including how many transfers incurred a delay (busy) or an unexpected longer delay (congested).
Global Cache Block Transfer Rate	Shows the total aggregated number of data blocks received by all instances in the cluster by way of an interconnect. Click one of the legends to the right of the chart to go to the Global Cache Blocks Received by Instance page. From there, you can click an instance legend under the chart to go to the Segment Statistics by Instance page, where you can see segments causing cache contention.
Global Cache Block Transfers and Physical Reads	Shows the percentage of logical reads that read data from the buffer cache of other instances by way of Direct Memory Access and from disk. It is essentially a profile of how much work is performed in the local buffer cache, rather than the portion of non-local references that incur some latency overhead. Click the legends to the right of the chart to go to the Global Cache Block Transfers vs. Logical Reads by Instance and Physical Reads vs. Logical Reads by Instance pages. From there, you can click an instance legend under the chart to go to the Segment Statistics by Instance page, where you can see segments causing cache contention.

Name

Description

Global Cache Block Access Latency

Shows the end-to-end elapsed time or latency for a block request. Click one of the legends to the right of the chart to go to the Global Cache Block Receive Time by Instance page. From there, you can click an instance legend under the chart to go to the Block Transfer for Local Instance page, where you can identify block classes that are subject to intense global cache activity. You can also view the types of blocks that are being transferred, as well as which instances are transferring most of the blocks.

Cache transfer indicates how many current and CR blocks per block class were received from remote instances, including how many transfers incurred a delay (busy) or an unexpected longer delay (congested).

Global Cache Block Transfer Rate

Shows the total aggregated number of data blocks received by all instances in the cluster by way of an interconnect. Click one of the legends to the right of the chart to go to the Global Cache Blocks Received by Instance page. From there, you can click an instance legend under the chart to go to the Segment Statistics by Instance page, where you can see segments causing cache contention.

Global Cache Block Transfers and Physical Reads

Shows the percentage of logical reads that read data from the buffer cache of other instances by way of Direct Memory Access and from disk. It is essentially a profile of how much work is performed in the local buffer cache, rather than the portion of non-local references that incur some latency overhead. Click the legends to the right of the chart to go to the Global Cache Block Transfers vs. Logical Reads by Instance and Physical Reads vs. Logical Reads by Instance pages. From there, you can click an instance legend under the chart to go to the Segment Statistics by Instance page, where you can see segments causing cache contention.

Average Active Sessions Chart

The Average Active Sessions chart in the Cluster Database Performance page shows potential problems inside the database. Categories, called wait classes, show how much of the database is waiting for a resource, such as CPU or disk I/O. Comparing CPU time to wait time helps to determine how much of the response time is consumed with useful work rather than waiting for resources that are potentially held by other processes.

The chart displays the load on the instance and identifies performance bottlenecks. At the cluster database level, this chart shows the aggregate wait class statistics across all of the instances.

The Session Count is computed by categorizing the time that all active sessions consumed in the last sampling interval into different wait classes, summing the amount in each wait class, and dividing it by the sampling interval. For example, If there are 3 active sessions in the last sampling interval (15 seconds), and each session spent half of the time on CPU (7.5 seconds each) and half of the time idling, then 3 x 7.5 / 15 = 1.5 active sessions on CPU appear for that sampling interval.

Compare the peaks on the Average Active Sessions chart with those on the Database Throughput charts. If the Average Active Sessions chart displays a large number of sessions waiting, indicating internal contention, but throughput is high, then the situation may be acceptable. The database is probably also performing efficiently if internal contention is low but throughput is high. However, if internal contention is high but throughput is low, then consider tuning the database.

The wait class legends beside the Average Active Sessions chart enable you to drill down to instance-level information stored in Active Sessions by Instance pages. These pages show the service times for up to four top instances. From these pages, you can select an instance of interest to see wait events, top SQL, top sessions, top services, top modules, and top actions if you need to diagnose and fix problems causing higher waits in a specific category.

Database Throughput Charts

The Database Throughput charts summarize any contentions that appear in the Average Active Sessions chart, and also show how much work the database is performing on behalf of the user. The Per Second view is for databases that handle SQL queries, shown as Physical Reads in the bottom chart. The Per Transaction view is for databases that handle transactions, shown as Transactions in the top chart and Redo Size in the bottom chart. Logons show how many users are logged onto the database per second.

You can also obtain information at the instance level by clicking one of the legends to the right of the charts to access the Database Throughput by Instance page. This page shows the breakdown for all active instances of the rolled-up Database Throughput chart on the Cluster Database Performance page. You can use this page to view the throughput for a particular instance, which may help you diagnose throughput problems.

You can drill down further to see the sessions of an instance consuming the greatest resources. Click an instance name legend just under the chart to go to the Top Sessions page. For more information on this page, see "Top Sessions Page".

Top Consumers Page

The Top Consumers page provides access to several tabs that enable you to view real-time or collection-based data for the top database consumers (top services, top modules, top clients, and top actions) of system resources. You can access the Top Consumers page by clicking the Top Consumers link in the Additional Monitoring Links section of the Cluster Database Performance page.

By default, the Top Consumers page initially displays the Overview tab, which shows aggregated summary data for all top consumers. For instance-level information about a consumer, click the portion of a chart representing the consumer or click the link under the chart for that consumer. The page that appears shows the running instances that are serving the consumer. You can expand the names in the Module column to show data for individual instances.

Top Sessions Page

The Top Sessions page shows a real-time summary list of sessions based on aggregated data. You can see which sessions have consumed the greatest amount of system resources and then decide whether you want to stop the sessions. You can access the Top Sessions page by clicking the Top Sessions link in the Additional Monitoring Links section of the Cluster Database Performance page.

The summary list shows sessions for all instances. You can narrow the display to a particular instance by clicking the name in the Instance Name column. When you click the name, the Top Sessions page for the database instance appears and lists only sessions that are applicable to that instance.

You can see session details for a user by clicking a number assigned to the user name in the SID (Session ID) column. The Session Details page displays complete information about all aspects of a particular session.

Instance Activity Page

The Instance Activity page enables you to view instance activity for several metrics within general metric categories, such as cursors, transactions, sessions, logical I/O, physical I/O, and net I/O. You can access top sessions statistics for a particular metric by clicking a metric legend under the chart if in Graphic mode, or by clicking a name in the summary table if in Tabular mode. You can also use the Switch Database Instance drop-down menu to toggle from one database instance to another. You can view data on a per-second or per-transaction basis. You can access this page by clicking the Instance Activity link in the Additional Monitoring Links section at the bottom of the Cluster Database Performance page.

Top Segments Page

Collecting and viewing segment-level statistics is an effective method for identifying hot tables or indexes in a database. The Top Segments page enables you to gather segment-level statistics to identify performance problems associated with individual segments. This page is particularly useful for Oracle RAC, because it also tracks the number of CR and current blocks received by an object. A high number of current blocks received plus a high number of buffer waits may indicate potential contention.

You can access the Top Segments page by clicking the Top Segments link in the Additional Monitoring Links section. You can view segments for all instances, or use a filter to see segments for a specific instance.

Database Locks Page

Use the Database Locks page to determine whether multiple instances are holding locks for the same object. The page shows blocking locks, user locks, or all database locks. You can use this information to stop a session that is unnecessarily locking an object. You can access the Database Locks page by clicking the Database Locks link in the Additional Monitoring Links section.

Using the Cluster Database Instance Performance Page

To access the Cluster Database Instance Performance page, click the Performance tab in the Cluster Database Home page, and then click an instance name at the bottom of the page.

Page Characteristics

Statistics are rolled up for the instance. You can drill down from links that appear alongside the charts, or at the bottom of the page, to perform the following tasks:

Identify the cause of performance issues.
Decide whether resources need to be added or redistributed to resolve performance issues.
Tune your SQL plan and schema for better optimization.
Resolve performance issues.

The Cluster Database Instance Performance page is essentially the same as the Cluster Database Performance page with the following exceptions:

This page shows performance data for a single instance, rather than enabling you to access data for all instances in a cluster database.
You can access the Top Activity page, which displays all wait classes and related statistics for Active Sessions, Top SQL, Top Sessions, Top Files, and Top Objects for an instance.
An Instance Disk I/O chart shows the total requests that represent the rate at which the database instance is issuing read/write requests.
You can run Automatic Database Diagnostic Monitor (ADDM) and Active Session History (ASH) reports for an instance.

See Also:

Oracle Database Performance Tuning Guide for more information about managing and monitoring database instances

Creating Reports

The Cluster Database Instance Performance page provides the means for running ADDM and ASH reports.

ADDM Report: enables you to create a new AWR snapshot and run ADDM on this and the previous snapshot. After you run ADDM, you can drill down for a detailed performance analysis and report page.
ASH Report: enables you to create a performance data report of the database based on session-sampling data over a specified period of time. This report is very useful for diagnosing small (five- to ten-minute) performance spikes that might be averaged out and consequently hidden or minimized by other issues in the 1-hour AWR report.

Using the Cluster Performance Page

To access the Cluster Performance page, click the Cluster link in the General section of the Cluster Database Performance page, and then click the Performance tab from the Cluster Home page.

The Cluster Performance Page displays utilization statistics, such as CPU, Memory, and Disk I/O, during the past hour for all hosts of a cluster, which is part of the greater Enterprise Manager environment. With this information, you can determine whether resources need to be added, suspended, or redistributed.

You can view the CPU, Memory, and Disk I/O charts for each host individually by clicking the host name in the legend to the right of the charts. The resulting Host page also shows statistics for CPU load, memory scan page rate, and longest service time.

Using the Cluster Interconnects Page

The Cluster Interconnects page enables you to monitor the interconnect interfaces, determine configuration issues, and identify transfer rate-related issues including excess traffic. This page helps determine the load added by individual instances and databases on the interconnect. Sometimes you can immediately identify interconnect delays that are due to applications that are outside Oracle Database.

To access this page, click the Cluster link in the General section of the Cluster Database Home page, then click the Interconnects tab. The Cluster Interconnects page appears, as shown in Figure 13-3.

Figure 13-3 Cluster Interconnects Page

Description of "Figure 13-3 Cluster Interconnects Page"

The Cluster Interconnects page contains the following summary tables:

Interfaces by Hosts
Interfaces in Use by Cluster Databases

These tables enable you to perform the following tasks:

View all interfaces that are configured across the cluster.
View statistics for the interfaces, such as absolute transfer rates and errors.
Determine the type of interfaces, such as private or public.
Determine whether the instance is using a public or private network.
Determine which database instance is currently using which interface.
Determine how much the instance is contributing to the transfer rate on the interface.

Note:

Low transfer rates or an incorrect configuration, such as using the public network instead of the private network for the interconnect, generates metric alerts.