Skip Headers
Oracle® Database System Administration Guide
10g Release 2 (10.2) for IBM z/OS on System z

Part Number B25398-03
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
PDF · Mobi · ePub

16 Error Diagnosis and Reporting

This chapter discusses the diagnosis of suspected Oracle database errors and the requirements for documenting these errors to Oracle Support Services. Specific topics in this chapter include documentation requirements, categorization of errors, system dump requirements, and methods of reporting to Oracle Support Services.

For information on z/OS-specific error messages, refer to the Oracle Database Messages Guide for IBM z/OS.

The following topics are included:

16.1 Oracle Support Services

Oracle Support Services acts as the interface to the Oracle database user community. Refer to the applicable Oracle Support Services publications for a discussion of policies and procedures for using their services.

16.2 Providing Error Documentation

During the error resolution cycle, Oracle Support Services might request you provide them with machine readable data. Send machine readable data, not formatted or printed data. Use FTP (File Transfer Protocol) if you plan to send large amounts of data.

If you are requested to send data to Oracle Support Services, then follow the documentation requirements provided in the section "General Documentation Requirements". Failure to follow these requirements might result in the inability to process your information. This could delay the resolution of any errors you are reporting.

16.3 General Documentation Requirements

When you report a suspected error, you might be asked to describe the Oracle database subsystem and z/OS operating system environments in detail. Provide the full version number of each component that has an error. The full version number includes important PUT levels for your z/OS system.

Before you contact Oracle Support Services, ensure this information is available:

In addition to describing the Oracle database operational environment, detailed documentation specific to the error might be required. This might include:

Keep in mind that often more than one error is associated with a single failure. Describe all errors for the failure being reported. If your application uses Pro*C, Pro*COBOL, or another Oracle database Precompiler, then ensure your application displays or prints out all errors it encounters. Without a complete and consistent set of information, diagnosing the problem can be impossible.

16.4 Error Diagnosis

When investigating a potential Oracle server error, start by determining which component is failing, where it is failing, and the error category.

16.4.1 Components

When reporting a problem to Oracle Support Services, identify the component suspected of failure, along with its full version and correct release level. A list of components and their version numbers is documented in the Oracle Database Installation Guide for IBM z/OS and any maintenance tape release bulletin.

16.4.2 Error Categories

Use these error categories to describe the error:

  • Documentation

  • Incorrect output

  • Oracle database external error

  • Abend

  • Program loop

  • Performance

  • Missing functionality

  • Wait state

16.4.2.1 Documentation

When reporting documentation errors, you are asked to provide this information:

  • Document name

  • Document part number

  • Date of publication

  • Page number

Describe the error in detail.

Documentation errors can include both erroneous documentation and omission of required information.

16.4.2.2 Incorrect Output

In general, an incorrect output error exists whenever an Oracle database utility produces a result that differs from written Oracle documentation. When describing errors of incorrect output, you need to describe, in detail, the operation of the function in error. Be prepared to describe your understanding of the proper function, the specific Oracle documentation that describes the proper operation of the function, and a detailed description of the incorrect operation.

If you think you have found a software bug, then be prepared to answer these questions:

  • Does the problem occur in more than one Oracle tool? (Examples of Oracle tools are SQL*Plus and Oracle Developer/2000).

  • What are the exact SQL statements used to reproduce the problem?

  • What are the full version numbers of the Oracle database and related Oracle software?

  • What is the problem and how is it reproduced?

16.4.2.3 Oracle Database External Error

Oracle database error messages are produced whenever an Oracle database utility or the Oracle database kernel detects an error condition. Depending on the circumstances, error messages might be fatal or nonfatal to the utility or kernel.

Be prepared to identify the exact error message and message number received and the complete circumstances surrounding the error.

16.4.2.4 Abend

Any program check in an Oracle database utility or the Oracle database kernel address is considered an error. A system dump is required as documentation in the event of a program check.

Ensure the system dump contains all of the private area of the Oracle database address space. Without it, diagnosis is sometimes impossible.

System abends might or might not indicate a failure of the Oracle database subsystem depending upon circumstances.

The following abends are not considered Oracle database failures:

  • 013 - open failure

  • 122 - cancelled by operator

  • 222 - cancelled by operator

  • 322 - CPU time exceeded

  • 722 - SYSOUT lines exceeded

16.4.2.5 Program Loop

A program loop is evident when the Oracle server task consumes CPU time, but no actual work is performed. This situation is substantially different from an Oracle server task that performs most of its operations in cache (also known as a CPU bound job). CPU bound operations might include large batch sorts, sort merge joins, nested loop joins where the driving table is small enough to fit into the SGA, and so forth.

Any program loop that occurs within an Oracle server or utility address space is considered an Oracle database failure. Loop conditions are rarely experienced and are considered serious errors. The initial diagnostic approach with a loop consists of a system dump. If a task is in a program loop, then ensure the system dump includes all of the private area of the Oracle database address space.

Further diagnosis might be required using z/OS SLIP commands. Oracle Support Services furnishes specific instructions on the use of SLIP depending upon circumstances.

16.4.2.6 Performance

Oracle database system performance is determined by many factors, most of which are not within the control of Oracle Corporation. Considerations such as system load, I/O topology, and database design make the documentation of performance errors difficult.

Provide detailed information about the state of your environment when reporting an error.

Specific documentation might include:

  • CPU type and memory configuration

  • Database topology

  • I/O topology

  • System workload by type

  • Oracle database workload characterization

  • Query execution plans

16.4.2.7 Missing Functionality

Enhancement requests can be opened with Oracle Support Services to request the inclusion of functions and features that Oracle products do not currently have. When opening an enhancement request, describe the specific feature or function to be added to the product and provide a business case to justify the enhancement.

16.4.2.8 Wait State

A wait state occurs whenever a required system resource (for example, an enqueue) is unavailable. Because the task requiring the resource is in a blocked or wait state, little or no CPU time is consumed. However, wait states are not limited to a lack of operating system resources. A wait state can occur within the Oracle server due to an incompatible lock request, high contention on an internal latch, an archive task that halts, and so forth.

A wait state in an Oracle database utility might or might not be considered an Oracle database error. A wait state that occurs due to operating system resource conflicts (for example, multiple requests for a tape mount) is not considered an Oracle database error. If a wait state occurs within an Oracle database utility, then a system dump is required of both the utility and kernel address spaces. If a wait state occurs in the Oracle server address space, then a system dump is required of that address space.

16.4.3 Diagnosing Wait State Problems

Before you contact Oracle Support Services, try these recommended approaches:

  • Check the Oracle database alert log and system console logs for any error messages.

  • Query the V$LOCK dynamic view to determine whether there are any lock conflicts.

  • For a high volume online transaction processing (OLTP) system, query the V$WAITSTAT dynamic view to determine whether there is contention for a class of data blocks. To perform this query, invoke SQL*Plus and enter these commands:

    SQL> CONNECT / AS SYSDBA 
    SQL> SELECT * FROM V$WAITSTAT;
    

    For more information about the V$WAITSTAT view, refer to the Oracle Database Reference.

  • If your database is in ARCHIVELOG mode, then ensure your archive data set destination is not full. If the destination is full, then the archiver task cannot copy any more full redo log data sets, thus preventing any further changes to the database. In this case, all tasks attempting to make changes are eventually placed in a wait state.

If you cannot diagnose the wait state problem using the previous suggestions, then first obtain these dumps and contact Oracle Support Services:

  • An SVC system dump that includes all of the private area of the Oracle database address space.

  • A SYSTEMSTATE dump from within Oracle database. When you contact Oracle Support Services, tell them you performed this task.

    To obtain a SYSTEMSTATE dump, invoke SQL*Plus and perform these commands:

    SQL> CONNECT  / AS SYSDBA 
    SQL> ALTER SESSION SET EVENTS 
    'immediate trace name systemstate level 10'; 
    

    A SYSTEMSTATE dump can be quite large, depending on the number of concurrent connections to the Oracle server, the number of resources each connection holds or requests in any mode, and the size of the SGA. The current cursor and object state for each concurrent connection is also included in the SYSTEMSTATE dump. The size of these dumps can easily be in the multiple of megabytes. A SYSTEMSTATE dump is synonymous with a SYSTEMSTATE trace data set.

16.5 System Dumps

When providing documentation on suspected Oracle database failures, it might be necessary for you to provide a system dump of the Oracle server or utility address spaces. Dumps are initiated through the z/OS operator interface using the DUMP and SLIP commands, or automatically by Oracle database if it detects a problem.

Dumps sent to Oracle Support Services as documentation for suspected errors must not be formatted. Formatted dumps will not be accepted.

When specifying dump parameters in response to a z/OS DUMP COMM=(' ') command, you must include this specification:

PSA,TRT,RGN,CSA

Additional parameters may be required.

16.5.1 System Dump Data Sets

Once a SYS1.DUMPxx data set is created, the system operator is notified whenever a dump to that data set occurs. Because all Oracle database abends are dumped to SYS1.DUMP data sets and are not dynamically allocated, you must ensure a SYS1.DUMP data set is always available.

You must also ensure the SYS1.DUMP data set is large enough to accommodate a dump of two address spaces (the Oracle database address space and the client address space). This allows for a complete dump if an Oracle database utility abends while in cross memory mode. Refer to "Oracle Server Storage Requirements" in Chapter 15, "Oracle Database Performance", for more information about estimating the size of an Oracle database address space.

If a SYS1.DUMP data set is not available, then a dump might be lost.

16.5.2 Operator Initiated Dumps

Operator initiated dumps are accomplished with the z/OS DUMP command, where text is the title you want the dump to have:

DUMP COMM=(text)

After the DUMP command has been issued, you must respond to the system WTOR with the following command:

R xx,[JOBNAME=(jobn)|ASID=(nnn),]SDATA=(PSA,TRT,RGN,CSA)

where:

Table 16-1 Variable Descriptions for Code Example

Variable Description

xx

is the reply identification number.

jobn

is the name of the started task or batch job.

nnn

is the hexadecimal address space identifier of the address space you want to dump.


New operands can be requested.

16.5.3 SLIP

Proper documentation of an Oracle database error might require the setting of a SLIP trap. If this is the case, then contact Oracle Support Services for specific instructions.

16.5.4 TSO System Dumps

TSO symptom dumps provide a preliminary view of an abend condition in an Oracle database utility. A symptom dump is available by issuing the TSO command:

PROFILE WTPMSG

The effect of this command is permanent until you issue the command:

PROFILE NOWTPMSG

16.6 GTF

You might need to use GTF as a diagnostic tool under certain circumstances. Oracle Support Services provides specific instructions if this is the case.