Oracle® Database Oracle Clusterware and Oracle Real Application Clusters Installation Guide
10g Release 2 (10.2) for AIX

Part Number B14201-04

4 Installing Oracle Clusterware

This chapter describes the procedures for installing Oracle Clusterware on AIX. If you are installing Oracle Database 10g Real Application Clusters, then this is phase one of a two-phase installation.

4.1 Verifying Oracle Clusterware Requirements with CVU

Using the following command syntax, start the Cluster Verification Utility (CVU) to check system requirements for installing Oracle Clusterware:

/mountpoint/crs/Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n node_list 

In the preceding syntax example, replace the variable mountpoint with the installation media mountpoint, and replace the variable node_list with the names of the nodes in your cluster, separated by commas.

For example, for a cluster with mountpoint /dev/dvdrom/, and with nodes node1, node2, and node3, enter the following command:

/dev/dvdrom/crs/Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n node1,node2,node3
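The substitution of mountpoint and node_list can be sketched as a small shell function. The function name build_cluvfy_cmd is hypothetical and is not part of CVU; it only prints the command line, so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of CVU): prints the pre-installation check
# command for a given media mountpoint and list of node names.
# $1 = installation media mountpoint, remaining arguments = node names.
build_cluvfy_cmd() {
    mountpoint=${1%/}                 # drop one trailing slash, if present
    shift
    node_list=$(IFS=,; echo "$*")     # join the node names with commas
    echo "$mountpoint/crs/Disk1/cluvfy/runcluvfy.sh stage -pre crsinst -n $node_list"
}
```

For example, build_cluvfy_cmd /dev/dvdrom/ node1 node2 node3 prints the same command line as the preceding example.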

The CVU Oracle Clusterware pre-installation stage check verifies the following:

4.1.1 Troubleshooting Oracle Clusterware Setup

If the CVU report indicates that your system fails to meet the requirements for Oracle Clusterware installation, then use the topics in this section to correct the problem or problems indicated in the report, and run the CVU command again.

User Equivalence Check Failed
Cause: Failure to establish user equivalency across all nodes. This can be due to not creating the required users, or failing to complete secure shell (SSH) configuration properly.
Action: The CVU provides a list of nodes on which user equivalence failed. For each node listed as a failure node, review the oracle user configuration to ensure that the user configuration is properly completed, and that SSH configuration is properly completed.

Use the command su - oracle and check user equivalence manually by running the ssh command on the local node with the date command argument using the following syntax:

ssh node_name date

The output from this command should be the timestamp of the remote node identified by the value that you use for node_name. If ssh is in the default location, the /usr/bin directory, then use ssh to configure user equivalence. You can also use rsh to confirm user equivalence.

If you have not used SSH to connect to the host node before running CVU, then CVU reports a user equivalence error. If you see a message similar to the following when entering the date command with SSH, then this is the probable cause of the user equivalence error:

The authenticity of host 'node1 (140.87.152.153)' can't be established.
RSA key fingerprint is 7z:ez:e7:f6:f4:f2:4f:8f:9z:79:85:62:20:90:92:z9.
Are you sure you want to continue connecting (yes/no)?

Enter yes, and then run CVU again to determine if the user equivalency error is resolved.
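To repeat the manual check on several nodes, a loop such as the following can be used as the oracle user. This is a minimal sketch that assumes OpenSSH: the BatchMode=yes option makes ssh fail instead of prompting, so a node that would ask for a password or for host key confirmation is reported as FAILED. The function name check_user_equivalence and the SSH_CMD variable are hypothetical.

```shell
#!/bin/sh
# SSH_CMD is a hypothetical override so the loop can be tried with a stub.
SSH_CMD=${SSH_CMD:-ssh}

# Runs "date" on each node named in the arguments and prints one
# OK or FAILED line per node. Returns non-zero if any node failed.
check_user_equivalence() {
    failed=0
    for node in "$@"; do
        # BatchMode=yes: fail rather than prompt for a password or
        # for confirmation of an unknown host key.
        if $SSH_CMD -o BatchMode=yes "$node" date >/dev/null 2>&1 </dev/null; then
            echo "OK     $node"
        else
            echo "FAILED $node"
            failed=1
        fi
    done
    return $failed
}
```

For example, check_user_equivalence node1 node2 node3 prints one line per node. For any node reported as FAILED, run ssh node_name date interactively to see the underlying message.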

If ssh is in a location other than the default, /usr/bin, then CVU reports a user equivalence check failure. To avoid this error, navigate to the directory $CV_HOME/cv/admin, open the file cvu_config with a text editor, and add or update the key ORACLE_SRVM_REMOTESHELL to indicate the ssh path location on your system. For example:

# Locations for ssh and scp commands
ORACLE_SRVM_REMOTESHELL=/usr/local/bin/ssh
ORACLE_SRVM_REMOTECOPY=/usr/local/bin/scp

Note the following rules for modifying the cvu_config file:

  • Key entries have the syntax name=value

  • Each key entry and the value assigned to the key defines one property only

  • Lines beginning with the number sign (#) are comment lines, and are ignored

  • Lines that do not follow the syntax name=value are ignored
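The rules above can be illustrated with a short shell function that prints only the lines of a file that would count as key entries. This is an illustration of the stated rules, not the parser that CVU uses; the function name list_config_keys is hypothetical.

```shell
#!/bin/sh
# Prints the key entries of a cvu_config-style file ($1), one per line.
# Comment lines and lines that do not follow name=value are skipped.
list_config_keys() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '#'*) continue ;;        # comment line: ignored
            ?*=*) echo "$line" ;;    # name=value: a key entry
            *)    continue ;;        # any other line: ignored
        esac
    done < "$1"
}
```

Running list_config_keys on a copy of the file after editing it is a quick way to confirm that each new entry is recognized as name=value.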

When you have changed the path configuration, run the CVU check again. If ssh is in a location other than the default, then you must also start OUI with additional arguments that specify the location of the remote shell and remote copy commands. Enter runInstaller -help to obtain information about how to use these arguments.

Note:

When you or the Oracle Universal Installer run ssh or rsh commands, including any login or other shell scripts they start, you may see errors about invalid arguments or standard input if the scripts generate any output. You should correct the cause of these errors.

To stop the errors, remove all commands from the oracle user's login scripts that generate output when you run ssh or rsh commands.

If you see messages about X11 forwarding, then perform step 6 in Chapter 2, "Enabling SSH User Equivalency on Cluster Member Nodes" to resolve this issue.

You might also see errors similar to the following:

stty: standard input: Invalid argument
stty: standard input: Invalid argument

These errors are produced if hidden files on the system (for example, .bashrc or .cshrc) contain stty commands. If you see these errors, then refer to Chapter 2, "Preventing Oracle Clusterware Installation Errors Caused by stty Commands" to correct the cause of these errors.
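One common way to keep stty commands in a login script from producing these errors is to run them only when standard input is a terminal. The following is a minimal sketch for Bourne-style shells; the function name set_terminal_keys and the choice of stty intr as the guarded command are assumptions for illustration.

```shell
# Fragment for a Bourne-style login script such as .profile or .bashrc.
# The stty command runs only when standard input is a terminal, so
# non-interactive ssh or rsh sessions skip it and produce no output.
set_terminal_keys() {
    if [ -t 0 ]; then
        stty intr '^C'
    fi
}
set_terminal_keys
```

The test [ -t 0 ] is true only when file descriptor 0 is attached to a terminal, which is not the case for the commands that OUI and CVU run remotely.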

Node Reachability Check or Node Connectivity Check Failed
Cause: One or more nodes in the cluster cannot be reached using the TCP/IP protocol, through either the public or private interconnects.
Action: Use the command /usr/sbin/ping address to check each node address. When you find an address that cannot be reached, check your list of public and private addresses to make sure that you have them correctly configured. If you use third-party vendor clusterware, then refer to the vendor documentation for assistance. Ensure that the public and private network interfaces have the same interface names on each node of your cluster.
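
Checking each address in turn can be sketched as a loop. This sketch assumes that the ping command accepts -c 1 to send a single echo request; the function name find_unreachable and the PING_CMD variable are hypothetical.

```shell
#!/bin/sh
# PING_CMD is a hypothetical override so the loop can be tried with a
# stub; the default is the path given in the text.
PING_CMD=${PING_CMD:-/usr/sbin/ping}

# Sends one echo request to each address named in the arguments and
# prints the addresses that did not answer.
find_unreachable() {
    for address in "$@"; do
        if ! $PING_CMD -c 1 "$address" >/dev/null 2>&1; then
            echo "$address"
        fi
    done
}
```

Pass both the public and the private addresses of every node, for example find_unreachable node1 node1-priv node2 node2-priv, where the -priv names are placeholders for your private interconnect addresses.
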
User Existence Check or User-Group Relationship Check Failed
Cause: The administrative privileges for users and groups required for installation are missing or incorrect.
Action: Use the id command on each node to confirm that the oracle user is created with the correct group membership. Ensure that you have created the required groups, and create or modify the user account on affected nodes to establish required group membership.
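
The group membership check with the id command can be sketched as follows. The function name user_in_groups and the ID_CMD variable are hypothetical, and the group names in the usage example are assumptions; substitute the groups that you created for your installation.

```shell
#!/bin/sh
# ID_CMD is a hypothetical override so the function can be tried with a stub.
ID_CMD=${ID_CMD:-id}

# Succeeds if the user ($1) belongs to every group named in the
# remaining arguments; otherwise reports the first missing group.
user_in_groups() {
    user=$1
    shift
    groups=" $($ID_CMD -Gn "$user" 2>/dev/null) "   # space-separated group names
    for group in "$@"; do
        case $groups in
            *" $group "*) ;;
            *) echo "$user is not a member of $group"; return 1 ;;
        esac
    done
    return 0
}
```

For example, user_in_groups oracle oinstall dba checks the local node only. Run it on each node, or through ssh, to cover the whole cluster.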

See Also:

"Creating Required Operating System Groups and User" in Chapter 2 for instructions about how to create required groups, and how to configure the oracle user.

4.2 Preparing to Install Oracle Clusterware with OUI

Before you install Oracle Clusterware with Oracle Universal Installer (OUI), use the following checklist to ensure that you have all the information you will need during installation, and to ensure that you have completed all tasks that must be done before starting to install Oracle Clusterware. Mark the check box for each task as you complete it, and write down the information needed, so that you can provide it during installation.

4.3 Installing Oracle Clusterware with OUI

This section describes how to use Oracle Universal Installer (OUI) to install Oracle Clusterware.

4.3.1 Running OUI to Install Oracle Clusterware

Complete the following steps to install Oracle Clusterware on your cluster. At any time during installation, if you have a question about what you are being asked to do, click the Help button on the OUI page.

  1. Start the runInstaller command from the clusterware directory on the Oracle Database 10g Release 2 (10.2) installation media. When OUI displays the Welcome page, click Next.

  2. Provide information or run scripts as root when prompted by OUI. If you need assistance during installation, click Help.

  3. After you run root.sh on all the nodes, OUI runs the Oracle Notification Server Configuration Assistant, Oracle Private Interconnect Configuration Assistant, and Cluster Verification Utility. These programs run without user intervention.

After you verify that these steps completed successfully, the Oracle Clusterware installation is complete.

If you intend to install Oracle Database 10g with RAC, then continue to Chapter 5, "Installing Oracle Database 10g with Oracle Real Application Clusters". If you intend to use Oracle Clusterware by itself, then refer to the single-instance Oracle Database installation guide.

4.3.2 Installing Oracle Clusterware Using a Cluster Configuration File

During installation of Oracle Clusterware, on the Specify Cluster Configuration page, you are given the option either of providing cluster configuration information manually, or of using a cluster configuration file. A cluster configuration file is a text file that you can create before starting OUI, which provides OUI with information about the cluster name and node names that it needs to configure the cluster.

Oracle suggests that you consider using a cluster configuration file if you intend to perform repeated installations on a test cluster, or if you intend to perform an installation on many nodes.

To create a cluster configuration file:

  1. On the installation media, navigate to the directory Disk1/response.

  2. Using a text editor, open the response file crs.rsp, and find the section CLUSTER_CONFIGURATION_FILE.

  3. Follow the directions in that section for creating a cluster configuration file.

4.3.3 Troubleshooting Oracle Clusterware Installation Verification

If the CVU report indicates that your Oracle Clusterware installation has a component issue, then use the topics in this section to correct the problem or problems indicated in the report, and run the CVU command again.

CSS is probably working with a non-clustered, local-only configuration on nodes:
Cause: OCR configuration error. The error message specifies the nodes on which this error is found.

This error occurs when, for each specified node, either the contents of the OCR configuration file ocr.loc cannot be retrieved, or the configuration key local_only is set to TRUE in the configuration file of nodes listed in the error message.

Action: Confirm that Oracle Clusterware was installed on the node. Correct the OCR configuration, if it is incorrect. Also, ensure that you have typed the node name correctly when entering the CVU command.
Unable to obtain OCR integrity details from nodes:
Cause: Unable to run the ocrcheck tool successfully on the nodes listed in the error message.
Action: If the ocrcheck tool indicates an error on only some nodes in the cluster, then OCR is not configured on that set of nodes. If the ocrcheck tool indicates that the OCR integrity check failed on all nodes, then the OCR storage area is corrupted. Refer to Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for instructions about how to use ocrconfig -repair to resolve this issue.

To configure OCR, you can use ocrconfig -repair, as described in Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide, or you can configure OCR manually.

Before you configure OCR manually, check the current OCR configuration. As the oracle user, enter the following command from the bin directory in the CRS home:

$ ./ocrcheck

To test whether the OCR storage area is corrupted, complete the following steps:

  1. Enter the following command:

    ocrconfig -showbackups
    
    
  2. View the contents of the OCR file using the following command syntax:

    ocrdump -backupfile OCR_filename
    
    
  3. Select a backup file, and use the following command to attempt to restore the file:

    ocrconfig -restore backupfile
    
    

    If the command returns a failure message, then both the primary OCR and the OCR mirror have failed.

    See Also:

    Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for additional information about testing and restoring the Oracle Cluster Registry
OCR version is inconsistent amongst the nodes.
Cause: The OCR version does not match on all the cluster member nodes. Either not all nodes are part of the same cluster, or the nodes do not point to the same OCR, or an OCR configuration file has been changed manually to an invalid configuration on one or more nodes.
Action: Perform the following checks:
  1. Ensure that all listed nodes are part of the cluster.

  2. Use the ocrcheck utility (/crs/home/bin/ocrcheck) to find the location of OCR on each node. Start ocrcheck with one of the following commands:

    As root:

    # ocrcheck
    
    

    As the oracle user, or as a user with OSDBA group privileges, from the user home directory:

    $ /crs/home/bin/ocrcheck
    
    
  3. Repair invalid OCR configurations by logging into a node you suspect has a faulty configuration, stopping the CRS daemon, and entering the following command:

    ocrconfig -repair ocrmirror device_name
    
    

    The ocrconfig -repair command changes the OCR configuration only on the node from which you run the command.

    See Also:

    Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for information about how to use the ocrconfig tool to repair OCR files
Incorrect OCR version found for nodes:
Cause: The OCR version on the specified nodes does not match the version required for Oracle Database 10g Release 2 (10.2).
Action: Follow the same actions described in the preceding error message, "OCR version is inconsistent amongst the nodes."
OCR integrity is invalid.
Cause: The data integrity of the OCR is invalid, which indicates that OCR storage is corrupted.
Action: Follow the same actions described in the preceding error message, "Unable to obtain OCR integrity details from nodes:".
OCR ID is inconsistent among the nodes.
Cause: One or more nodes list the OCR in a different location.
Action: Follow the same actions described in the preceding error message, "OCR version is inconsistent amongst the nodes."

4.3.4 Oracle Clusterware Background Processes

The following processes must be running in your environment after the Oracle Clusterware installation for Oracle Clusterware to function:

  • oprocd: Process monitor for the cluster.

  • evmd: Event manager daemon that starts the racgevt process to manage callouts.

  • ocssd: Manages cluster node membership and runs as the oracle user. Failure of this process results in a node restart.

  • crsd: Performs high availability recovery and management operations, such as maintaining the OCR, and manages application resources. It runs as the root user and restarts automatically upon failure.
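
A quick check for the processes listed above can be sketched as a filter over a process listing. The function name missing_clusterware_daemons is hypothetical. The match is a loose substring match, so a wrapper script whose name contains a daemon name also counts as a match; treat the result as a first indication only.

```shell
#!/bin/sh
# Reads a process listing on standard input and prints each expected
# Oracle Clusterware daemon name that does not appear anywhere in it.
missing_clusterware_daemons() {
    listing=$(cat)
    for daemon in oprocd evmd ocssd crsd; do
        case $listing in
            *"$daemon"*) ;;            # name found somewhere in the listing
            *) echo "$daemon" ;;       # name not found: report it
        esac
    done
}
```

For example, ps -ef | missing_clusterware_daemons prints nothing when all four names appear in the listing.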