Build Your Own
Oracle RAC Cluster on Oracle Enterprise Linux and
iSCSI (Continued)
The information in this guide is not validated by Oracle, is not supported by Oracle, and should only be used at your own risk; it is for educational purposes only.
12. Create "oracle" User and Directories
Perform the following tasks on both Oracle RAC nodes in the cluster!
In this section we will create the oracle UNIX user account,
recommended O/S groups, and all required directories. The following O/S groups will be created:
| Description | Oracle Privilege | Oracle Group Name | UNIX Group name |
| Oracle Inventory and Software Owner | | | oinstall |
| Database Administrator | SYSDBA | OSDBA | dba |
| Database Operator | SYSOPER | OSOPER | oper |
We will be using the Oracle Cluster File System, Release 2 (OCFS2)
to store the files required to be shared for the Oracle Clusterware software.
When using OCFS2, the UID of the UNIX user "oracle" and GID of the
UNIX group "oinstall" must be the same on both of the Oracle RAC nodes
in the cluster. If either the UID or GID is different, the files on the OCFS2 file system
will show up as "unowned" or may even be owned by a different user. For this article, I will use
501 for the "oracle" UID and 501 for the "oinstall" GID.
Note that members of the UNIX group oinstall are considered
the "owners" of the Oracle software. Members of the dba group can administer
Oracle databases, for example starting up and shutting down databases. Members of the
optional group oper have a limited set of database administrative privileges
such as managing and running backups. The default name for this group is oper. To use this group,
choose the "Custom" installation type to install the Oracle database software. In this article, we are
creating the oracle user account to have all responsibilities!
Create Group and User for Oracle
Let's start this section by creating the UNIX oinstall, dba, and oper groups
and the oracle user account:
# groupadd -g 501 oinstall
# groupadd -g 502 dba
# groupadd -g 503 oper
# useradd -m -u 501 -g oinstall -G dba,oper -d /home/oracle -s /bin/bash -c "Oracle Software Owner" oracle
# id oracle
uid=501(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper)
Set the password for the oracle account:
# passwd oracle
Changing password for user oracle.
New UNIX password: xxxxxxxxxxx
Retype new UNIX password: xxxxxxxxxxx
passwd: all authentication tokens updated successfully.
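Because OCFS2 needs the same UID and GID on both nodes, it can help to check the id output mechanically rather than by eye. The following is a minimal sketch and not part of the original procedure: the helper name check_ids is hypothetical, and the expected values 501 and 501 are the ones chosen in this article.

```shell
# check_ids "<output of id>" <expected uid> <expected gid>
# Succeeds only when the primary UID and GID in the id output match the
# expected numbers. check_ids is a hypothetical helper name.
check_ids() {
    id_output=$1
    want_uid=$2
    want_gid=$3
    # Pull the numeric uid= and gid= fields out of the id output.
    uid=$(printf '%s\n' "$id_output" | sed -n 's/^uid=\([0-9]*\).*/\1/p')
    gid=$(printf '%s\n' "$id_output" | sed -n 's/.* gid=\([0-9]*\).*/\1/p')
    [ "$uid" = "$want_uid" ] && [ "$gid" = "$want_gid" ]
}

# Example, run as root on each node:
#   check_ids "$(id oracle)" 501 501 && echo "oracle IDs match" || echo "oracle IDs differ"
```

Running the example on linux1 and linux2 should report a match on both nodes before you continue.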
Verify That the User nobody Exists
Before installing the Oracle software, complete the following procedure to verify that the user
nobody exists on the system:
- To determine if the user exists, enter the following command:
# id nobody
uid=99(nobody) gid=99(nobody) groups=99(nobody)
If this command displays information about the nobody user, then you do not
have to create that user.
- If the user nobody does not exist, then enter the following command to
create it:
# /usr/sbin/useradd nobody
- Repeat this procedure on all the other Oracle RAC nodes in the cluster.
Create the Oracle Base Directory
The next step is to create a new directory that will be used to store
the Oracle Database software. When configuring the
oracle user's environment (later in this section) we will be assigning
the location of this directory to the $ORACLE_BASE environment variable.
The following assumes that the directories are being created in the root
file system. Please note that this is being done for the sake of
simplicity and is not recommended as a general practice. Normally, these
directories would be created on a separate file system.
After the directory is created, you must then specify the correct
owner, group, and permissions for it. Perform the following on both Oracle
RAC nodes:
# mkdir -p /u01/app/oracle
# chown -R oracle:oinstall /u01/app/oracle
# chmod -R 775 /u01/app/oracle
At the end of this procedure, you will have the following:
- /u01 owned by root.
- /u01/app owned by root.
- /u01/app/oracle owned by oracle:oinstall with 775 permissions. This
ownership and these permissions enable the OUI to create the oraInventory
directory, in the path /u01/app/oracle/oraInventory.
Create the Oracle Clusterware Home Directory
Next, create a new directory that will be used to store
the Oracle Clusterware software. When configuring the
oracle user's environment (later in this section) we will be assigning
the location of this directory to the $ORA_CRS_HOME environment variable.
As noted in the previous section, the following assumes that the directories are being
created in the root file system. This is being done for the sake of simplicity
and is not recommended as a general practice. Normally, these directories would
be created on a separate file system.
After the directory is created, you must then specify the correct
owner, group, and permissions for it. Perform the following on both Oracle
RAC nodes:
# mkdir -p /u01/app/crs
# chown -R oracle:oinstall /u01/app/crs
# chmod -R 775 /u01/app/crs
At the end of this procedure, you will have the following:
- /u01 owned by root.
- /u01/app owned by root.
- /u01/app/crs owned by oracle:oinstall with 775 permissions. These
permissions are required for Oracle Clusterware installation and are changed
during the installation process.
Create Mount Point for OCFS2 / Clusterware
Let's now create the mount point for the Oracle Cluster File System, Release 2 (OCFS2)
that will be used to store the two Oracle Clusterware shared files.
Perform the following on both Oracle RAC nodes:
# mkdir -p /u02
# chown -R oracle:oinstall /u02
# chmod -R 775 /u02
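To confirm that the three directories created in this section ended up with the intended ownership and permissions, a small check along the following lines may be useful. This is only a sketch: the helper names dir_perms and dir_owner are hypothetical, and the parsing assumes the usual ls -ld column layout.

```shell
# dir_perms <path>: print the 10-character mode string reported by ls -ld,
# for example drwxrwxr-x for a directory with 775 permissions.
dir_perms() {
    ls -ld "$1" | cut -c1-10
}

# dir_owner <path>: print owner:group as reported by ls -ld.
dir_owner() {
    ls -ld "$1" | awk '{ print $3 ":" $4 }'
}

# Example, run on each node after creating the directories:
#   for d in /u01/app/oracle /u01/app/crs /u02; do
#       echo "$d $(dir_perms "$d") $(dir_owner "$d")"
#   done
# Each line should show drwxrwxr-x and oracle:oinstall.
```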
Create Login Script for oracle User Account
To ensure that the environment is set up correctly for the "oracle" UNIX
userid on both Oracle RAC nodes, use the following .bash_profile:
Note: When you are setting the Oracle environment variables for each Oracle RAC node, be sure to
assign each RAC node a unique Oracle SID! For this example, I used:
- linux1: ORACLE_SID=racdb1
- linux2: ORACLE_SID=racdb2
Log in to each node as the oracle user account:
# su - oracle
.bash_profile for "oracle" User Account
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
alias ls="ls -FA"
export JAVA_HOME=/usr/local/java
# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=/u01/app/crs
export ORACLE_PATH=$ORACLE_BASE/common/oracle/sql:.:$ORACLE_HOME/rdbms/admin
export CV_JDKHOME=/usr/local/java
# Each RAC node must have a unique ORACLE_SID. (i.e. racdb1, racdb2,...)
export ORACLE_SID=racdb1
export PATH=.:${JAVA_HOME}/bin:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS10=$ORACLE_HOME/nls/data
export NLS_DATE_FORMAT="DD-MON-YYYY HH24:MI:SS"
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
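Since the only per-node difference in this login script is ORACLE_SID, one option is to derive it from the host name so the same file can be copied to both nodes. The sketch below is an assumption-laden illustration, not the author's method: sid_for_host is a hypothetical helper, and the mapping simply hard-codes the linux1/racdb1 and linux2/racdb2 pairs used in this article.

```shell
# sid_for_host <hostname>: print the ORACLE_SID this article assigns to a node.
# Fails (non-zero exit) for any host name it does not recognise.
sid_for_host() {
    case "$1" in
        linux1|linux1.*) echo racdb1 ;;
        linux2|linux2.*) echo racdb2 ;;
        *) return 1 ;;
    esac
}

# Possible use in .bash_profile, in place of the fixed ORACLE_SID line:
#   ORACLE_SID=$(sid_for_host "$(hostname)") && export ORACLE_SID
```

Keeping the fixed export shown in the profile is equally valid; this only saves an edit when the file is copied between nodes.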
13. Configure the Linux Servers for Oracle
Perform the following configuration procedures on both Oracle RAC nodes in the cluster!
The kernel parameters and shell limits discussed in this section will need to be defined on both Oracle RAC nodes in the cluster
every time the machine is booted. This section provides information about setting those
kernel parameters required for Oracle. Instructions for placing them in
a startup script (/etc/sysctl.conf) are included in Section 16
("All Startup Commands for Both Oracle RAC Nodes").
Overview
This section focuses on configuring both Oracle RAC Linux servers -
getting each one prepared for the Oracle RAC 10g installation. This includes
verifying enough swap space, setting shared memory and semaphores, setting the
maximum number of file handles, setting the IP local port range, setting shell
limits for the oracle user, activating all kernel parameters for the system,
and finally verifying the correct date and time for both nodes in the cluster.
There are several different ways to
configure (set) these parameters. For the purpose of this article, I will
be making all changes permanent (through reboots) by placing all commands
in the /etc/sysctl.conf file.
Swap Space Considerations
- Installing Oracle Database 10g Release 2 on RHEL/OEL 5
requires a minimum of 1024MB of memory. (Note: An inadequate amount of
swap during the installation will cause the Oracle Universal Installer
to either "hang" or "die")
- To check the amount of memory you have, type:
# cat /proc/meminfo | grep MemTotal
MemTotal: 2074084 kB
- To check the amount of swap you have allocated, type:
# cat /proc/meminfo | grep SwapTotal
SwapTotal: 4128760 kB
- If you have less than 2048MB of memory (between your
RAM and SWAP), you can add temporary swap space by creating a temporary
swap file. This way you do not have to use a raw device or, even more
drastically, rebuild your system.
As root, make a file that will act as additional
swap space, let's say about 500MB:
# dd if=/dev/zero of=tempswap bs=1k count=500000
Now we should change the file permissions:
# chmod 600 tempswap
Finally, we format the file as swap space and add
it to the swap space:
# mkswap tempswap
# swapon tempswap
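The 2048MB comparison above can be computed directly from /proc/meminfo instead of adding the two figures by hand. This is a minimal sketch; the function name total_mem_mb is hypothetical, and it assumes the MemTotal and SwapTotal lines report kB as in the output shown earlier.

```shell
# total_mem_mb [meminfo file]: print RAM plus swap in MB (kB / 1024, rounded down).
# Reads /proc/meminfo unless another file is given.
total_mem_mb() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 } END { print int(kb / 1024) }' "${1:-/proc/meminfo}"
}

# Example:
#   [ "$(total_mem_mb)" -lt 2048 ] && echo "Consider adding temporary swap"
```

With the sample values shown above (2074084 kB of RAM and 4128760 kB of swap) the function prints 6057, so no temporary swap file would be needed.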
Configuring Kernel Parameters
Oracle Database 10g Release 2 on RHEL/OEL 5 requires the kernel parameter settings shown below.
The values given are minimums, so if your system uses a larger value, do not change it:
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
RHEL/OEL 5 already comes configured with default values
defined for the following kernel parameters:
kernel.shmall
kernel.shmmax
net.core.rmem_default
net.core.rmem_max
net.core.wmem_default
net.core.wmem_max
Use the default values if they are the same or larger than the required values.
This article assumes a fresh new install of Oracle Enterprise Linux 5
and as such, many of the required kernel parameters are already set (see above).
This being the case, you can simply copy / paste the
following to both Oracle RAC nodes while logged in as root:
# cat >> /etc/sysctl.conf <<EOF
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
EOF
The above command persisted the required kernel parameters through reboots
by inserting them in the /etc/sysctl.conf startup file.
Linux allows modification of these kernel parameters to the current system
while it is up and running,
so there's no need to reboot the system after making kernel parameter changes.
To activate the new kernel parameter values for the currently running system,
run the following as root on both Oracle RAC nodes in the cluster:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 4294967295
kernel.shmall = 268435456
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
Verify the new kernel parameter values by running the following on
both Oracle RAC nodes in the cluster:
# /sbin/sysctl -a | grep shm
vm.hugetlb_shm_group = 0
kernel.shmmni = 4096
kernel.shmall = 268435456
kernel.shmmax = 4294967295
# /sbin/sysctl -a | grep sem
kernel.sem = 250 32000 100 128
# /sbin/sysctl -a | grep file-max
fs.file-max = 65536
# /sbin/sysctl -a | grep ip_local_port_range
net.ipv4.ip_local_port_range = 1024 65000
# /sbin/sysctl -a | grep 'core\.[rw]mem'
net.core.rmem_default = 1048576
net.core.wmem_default = 262144
net.core.rmem_max = 1048576
net.core.wmem_max = 262144
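Because the required values are minimums, the comparison that matters is "current value is at least the required value" for each field. The following sketch shows one way to script that check; meets_minimum is a hypothetical helper, and it is meant for the minimum-style parameters such as kernel.sem and fs.file-max rather than for net.ipv4.ip_local_port_range, which is a range.

```shell
# meets_minimum "<current>" "<required>"
# Succeeds when every whitespace-separated field of the current value is
# greater than or equal to the matching field of the required value.
meets_minimum() {
    awk -v cur="$1" -v req="$2" 'BEGIN {
        n = split(cur, c)
        m = split(req, r)
        if (n != m) exit 1
        for (i = 1; i <= n; i++)
            if (c[i] + 0 < r[i] + 0) exit 1
        exit 0
    }'
}

# Example, run on each node:
#   meets_minimum "$(/sbin/sysctl -n kernel.sem)" "250 32000 100 128" || echo "kernel.sem too low"
#   meets_minimum "$(/sbin/sysctl -n fs.file-max)" 65536 || echo "fs.file-max too low"
```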
Setting Shell Limits for the oracle User
To improve the performance of the software on Linux systems, Oracle
recommends you increase the following shell limits for the oracle user:
| Shell Limit | Item in limits.conf | Hard Limit |
| Maximum number of open file descriptors | nofile | 65536 |
| Maximum number of processes available to a single user | nproc | 16384 |
To make these changes, run the following as root:
cat >> /etc/security/limits.conf <<EOF
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
EOF
cat >> /etc/pam.d/login <<EOF
session required /lib/security/pam_limits.so
EOF
Update the default shell startup file for the "oracle" UNIX account.
- For the Bourne, Bash, or Korn shell, add the
following lines to the /etc/profile file by
running the following command:
cat >> /etc/profile <<EOF
if [ \$USER = "oracle" ]; then
    if [ \$SHELL = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
    else
        ulimit -u 16384 -n 65536
    fi
    umask 022
fi
EOF
- For the C shell (csh or tcsh), add the following
lines to the /etc/csh.login file by running the
following command:
cat >> /etc/csh.login <<EOF
if ( \$USER == "oracle" ) then
    limit maxproc 16384
    limit descriptors 65536
endif
EOF
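After logging in again as the oracle user, you may want to confirm that the new hard limits actually took effect. The sketch below assumes a Bash login shell (as configured for the oracle account earlier); limit_ok is a hypothetical helper name.

```shell
# limit_ok <current> <required>
# Succeeds when the current limit is "unlimited" or is numerically at least
# the required value.
limit_ok() {
    current=$1
    required=$2
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}

# Example, run as the oracle user in a fresh login shell:
#   limit_ok "$(ulimit -Hn)" 65536 || echo "hard nofile limit too low"
#   limit_ok "$(ulimit -Hu)" 16384 || echo "hard nproc limit too low"
```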
Setting the Correct Date and Time on All Cluster Nodes
During the installation of Oracle Clusterware, the Database, and the
Companion CD, the Oracle Universal Installer (OUI) first installs the
software to the local node running the installer (i.e. linux1). The
software is then copied remotely to all of the remaining nodes in the
cluster (i.e. linux2). During the remote copy process, the OUI will
execute the UNIX "tar" command on each of the
remote nodes to extract the files that were archived and copied over.
If the date and time on the node performing the install are later than
those of the node it is copying to, the OUI will throw an error from the
"tar" command indicating it is attempting to
extract files stamped with a time in the future:
Error while copying directory
/u01/app/crs with exclude file list 'null' to nodes 'linux2'.
[PRKC-1002 : All the submitted commands did not execute successfully]
---------------------------------------------
linux2:
/bin/tar: ./bin/lsnodes: time stamp 2009-07-28 09:21:34 is 735 s in the future
/bin/tar: ./bin/olsnodes: time stamp 2009-07-28 09:21:34 is 735 s in the future
...(more errors on this node)
Please note that although this would seem like a severe error from the
OUI, it can safely be disregarded as a warning. The "tar"
command DOES actually extract the files; however, when you perform a
listing of the files (using ls -l) on the remote
node, they will be missing the time field until the time on the server
is greater than the timestamp of the file.
Before starting any of the above noted installations, ensure that each
member node of the cluster is set as closely as possible to the same
date and time. Oracle strongly recommends using the Network
Time Protocol feature of most operating systems for this
purpose, with both Oracle RAC nodes using the same reference Network Time Protocol
server.
Accessing a Network Time Protocol server, however, may not always be an
option. In this case, when manually setting the date and time for the
nodes in the cluster, ensure that the date and time of the node you are
performing the software installations from (linux1) are earlier than those of all
other nodes in the cluster (linux2). I generally use a 20 second
difference as shown in the following example:
Setting the date and time from linux1:
# date -s "7/28/2009 23:00:00"
Setting the date and time from linux2:
# date -s "7/28/2009 23:00:20"
The two-node RAC configuration described in this article does not make
use of a Network Time Protocol server.
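When setting the clocks by hand, it can be useful to measure the offset between the nodes afterwards. The following is a rough sketch only: clock_offset is a hypothetical helper, the result includes SSH round-trip delay, and at this point in the article ssh will still prompt for a password.

```shell
# clock_offset <local epoch seconds> <remote epoch seconds>
# Prints how many seconds the remote clock is ahead of the local one
# (negative when the remote clock is behind).
clock_offset() {
    echo $(( $2 - $1 ))
}

# Example, run from linux1. The remote time is read first so that time spent
# at the password prompt does not distort the result:
#   remote=$(ssh linux2 date +%s)
#   clock_offset "$(date +%s)" "$remote"
```

A small positive number means linux2 is ahead of linux1, which is the arrangement recommended above for the node running the installer.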
14. Configure the hangcheck-timer Kernel Module
Perform the following configuration procedures on both Oracle RAC nodes in the cluster!
Oracle9i Release 1 (9.0.1) and
Oracle9i Release 2 (9.2.0.1) used a userspace watchdog
daemon called watchdogd to monitor the
health of the cluster and to restart a RAC node in case of a failure.
Starting with Oracle9i Release 2 (9.2.0.2) (and
still available in Oracle 10g Release 2), the
watchdog daemon has been replaced by a Linux kernel module named hangcheck-timer,
which addresses availability and reliability problems much better. The hangcheck-timer
module is loaded into the Linux kernel and checks whether the system hangs.
It sets a timer and checks the timer after a certain amount of time.
If a configurable threshold is exceeded, the module reboots the machine.
Although the hangcheck-timer
module is not required for Oracle Clusterware (Cluster Manager)
operation, it is highly recommended by Oracle.
The hangcheck-timer.ko Module
The hangcheck-timer module uses a kernel-based timer
that periodically checks the system task scheduler to catch delays in
order to determine the health of the system. If the system hangs or
pauses, the timer resets the node. The hangcheck-timer module uses the
Time Stamp Counter (TSC) CPU register, which is incremented at each
clock signal. The TSC offers much more accurate time measurements
because this register is updated by the hardware automatically.
Much more information about the hangcheck-timer project
can be found here.
Installing the hangcheck-timer.ko Module
The hangcheck-timer was originally shipped only by
Oracle; however, this module is now included with Red Hat Linux AS
starting with kernel version 2.4.9-e.12. The
hangcheck-timer should already be included. Use the following to ensure
that you have the module included:
# find /lib/modules -name "hangcheck-timer.ko"
/lib/modules/2.6.18-128.el5/kernel/drivers/char/hangcheck-timer.ko
In the above output, we care about the hangcheck timer object (hangcheck-timer.ko)
in the /lib/modules/2.6.18-128.el5/kernel/drivers/char
directory since this is the kernel we are running.
Configuring and Loading the hangcheck-timer Module
There are two key parameters to the hangcheck-timer module:
- hangcheck_tick:
This parameter defines the period of time between checks of system
health. The default value is 60 seconds; Oracle recommends setting it
to 30 seconds.
- hangcheck_margin:
This parameter defines the maximum hang delay that should be tolerated
before hangcheck-timer resets the RAC node. It defines the margin of
error in seconds. The default value is 180 seconds; Oracle recommends
setting it to 180 seconds.
Note: The two hangcheck-timer
module parameters indicate how long a RAC node must hang before it will
reset the system. A node reset will occur when the following is true:
system hang time > (hangcheck_tick + hangcheck_margin)
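As a quick worked example of this condition, the sketch below adds the two parameters to get the hang time a node must exceed before it is reset. The function name hangcheck_threshold is hypothetical; the inputs are the Oracle-recommended values from this section.

```shell
# hangcheck_threshold <hangcheck_tick> <hangcheck_margin>
# Prints the number of seconds a node must hang for longer than before the
# hangcheck-timer module resets it.
hangcheck_threshold() {
    echo $(( $1 + $2 ))
}

# Oracle-recommended values: tick of 30 seconds, margin of 180 seconds.
hangcheck_threshold 30 180    # prints 210
```

With the recommended settings, a node is reset only after it has been hung for more than 210 seconds.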
Configuring Hangcheck Kernel Module Parameters
Each time the hangcheck-timer kernel module is loaded
(manually or by Oracle), it needs to know what value to use for each of
the two parameters we just discussed (hangcheck_tick
and hangcheck_margin). These values need to be
available after each reboot of the Linux server. To do that, make an
entry with the correct values to the /etc/modprobe.conf
file as follows:
# su -
# echo "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" >> /etc/modprobe.conf
Each time the hangcheck-timer kernel module gets loaded, it will use
the values defined by the entry I made in the /etc/modprobe.conf
file.
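Note that appending with >> adds a duplicate line each time the command is re-run. If you expect to repeat these steps, a guard like the following sketch avoids that; append_once is a hypothetical helper name, not a standard utility.

```shell
# append_once <line> <file>
# Appends the line to the file only if the file does not already contain
# exactly that line. Creates the file if it does not exist.
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example, run as root:
#   append_once "options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180" /etc/modprobe.conf
```

The same helper could be applied to the other configuration files in this article that are modified by appending.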
Manually Loading the Hangcheck Kernel Module for Testing
Oracle is responsible for loading the hangcheck-timer
kernel module when required. For that reason, it is not required to
perform a modprobe or insmod
of the hangcheck-timer kernel module in any of the startup files (i.e. /etc/rc.local).
It is only out of pure habit that I continue to include
a modprobe of the hangcheck-timer kernel module
in the /etc/rc.local file. Someday I will get
over it, but realize that it does not hurt to include a modprobe
of the hangcheck-timer kernel module during startup.
So to keep myself sane and able to sleep at night, I
always configure the loading of the hangcheck-timer kernel module on
each startup as follows:
# echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.local
Note: You don't have to manually load the
hangcheck-timer kernel module using modprobe or insmod
after each reboot. The hangcheck-timer module
will be loaded by Oracle automatically when needed.
Now, to test the hangcheck-timer kernel module to
verify it is picking up the correct parameters we defined in the /etc/modprobe.conf
file, use the modprobe command. Although you
could load the hangcheck-timer kernel module by
passing it the appropriate parameters (e.g. insmod
hangcheck-timer hangcheck_tick=30 hangcheck_margin=180), we
want to verify that it is picking up the options we set in the /etc/modprobe.conf
file.
To manually load the hangcheck-timer kernel module and
verify it is using the correct values defined in the /etc/modprobe.conf
file, run the following command:
# su -
# modprobe hangcheck-timer
# grep Hangcheck /var/log/messages | tail -2
Jul 30 17:50:43 linux1 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).
Jul 30 17:50:43 linux1 kernel: Hangcheck: Using get_cycles().
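If you would rather compare the loaded values in a script than read the log line, the tick and margin can be extracted from the message. This is a sketch that assumes the exact message wording shown above; hangcheck_values is a hypothetical helper name.

```shell
# hangcheck_values: read kernel log lines on standard input and print
# "<tick> <margin>" for each hangcheck-timer startup message found.
hangcheck_values() {
    sed -n 's/.*tick is \([0-9][0-9]*\) seconds, margin is \([0-9][0-9]*\) seconds.*/\1 \2/p'
}

# Example, run as root:
#   grep Hangcheck /var/log/messages | hangcheck_values | tail -1
```

For the log output shown above, the example prints 30 180, matching the entry made in /etc/modprobe.conf.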
15. Configure RAC Nodes for Remote Access using SSH
Perform the following configuration procedures on both Oracle RAC nodes in the cluster!
Before you can install Oracle RAC 10g, you
must configure secure shell (SSH) for the UNIX user account
you plan to use to install Oracle Clusterware 10g and the
Oracle Database 10g software. The installation and configuration
tasks described in this section will need to be performed on
both Oracle RAC nodes. As configured earlier in this article, the software owner for
Oracle Clusterware 10g and the Oracle Database 10g software
will be "oracle".
The goal here is to
set up user equivalence for the oracle UNIX user
account. User equivalence enables the oracle UNIX user account to
access all other nodes in the cluster (running commands and copying
files) without the need for a password. Oracle added
support in 10g Release 1 for using the SSH tool
suite for setting up user equivalence. Before Oracle Database 10g,
user equivalence had to be configured using remote shell (RSH).
The SSH configuration described in this article uses SSH1. If SSH is not
available, then OUI attempts to use rsh and rcp instead.
These services, however, are disabled by default on most Linux systems.
The use of RSH will not be discussed in this article.
You need either an RSA or a DSA key for the SSH
protocol. RSA is used with the SSH 1.5 protocol, while DSA is the default
for the SSH 2.0 protocol. With OpenSSH, you can
use either RSA or DSA. For the purpose of this article, we will configure
SSH using SSH1.
Note: If you have an SSH2 installation,
and you cannot use SSH1, then refer to your SSH distribution documentation
to configure SSH1 compatibility or to configure SSH2 with DSA. This type of configuration
is beyond the scope of this article and will not be discussed.
So, why do we have to set up user equivalence? Installing Oracle
Clusterware and the Oracle Database software is only performed from one
node in a RAC cluster. When running the Oracle Universal Installer
(OUI) on that particular node, it will use the ssh
and scp commands to run
remote commands on and copy files (the Oracle software) to all other
nodes within the RAC cluster. The oracle UNIX user account on the
node running the OUI (runInstaller) must be
trusted by all other nodes in your RAC cluster. This means that you
must be able to run the secure shell commands (ssh
or scp) on the Linux server you will be running
the OUI from against all other Linux servers in the cluster without
being prompted for a password.
Please note that the use of secure shell is not required for
normal RAC operation. This configuration, however, must be enabled
for RAC and patchset installations as well as creating the clustered
database.
The methods required for configuring SSH1, an RSA key, and user equivalence are described
in the following sections.
Configuring the Secure Shell
To determine if SSH is installed and running, enter the following
command:
# pgrep sshd
2400
If SSH is
running, then the response to this command is a list of process ID
number(s). Run this command on both Oracle RAC nodes in the cluster to
verify the SSH daemons are installed and running!
To find out more about SSH, refer to the man page:
# man ssh
Creating the RSA Keys on Both Oracle RAC Nodes
The first step in configuring SSH is to create an RSA public/private key pair on
both Oracle RAC nodes in the cluster. The command to do this will create a public
and private key for RSA (for a total of two keys per
node). The content of the RSA public keys will then need to be
copied into an authorized key file which is then
distributed to both Oracle RAC nodes in the cluster.
Use the following steps to create the RSA key pair. Please
note that these steps will need to be completed on both Oracle RAC nodes
in the cluster:
- Log on as the oracle UNIX user account.
# su - oracle
- If necessary, create the .ssh
directory in the oracle user's home directory and set the correct
permissions to ensure that only the oracle user has read and write
permissions:
$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
- Enter the following command to generate an RSA
key pair (public and private key) for the SSH protocol:
$ /usr/bin/ssh-keygen -t rsa
At the prompts:
- Accept the default location for the key files (press [ENTER]).
- Enter and confirm a pass phrase.
This should be different from the oracle UNIX user account password;
however, this is not a requirement.
This command will write the public key to the ~/.ssh/id_rsa.pub
file and the private key to the ~/.ssh/id_rsa
file. Note that you should never distribute the
private key to anyone!
- Repeat the above steps for both Oracle RAC nodes in the cluster.
Now that both Oracle RAC nodes contain a public and private key for RSA,
you will need to create an authorized key file on
one of the nodes. An authorized key file is nothing more than a single
file that contains a copy of everyone's (every node's) RSA public key.
Once the authorized key file contains all of the public
keys, it is then distributed to all other nodes in the cluster.
Complete the following steps on one of the nodes in the cluster to
create and then distribute the authorized key file. For the purpose of
this article, I am using linux1:
- First, determine if an authorized key file already exists on the
node (~/.ssh/authorized_keys).
In most cases this will not exist since this article assumes you are
working with a new install. If the file doesn't exist, create it now:
$ touch ~/.ssh/authorized_keys
$ cd ~/.ssh
$ ls -l *.pub
-rw-r--r-- 1 oracle oinstall 395 Jul 30 18:51 id_rsa.pub
The listing above should show the id_rsa.pub
public key created in the previous section.
- In this step, use SSH to copy the content of the
~/.ssh/id_rsa.pub public key from both
Oracle RAC nodes in the cluster to the authorized key file
just created (~/.ssh/authorized_keys); SCP (Secure Copy) or SFTP (Secure FTP)
would work as well. Again, this
will be done from linux1. You will be prompted
for the oracle UNIX user account password for both Oracle RAC nodes accessed.
The following example is being run from linux1
and assumes a two-node cluster, with nodes linux1
and linux2:
$ ssh linux1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'linux1 (192.168.1.100)' can't be established.
RSA key fingerprint is 46:0f:1b:ac:a6:9d:86:4d:38:45:85:76:ad:3b:e7:c9.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'linux1,192.168.1.100' (RSA) to the list of known hosts.
oracle@linux1's password: xxxxx
$ ssh linux2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'linux2 (192.168.1.101)' can't be established.
RSA key fingerprint is eb:39:98:a7:d9:61:02:a2:80:3b:75:71:70:42:d0:07.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'linux2,192.168.1.101' (RSA) to the list of known hosts.
oracle@linux2's password: xxxxx
Note: The first time you use SSH to
connect to a node from a particular system, you will see a message
similar to the following:
The authenticity of host 'linux1 (192.168.1.100)' can't be established.
RSA key fingerprint is 46:0f:1b:ac:a6:9d:86:4d:38:45:85:76:ad:3b:e7:c9.
Are you sure you want to continue connecting (yes/no)? yes
Enter yes at the prompt to continue. You will
not see this message again when you connect from this system to the
same node.
- At this point, we have the RSA
public key from every node in the cluster in the authorized
key file (~/.ssh/authorized_keys) on linux1.
We now need to copy it to the remaining nodes in the cluster. In our
two-node cluster example, the only remaining node is linux2.
Use the scp command to copy the authorized key
file to all remaining nodes in the cluster:
$ scp ~/.ssh/authorized_keys linux2:.ssh/authorized_keys
oracle@linux2's password: xxxxx
authorized_keys 100% 790 0.8KB/s 00:00
- Change the permission of the authorized key file
for both Oracle RAC nodes in the cluster by logging into the node and running the
following:
$ chmod 600 ~/.ssh/authorized_keys
- At this point, if you use ssh
to log in to or run a command on another node, you are prompted for the
pass phrase that you specified when you created the RSA key. For
example, test the following from linux1:
$ ssh linux1 hostname
Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
linux1
$ ssh linux2 hostname
Enter passphrase for key '/home/oracle/.ssh/id_rsa': xxxxx
linux2
Note: If you see any other messages or
text, apart from the host name, then the Oracle installation will fail.
Make any changes required to ensure that only the host name is
displayed when you enter these commands. You should ensure that any
part of a login script that generates any output, or asks any
questions, is modified so it acts only when the shell is an
interactive shell.
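One common way to restrict login-script output to interactive shells is to test the shell's option flags, which contain the letter i only for interactive sessions. The following is a minimal sketch for a Bourne-style login script; is_interactive is a hypothetical helper name and the greeting is only a placeholder for whatever output your script produces.

```shell
# is_interactive <shell flags>
# Succeeds when the flags (normally "$-") include i, which the shell sets
# for interactive sessions.
is_interactive() {
    case "$1" in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

# In a login script, wrap anything that prints or prompts:
if is_interactive "$-"; then
    echo "Logged in to $(hostname) as $(id -un)"
fi
```

With a guard like this in place, a command such as ssh linux2 hostname runs a non-interactive shell and returns only the host name.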
Enabling SSH User Equivalency for the Current Shell Session
The OUI needs to run the secure shell tool
commands (ssh and scp)
without being prompted for a pass phrase. Even though SSH is configured
on both Oracle RAC nodes in the cluster, using the secure shell tool commands will
still prompt for a pass phrase. Before running the OUI, you need to
enable user equivalence for the terminal session you plan to run the
OUI from. For the purpose of this article, all Oracle installations
will be performed from linux1.
User equivalence will need to be enabled on any new terminal shell session
before attempting to run the OUI. If you log out and log back in to the
node you will be performing the Oracle installation from, you must
enable user equivalence for the terminal shell session as this is not
done by default.
To enable user equivalence for the current terminal shell session, perform
the following steps:
- Log on to the node where you want to run the OUI from (linux1)
as the oracle UNIX user account.
# su - oracle
- Enter the following commands:
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
At the prompts, enter the pass phrase for each key that you generated.
- If SSH is configured correctly, you will be able
to use the ssh and scp
commands without being prompted for a password or pass phrase from this
terminal session:
$ ssh linux1 "date;hostname"
Thu Jul 30 19:32:49 EDT 2009
linux1
$ ssh linux2 "date;hostname"
Thu Jul 30 19:33:06 EDT 2009
linux2
Note: The commands above should
display the date set on each Oracle RAC node along with its hostname. If any of
the nodes prompt for a password or pass phrase then verify that the ~/.ssh/authorized_keys
file on that node contains the correct public keys. Also, if you see
any other messages or text, apart from the date and hostname, then the
Oracle installation will fail. Make any changes required to ensure that
only the date and hostname are displayed when you enter these commands. You should ensure that any
part of a login script that generates any output, or asks any
questions, is modified so it acts only when the shell is an
interactive shell.
- The Oracle Universal Installer is a GUI
interface and requires the use of an X Server. From the terminal
session enabled for user equivalence (the node you will be performing
the Oracle installations from), set the environment variable DISPLAY
to a valid X Windows display:
Bourne, Korn, and Bash shells:
$ DISPLAY=<Any X-Windows Host>:0
$ export DISPLAY
C shell:
$ setenv DISPLAY <Any X-Windows Host>:0
After setting the DISPLAY variable to a valid X
Windows display, you should perform another test of the current
terminal session to ensure that X11 forwarding is
not enabled:
$ ssh linux1 hostname
linux1
$ ssh linux2 hostname
linux2
Note: If you are using a remote client
to connect to the node performing the installation, and you see a
message similar to: "Warning: No xauth data; using fake
authentication data for X11 forwarding." then this means
that your authorized keys file is configured correctly; however, your
SSH configuration has X11 forwarding enabled. For
example:
$ export DISPLAY=melody:0
$ ssh linux2 hostname
Warning: No xauth data; using fake authentication data for X11 forwarding.
linux2
Note that having X11 Forwarding enabled will cause the Oracle
installation to fail. To correct this problem, create a user-level SSH
client configuration file for the oracle UNIX user account that
disables X11 Forwarding:
- Using a text editor, edit or create the file
~/.ssh/config
- Make sure that the ForwardX11
attribute is set to no. For example, insert the
following into the ~/.ssh/config file:
Host *
    ForwardX11 no
- You must run the Oracle Universal Installer
from this terminal session or remember to repeat the steps to enable
user equivalence (steps 2, 3, and 4 from this section) before you start
the Oracle Universal Installer from a different terminal session.
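The equivalence and X11 checks in this section can be gathered into a small script. The following is only a minimal sketch and not part of the original procedure: the function names output_is_clean and check_nodes are hypothetical, and it assumes that hostname on each node prints the same short name you pass to ssh. The BatchMode=yes option makes ssh fail rather than prompt, and stderr is captured so that an X11 forwarding warning counts as a failure.

```shell
# Hypothetical helper: succeeds only when the captured output is exactly
# the expected host name (no banners, warnings, or prompts).
output_is_clean() {
    expected="$1"
    output="$2"
    [ "$output" = "$expected" ]
}

# Hypothetical helper: test each node named on the command line.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_nodes() {
    for node in "$@"; do
        if out=$(ssh -o BatchMode=yes "$node" hostname 2>&1) \
           && output_is_clean "$node" "$out"; then
            echo "$node: OK"
        else
            echo "$node: FAILED (output was: $out)"
        fi
    done
}

# Example (run from the session where ssh-agent holds the key):
#   check_nodes linux1 linux2
```

Any node reported as FAILED should be fixed before starting the OUI from this session.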
Remove any stty Commands
When installing the Oracle software, any hidden files on the system
(e.g. .bashrc, .cshrc, .profile)
will cause the installation process to fail if they contain stty
commands.
To avoid this problem, you must modify these files to suppress all
output on STDERR, for example by running stty only when the shell is attached to a terminal.
Note: If there are hidden files that
contain stty commands that are loaded by the
remote shell, then OUI indicates an error and stops the installation.
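One common way to guard an stty call in a Bourne, Bash, or Korn shell startup file is to test whether standard input is a terminal. The block below is a minimal sketch of that idea; the function name set_terminal_options is hypothetical, and "stty intr ^C" stands in for whatever stty command your file actually contains.

```shell
# Run stty only when standard input is a terminal, so that the
# non-interactive shells started by ssh and scp skip it entirely.
# "stty intr ^C" is only an illustrative stty command.
set_terminal_options() {
    if [ -t 0 ]; then
        stty intr ^C
    fi
}

set_terminal_options
```

Because the test fails for the remote shells used by the installer, nothing is written to STDERR during the installation.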
16. All Startup Commands for Both Oracle RAC Nodes
Verify that the following startup commands are
included on both of the Oracle RAC nodes in the cluster!
Up to this point, we have talked in great detail about the parameters and resources that
need to be configured on both nodes in the Oracle RAC 10g configuration.
This section will review those parameters, commands, and entries
(in previous sections of this document) that need to occur on both
Oracle RAC nodes when they are booted.
For each of the startup files below, entries in gray
should be included in each startup file.
/etc/modprobe.conf
(All parameters and values to be used by kernel modules.)
.................................................................
alias eth0 r8169
alias eth1 e1000
alias scsi_hostadapter ata_piix
alias snd-card-0 snd-intel8x0
options snd-card-0 index=0
options snd-intel8x0 index=0
remove snd-intel8x0 { /usr/sbin/alsactl store 0 >/dev/null 2>&1 || : ; }; /sbin/modprobe -r --ignore-remove snd-intel8x0
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
.................................................................
/etc/sysctl.conf
(We wanted to adjust the default and maximum
send buffer size as well as the default and maximum receive buffer size
for the interconnect. This file also contains those parameters
responsible for configuring shared memory, semaphores, file handles,
and local IP range for use by the Oracle instance.)
.................................................................
# Kernel sysctl configuration file for Oracle Enterprise Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename
# Useful for debugging multi-threaded applications
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Controls the maximum size of a message queue, in bytes
kernel.msgmnb = 65536
# Controls the maximum size of a single message, in bytes
kernel.msgmax = 65536
# +---------------------------------------------------------+
# | ADJUSTING NETWORK SETTINGS |
# +---------------------------------------------------------+
# | With Oracle 9.2.0.1 and onwards, Oracle now makes use |
# | of UDP as the default protocol on Linux for |
# | inter-process communication (IPC), such as Cache Fusion |
# | and Cluster Manager buffer transfers between instances |
# | within the RAC cluster. Oracle strongly suggests to |
# | adjust the default and maximum receive buffer size |
# | (SO_RCVBUF socket option) to 1024 KB, and the default |
# | and maximum send buffer size (SO_SNDBUF socket option) |
# | to 256 KB. The receive buffers are used by TCP and UDP |
# | to hold received data until it is read by the |
# | application. With TCP, the receive buffer cannot |
# | overflow because the peer is not allowed to send data |
# | beyond the buffer size window. UDP has no such flow |
# | control, so datagrams are discarded if they don't fit |
# | in the socket receive buffer, and a fast sender can |
# | overwhelm the receiver. |
# +---------------------------------------------------------+
# +---------------------------------------------------------+
# | Default setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option. |
# +---------------------------------------------------------+
net.core.rmem_default=1048576
# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "receive" buffer |
# | which may be set by using the SO_RCVBUF socket option. |
# +---------------------------------------------------------+
net.core.rmem_max=1048576
# +---------------------------------------------------------+
# | Default setting in bytes of the socket "send" buffer |
# | which may be set by using the SO_SNDBUF socket option. |
# +---------------------------------------------------------+
net.core.wmem_default=262144
# +---------------------------------------------------------+
# | Maximum setting in bytes of the socket "send" buffer |
# | which may be set by using the SO_SNDBUF socket option. |
# +---------------------------------------------------------+
net.core.wmem_max=262144
# +---------------------------------------------------------+
# | ADJUSTING ADDITIONAL KERNEL PARAMETERS FOR ORACLE |
# +---------------------------------------------------------+
# | Configure the kernel parameters for all Oracle Linux |
# | servers by setting shared memory and semaphores, |
# | setting the maximum amount of file handles, and setting |
# | the IP local port range. |
# +---------------------------------------------------------+
# +---------------------------------------------------------+
# | SHARED MEMORY |
# +---------------------------------------------------------+
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 4294967295
# Controls the total amount of shared memory system wide, in pages
kernel.shmall = 268435456
# Controls the maximum number of shared memory segments system wide
kernel.shmmni = 4096
# +---------------------------------------------------------+
# | SEMAPHORES |
# | ---------- |
# | |
# | SEMMSL_value SEMMNS_value SEMOPM_value SEMMNI_value |
# | |
# +---------------------------------------------------------+
kernel.sem=250 32000 100 128
# +---------------------------------------------------------+
# | FILE HANDLES |
# +---------------------------------------------------------+
fs.file-max=65536
# +---------------------------------------------------------+
# | LOCAL IP RANGE |
# +---------------------------------------------------------+
net.ipv4.ip_local_port_range=1024 65000
.................................................................
Note: Verify that each of the
required kernel parameters (above) is configured in the
/etc/sysctl.conf file. Then, ensure that each of these parameters is truly in
effect by running the following command on both Oracle RAC nodes in the cluster:
# sysctl -p
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
kernel.shmmax = 4294967295
kernel.shmall = 268435456
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
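To compare individual settings rather than read the whole listing, a small parser for the file can help. This is a minimal sketch under stated assumptions: conf_value and bytes_to_kb are hypothetical helper names, and the parser only handles simple "key = value" lines. The second helper shows the arithmetic behind the buffer sizes: 1048576 bytes is 1024 KB and 262144 bytes is 256 KB.

```shell
# Hypothetical helper: print the value assigned to a key in a
# sysctl.conf-style file. Comment lines are skipped and spaces
# around "=" are ignored.
conf_value() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        {
            k = $1
            gsub(/[ \t]/, "", k)
            if (k == key) {
                v = $2
                sub(/^[ \t]+/, "", v)
                sub(/[ \t]+$/, "", v)
                print v
            }
        }' "$1"
}

# Hypothetical helper: convert a byte count to kilobytes.
bytes_to_kb() {
    echo $(( $1 / 1024 ))
}

# Example: compare the file with the running kernel.
#   [ "$(conf_value /etc/sysctl.conf net.core.rmem_max)" = "$(sysctl -n net.core.rmem_max)" ] \
#       && echo "net.core.rmem_max matches"
```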
/etc/hosts
(All machine/IP entries for nodes in our RAC cluster.)
.................................................................
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
# Public Virtual IP (VIP) addresses - (eth0:1)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
# Private Storage Network for Openfiler - (eth1)
192.168.1.195 openfiler1
192.168.2.195 openfiler1-priv
# Miscellaneous Nodes
192.168.1.1 router
192.168.1.102 alex
192.168.1.103 nascar
192.168.1.105 packmule
192.168.1.106 melody
192.168.1.120 cartman
192.168.1.121 domo
192.168.1.122 switch1
192.168.1.190 george
192.168.1.245 accesspoint
.................................................................
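Since both nodes need the same public, private, and VIP names, it can be worth checking the file mechanically on each node. The following is a minimal sketch; hosts_lookup and missing_hosts are hypothetical helper names, and the parser ignores trailing comments on an entry line.

```shell
# Hypothetical helper: print the IP address that a hosts-format file
# maps to a given name.
hosts_lookup() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# Hypothetical helper: print every required name that is missing
# from the file. Prints nothing when all names are present.
missing_hosts() {
    file="$1"
    shift
    for name in "$@"; do
        [ -n "$(hosts_lookup "$file" "$name")" ] || echo "$name"
    done
}

# Example:
#   missing_hosts /etc/hosts linux1 linux2 linux1-priv linux2-priv linux1-vip linux2-vip
```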
/etc/udev/rules.d/55-openiscsi.rules
.................................................................
# /etc/udev/rules.d/55-openiscsi.rules
KERNEL=="sd*", BUS=="scsi", PROGRAM="/etc/udev/scripts/iscsidev.sh %b",SYMLINK+="iscsi/%c/part%n"
.................................................................
/etc/udev/scripts/iscsidev.sh
.................................................................
#!/bin/sh
# FILE: /etc/udev/scripts/iscsidev.sh
BUS=${1}
HOST=${BUS%%:*}
[ -e /sys/class/iscsi_host ] || exit 1
file="/sys/class/iscsi_host/host${HOST}/device/session*/iscsi_session*/targetname"
target_name=$(cat ${file})
# This is not an open-iscsi drive
if [ -z "${target_name}" ]; then
    exit 1
fi
echo "${target_name##*.}"
.................................................................
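The script relies on two shell parameter expansions that are easy to misread. The sketch below runs them on sample values; both values are assumptions chosen only for illustration (a SCSI bus ID of the form host:channel:target:lun, and an iSCSI target name ending in ".crs").

```shell
# Sample values; both are assumptions chosen only to show the expansions.
BUS="3:0:0:1"                                       # what udev passes as %b
target_name="iqn.2006-01.com.openfiler:racdb.crs"   # illustrative target name

# ${BUS%%:*} removes the longest suffix starting at a ":", which
# leaves the SCSI host number.
HOST=${BUS%%:*}
echo "host number: ${HOST}"

# ${target_name##*.} removes the longest prefix ending in ".", which
# leaves the text after the last dot. udev substitutes this for %c.
echo "link name: ${target_name##*.}"
```

With these sample values, the udev rule would create the symbolic link /dev/iscsi/crs/part1 for the first partition.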
/etc/rc.local
(Loading the hangcheck-timer kernel module.)
.................................................................
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
# +---------------------------------------------------------+
# | HANGCHECK TIMER |
# | (I do not believe this is required, but doesn't hurt) |
# +---------------------------------------------------------+
/sbin/modprobe hangcheck-timer
.................................................................
17.
Install & Configure Oracle Cluster File System (OCFS2)
Most of the installation and configuration procedures in this section
should be performed on both Oracle RAC nodes in the cluster!
Creating the OCFS2 file system, however, should only be executed on one of
the nodes in the RAC cluster.
It is now time to install and configure the Oracle Cluster File System, Release 2 (OCFS2) software.
Developed by Oracle Corporation, OCFS2 is a Cluster File System which allows all nodes in a cluster
to concurrently access a device via the standard file system interface.
This allows for easy management of applications that need to run across a
cluster.
OCFS Release 1 was released in December 2002 to enable Oracle Real
Application Cluster (RAC) users to run the clustered database without
having to deal with RAW devices. The file system was designed to store
database related files, such as data files, control files, redo logs,
archive logs, etc. OCFS2 is the next generation of the Oracle Cluster
File System. It has been designed to be a general purpose cluster file
system. With it, users can store not only database related files on a shared
disk, but also store Oracle binaries and configuration files (a shared Oracle Home for example)
making management of RAC even easier.
In this guide, you will be using the release of OCFS2 included with
Oracle Enterprise Linux Release 5.3 (OCFS2 Release 1.2.9-1)
to store the two files that are required to be shared by the Oracle
Clusterware software. Along with these two files, you will also be
using this space to store the shared SPFILE for all Oracle
ASM instances.
See this
page for more information on OCFS2 (including Installation
Notes) for Linux.
Install OCFS2
In previous editions of this article, this would be the time where you would need to download
the OCFS2 software from http://oss.oracle.com/.
This is no longer necessary since the OCFS2 software is included with Oracle Enterprise Linux.
The OCFS2 software stack includes the following packages:
32-bit (x86) Installations
- OCFS2 Kernel Driver
- ocfs2-x.x.x-x.el5-x.x.x-x.el5.i686.rpm - (for default kernel)
- ocfs2-x.x.x-x.el5PAE-x.x.x-x.el5.i686.rpm - (for PAE kernel)
- ocfs2-x.x.x-x.el5xen-x.x.x-x.el5.i686.rpm - (for xen kernel)
- OCFS2 Tools
- ocfs2-tools-x.x.x-x.el5.i386.rpm
- OCFS2 Tools Development
- ocfs2-tools-devel-x.x.x-x.el5.i386.rpm
- OCFS2 Console
- ocfs2console-x.x.x-x.el5.i386.rpm
64-bit (x86_64) Installations
- OCFS2 Kernel Driver
- ocfs2-x.x.x-x.el5-x.x.x-x.el5.x86_64.rpm - (for default kernel)
- ocfs2-x.x.x-x.el5PAE-x.x.x-x.el5.x86_64.rpm - (for PAE kernel)
- ocfs2-x.x.x-x.el5xen-x.x.x-x.el5.x86_64.rpm - (for xen kernel)
- OCFS2 Tools
- ocfs2-tools-x.x.x-x.el5.x86_64.rpm
- OCFS2 Tools Development
- ocfs2-tools-devel-x.x.x-x.el5.x86_64.rpm
- OCFS2 Console
- ocfs2console-x.x.x-x.el5.x86_64.rpm
With Oracle Enterprise Linux 5.3, the OCFS2 software packages do not get
installed by default. The OCFS2 software packages can be found on CD #3. To determine if
the OCFS2 packages are installed (which in most cases, they will not be),
perform the following on both Oracle RAC nodes:
# rpm -qa | grep ocfs2 | sort
If the OCFS2 packages are not installed, load the Oracle Enterprise Linux CD #3
into each of the Oracle RAC nodes and perform the following:
From Oracle Enterprise Linux 5 - [CD #3]
# mount -r /dev/cdrom /media/cdrom
# cd /media/cdrom/Server
# rpm -Uvh ocfs2-tools-1.2.7-1.el5.i386.rpm
# rpm -Uvh ocfs2-2.6.18-128.el5-1.2.9-1.el5.i686.rpm
# rpm -Uvh ocfs2console-1.2.7-1.el5.i386.rpm
# cd /
# eject
After installing the OCFS2 packages, verify from both Oracle
RAC nodes that the software is installed:
# rpm -qa | grep ocfs2 | sort
ocfs2-2.6.18-128.el5-1.2.9-1.el5
ocfs2console-1.2.7-1.el5
ocfs2-tools-1.2.7-1.el5
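The same check can be scripted so that it names whichever package is missing. This is a minimal sketch: missing_ocfs2_packages is a hypothetical helper, and it assumes each package name is followed by a hyphen and a version that starts with a digit, as in the listing above.

```shell
# Hypothetical helper: read a package list on standard input and print
# each required OCFS2 component that is absent from it.
# "ocfs2" is the kernel driver, e.g. ocfs2-2.6.18-128.el5-1.2.9-1.el5.
missing_ocfs2_packages() {
    list=$(cat)
    for name in ocfs2 ocfs2-tools ocfs2console; do
        echo "$list" | grep -q "^${name}-[0-9]" || echo "$name"
    done
}

# Example:
#   rpm -qa | missing_ocfs2_packages
```

No output means all three components were found.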
Disable SELinux (RHEL4 U2 and higher)
Users of RHEL4 U2 and higher (Oracle Enterprise Linux 5.3 is based on RHEL 5.3) are advised that OCFS2
currently does not work with SELinux enabled. If you are using RHEL4 U2 or higher
(which includes us since we are using Oracle Enterprise Linux 5.3) you will need to
verify SELinux is disabled in order for the O2CB service to execute.
During the installation of Oracle Enterprise Linux, we disabled SELinux on the
SELinux screen. If, however, you did not disable SELinux during the installation phase,
you can use the tool system-config-securitylevel to disable SELinux.
To disable SELinux (or verify SELinux is disabled), run the "Security Level
Configuration" GUI utility:
# /usr/bin/system-config-securitylevel &
This will bring up the following screen:

Figure 16: Security Level Configuration Opening Screen / Firewall Disabled
Now, click the SELinux tab and select the "Disabled" option.
After clicking [OK], you will be presented with
a warning dialog. Simply acknowledge this warning by clicking "Yes".
Your screen should now look like the following after disabling the
SELinux option:

Figure 17: SELinux Disabled
If you needed to disable SELinux in this section on any of the nodes,
those nodes will need to be rebooted to implement the change. SELinux must be disabled
before you can continue with configuring OCFS2!
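If no X display is available, the SELinux state can also be checked from the command line. The sketch below is an alternative to the GUI steps and is not part of the original procedure; selinux_config_mode is a hypothetical helper, and /etc/selinux/config is assumed to be the configuration file, as on Red Hat style systems.

```shell
# Hypothetical helper: print the SELINUX= setting from a config file.
# With no argument, the standard path is used.
selinux_config_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

# Example:
#   getenforce             # mode of the running system
#   selinux_config_mode    # mode that applies after the next reboot
```

Both should report that SELinux is disabled before you continue.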
Configure OCFS2
OCFS2 will be configured to use the private network (192.168.2.0)
for all of its network traffic as recommended by Oracle. While OCFS2
does not take much bandwidth, it does require the nodes to be alive on the
network and sends regular keepalive packets to ensure that they are. To
avoid a network delay being interpreted as a node disappearing on the net
which could lead to a node-self-fencing, a private interconnect is recommended.
It is safe to use the same private interconnect for both Oracle RAC and OCFS2.
A popular question then is what node name should be used and should it
be related to the IP address? The node name needs to match the hostname of
the machine. The IP address need not be the one associated with that hostname. In other
words, any valid IP address on that node can be used. OCFS2 will not attempt
to match the node name (hostname) with the specified IP address.
The next step is to generate and configure the /etc/ocfs2/cluster.conf
file on both Oracle RAC nodes in the cluster. The easiest way to accomplish this is
to run the GUI tool ocfs2console. In this
section, we will not only create and configure the /etc/ocfs2/cluster.conf
file using ocfs2console, but will also create and
start the cluster stack O2CB. When the /etc/ocfs2/cluster.conf
file is not present, (as will be the case in our example), the ocfs2console
tool will create this file along with a new cluster stack service
(O2CB) with a default cluster name of ocfs2. This
will need to be done on both Oracle RAC nodes in the cluster as
the root user account:
$ su -
# ocfs2console &
This will bring up the GUI as shown below:

Figure 18: ocfs2console GUI
Using the ocfs2console GUI tool, perform the following steps:
- Select [Cluster] -> [Configure Nodes...].
This will start the OCFS2 Cluster Stack (Figure 19)
and bring up the "Node Configuration" dialog.
- On the "Node Configuration" dialog, click the [Add] button.
- This will bring up the "Add Node" dialog.
- In the "Add Node" dialog, enter the
Host name and IP address for the first node in
the cluster. Leave the IP Port set to its default
value of 7777. In my example, I added both nodes using linux1
/ 192.168.2.100 for the first node and linux2
/ 192.168.2.101 for the second node.
Note: The node name you enter "must" match the hostname of the machine
and the IP addresses will use the private interconnect.
- Click [Apply] on the "Node Configuration" dialog
- All nodes should now be "Active" as shown in
Figure 20.
- Click [Close] on the "Node Configuration" dialog.
- After verifying all values are correct, exit the
application using [File] -> [Quit]. This
needs to be performed on both Oracle RAC nodes in the cluster.

Figure 19: Starting the OCFS2 Cluster Stack
The following dialog shows the OCFS2 settings I used for
the nodes linux1 and linux2:

Figure 20: Configuring Nodes for OCFS2
Note: See the
Troubleshooting
section if you get the error:
o2cb_ctl: Unable to access cluster service while creating node
After exiting the ocfs2console,
you will have a /etc/ocfs2/cluster.conf similar
to the following. This process needs to be completed on both Oracle RAC nodes in
the cluster and the OCFS2 configuration file should be exactly the same
for all of the nodes:
node:
	ip_port = 7777
	ip_address = 192.168.2.100
	number = 0
	name = linux1
	cluster = ocfs2

node:
	ip_port = 7777
	ip_address = 192.168.2.101
	number = 1
	name = linux2
	cluster = ocfs2

cluster:
	node_count = 2
	name = ocfs2
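A quick consistency check on this file is to confirm that the declared node_count equals the number of node stanzas. The following is a minimal sketch; cluster_conf_consistent is a hypothetical helper name, and it checks nothing else about the file.

```shell
# Hypothetical helper: succeed only if the number of "node:" stanzas
# equals the node_count declared in the "cluster:" stanza.
cluster_conf_consistent() {
    awk '
        /^node:/           { nodes++ }
        $1 == "node_count" { declared = $3 }
        END { exit ((nodes + 0 == declared + 0) ? 0 : 1) }
    ' "$1"
}

# Example:
#   cluster_conf_consistent /etc/ocfs2/cluster.conf && echo "node_count matches"
```

To confirm the file is the same on both nodes, comparing checksums (for example with md5sum) on linux1 and linux2 is one simple option.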
O2CB Cluster Service
Before we can do anything with OCFS2 like formatting or
mounting the file system, we need to first have OCFS2's cluster stack,
O2CB, running (which it will be as a result of the
configuration process performed above). The stack includes
the following services:
- NM: Node Manager that keeps track of all the nodes in the cluster.conf
- HB: Heartbeat service that issues up/down notifications when nodes join or leave the cluster
- TCP: Handles communication between the nodes
- DLM: Distributed lock manager that keeps track of all locks, their owners, and their status
- CONFIGFS: User space driven configuration file system mounted at /sys/kernel/config
- DLMFS: User space interface to the kernel space DLM
All of the above cluster services have been packaged in
the o2cb system service (/etc/init.d/o2cb).
Here is a short listing of some of the more useful commands and options
for the o2cb system service.
Note: The following commands are for
documentation purposes only and do not need to be run when installing and configuring OCFS2
for this article!
- /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold: 31
Network idle timeout: 30000
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Not active
- /etc/init.d/o2cb offline ocfs2
Stopping O2CB cluster ocfs2: OK
The above command will offline the cluster we created, ocfs2.
- /etc/init.d/o2cb unload
Unmounting ocfs2_dlmfs filesystem: OK
Unloading module "ocfs2_dlmfs": OK
Unmounting configfs filesystem: OK
Unloading module "configfs": OK
The above command will unload all OCFS2 modules.
- /etc/init.d/o2cb load
Loading module "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Loads all OCFS2 modules.
- /etc/init.d/o2cb online ocfs2
Starting O2CB cluster ocfs2: OK
The above command will online the cluster we created, ocfs2.
Configure O2CB to Start on Boot and Adjust O2CB Heartbeat Threshold
You now need to configure the on-boot properties of the
O2CB driver so that the cluster stack services will start on each boot.
You will also be adjusting
the OCFS2 Heartbeat Threshold from its default setting of 31 to 61.
Perform the following on both Oracle RAC nodes in the cluster:
# /etc/init.d/o2cb offline ocfs2
# /etc/init.d/o2cb unload
# /etc/init.d/o2cb configure
Configuring the O2CB driver.
This will configure the on-boot properties of the O2CB driver.
The following questions will determine whether the driver is loaded on
boot. The current values will be shown in brackets ('[]'). Hitting
<ENTER> without typing an answer will keep that current value. Ctrl-C
will abort.
Load O2CB driver on boot (y/n) [n]: y
Cluster to start on boot (Enter "none" to clear) [ocfs2]: ocfs2
Specify heartbeat dead threshold (>=7) [31]: 61
Specify network idle timeout in ms (>=5000) [30000]: 30000
Specify network keepalive delay in ms (>=1000) [2000]: 2000
Specify network reconnect delay in ms (>=2000) [2000]: 2000
Writing O2CB configuration: OK
Loading module "configfs": OK
Mounting configfs filesystem at /sys/kernel/config: OK
Loading module "ocfs2_nodemanager": OK
Loading module "ocfs2_dlm": OK
Loading module "ocfs2_dlmfs": OK
Mounting ocfs2_dlmfs filesystem at /dlm: OK
Starting O2CB cluster ocfs2: OK
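The heartbeat dead threshold is a count of heartbeat iterations rather than a number of seconds. Assuming the two-second heartbeat interval described in the OCFS2 documentation, the time before a node is declared dead is (threshold - 1) * 2 seconds. The sketch below works through that arithmetic; the function names are hypothetical.

```shell
# Seconds before a node is declared dead for a given threshold,
# assuming a two-second heartbeat interval.
threshold_to_seconds() {
    echo $(( ($1 - 1) * 2 ))
}

# Threshold needed for a desired timeout in seconds (same assumption).
seconds_to_threshold() {
    echo $(( $1 / 2 + 1 ))
}

echo "default threshold 31 -> $(threshold_to_seconds 31) seconds"
echo "new threshold 61 -> $(threshold_to_seconds 61) seconds"
```

Under that assumption, raising the threshold from 31 to 61 doubles the timeout from 60 to 120 seconds.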
Format the OCFS2 Filesystem
Note: Unlike the other tasks in this section, creating the
OCFS2 file system should only be executed on one of the nodes in the RAC cluster.
I will be executing all commands in this section from linux1 only.
We can now start to make use of the iSCSI volume we partitioned for OCFS2 in the section
"Create Partitions on iSCSI Volumes".
If the O2CB cluster is offline, start it. The format
operation needs the cluster to be online, as it needs to ensure that
the volume is not mounted on some other node in the cluster.
Earlier in this document, we created the directory /u02
under the section
Create Mount Point for OCFS2 / Clusterware
which will be used as the mount point for the OCFS2 cluster file system.
This section contains the
commands to create and mount the file system to be used for the Cluster
Manager.
Note that it is possible to create and mount the OCFS2
file system using either the GUI tool ocfs2console
or the command-line tool mkfs.ocfs2. From the ocfs2console
utility, use the menu [Tasks] - [Format].
The instructions below demonstrate how to create the OCFS2
file system using the command-line tool mkfs.ocfs2.
To create the file system, we can use the Oracle
executable mkfs.ocfs2. For the purpose
of this example, I run the following command only from
linux1 as the root user account, using
the local SCSI device name mapped to the iSCSI volume
for crs (/dev/iscsi/crs/part1). Also note that I specified
a label named "oracrsfiles" which will be
referred to when mounting or un-mounting the volume:
$ su -
# mkfs.ocfs2 -b 4K -C 32K -N 4 -L oracrsfiles /dev/iscsi/crs/part1
mkfs.ocfs2 1.2.7
Filesystem label=oracrsfiles
Block size=4096 (bits=12)
Cluster size=32768 (bits=15)
Volume size=2145943552 (65489 clusters) (523912 blocks)
3 cluster groups (tail covers 977 clusters, rest cover 32256 clusters)
Journal size=67108864
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 1 block(s)
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful
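The numbers in the mkfs.ocfs2 output can be cross-checked with a little arithmetic, which is a useful sanity test that the -b and -C options were applied as intended. The sketch below simply copies the values from the output above.

```shell
block_size=4096        # -b 4K
cluster_size=32768     # -C 32K
clusters=65489         # from "Volume size" line
blocks=523912          # from "Volume size" line

# Both products should equal the reported volume size in bytes.
echo "clusters x cluster size = $(( clusters * cluster_size ))"
echo "blocks x block size = $(( blocks * block_size ))"

# Two full cluster groups plus the tail group should equal the cluster count.
echo "clusters in groups = $(( 2 * 32256 + 977 ))"
```

Both products come to 2145943552 bytes, matching the reported volume size, and the three cluster groups account for all 65489 clusters.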
Mount the OCFS2 Filesystem
Now that the file system is created, we can mount it.
Let's first do it using the command-line, then I'll show
how to include it in the /etc/fstab to have it mount
on each boot.
Note: Mounting the cluster file system will need to be performed
on both Oracle RAC nodes in the cluster as the root
user account using the OCFS2 label oracrsfiles!
First, here is how to manually mount the OCFS2 file system from the
command-line. Remember that this needs
to be performed as the root user account:
$ su -
# mount -t ocfs2 -o datavolume,nointr -L "oracrsfiles" /u02
If the mount was successful, you will simply
get your prompt back. We should, however, run the following
checks to ensure the file system is mounted correctly.
Use the mount command to ensure that
the new file system is really mounted. This should be performed
on both nodes in the RAC cluster:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
domo:Public on /domo type nfs (rw,addr=192.168.1.121)
configfs on /sys/kernel/config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
/dev/sde1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
Please take note of the datavolume option I am using to mount
the new file system. Oracle database users must mount any volume
that will contain the Voting Disk file, Cluster Registry (OCR), Data files,
Redo logs, Archive logs and Control files with the datavolume mount
option so as to ensure that the Oracle processes open the files with the O_DIRECT
flag. The nointr option ensures that the I/O's are not interrupted by signals.
Any other type of volume, including an Oracle home (which I will not be using for this article),
should not be mounted with this mount option.
Why does it take so much time to mount the volume? It takes around 5
seconds for a volume to mount. It does so as
to let the heartbeat thread stabilize. In a later release, Oracle
plans to add support for a global heartbeat, which will make most
mounts instant.
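Rather than scanning the mount listing by eye on each node, the check for the mount point and its options can be scripted. This is a minimal sketch; ocfs2_mounted_with is a hypothetical helper that expects the output format of the mount command shown above.

```shell
# Hypothetical helper: read "mount" output on standard input and succeed
# only if the given mount point is an ocfs2 file system mounted with
# the given option.
ocfs2_mounted_with() {
    awk -v mp="$1" -v opt="$2" '
        $3 == mp && $5 == "ocfs2" {
            gsub(/[()]/, "", $6)
            n = split($6, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == opt) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example:
#   mount | ocfs2_mounted_with /u02 datavolume && echo "/u02 mounted with datavolume"
```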
Configure OCFS2 to Mount Automatically at Startup
Let's take a look at what you have done so far. You installed the OCFS2
software packages; the OCFS2 file system will be used to store the shared files needed by Cluster Manager.
After going through the install, you loaded the OCFS2 module into the kernel and
then formatted the clustered file system. Finally, you mounted the newly created
file system using the OCFS2 label "oracrsfiles".
This section walks through the steps for mounting the new OCFS2 file system by its label
each time the machines are booted.
Start by adding the following line to the /etc/fstab
file on both Oracle RAC nodes in the cluster:
LABEL=oracrsfiles /u02 ocfs2 _netdev,datavolume,nointr 0 0
Notice the "_netdev" option for mounting this file system.
The _netdev mount option is a must for OCFS2 volumes. This
mount option indicates that the volume is to be mounted after the
network is started and dismounted before the network is
shutdown.
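Because a missing _netdev option only shows up at the next boot, it can be worth verifying the fstab entry right after editing it. The following is a minimal sketch; fstab_has_option is a hypothetical helper and assumes one entry per mount point.

```shell
# Hypothetical helper: succeed if the fstab entry for a mount point
# lists the given option in its fourth field.
# Usage: fstab_has_option FILE MOUNTPOINT OPTION
fstab_has_option() {
    opts=$(awk -v mp="$2" '!/^[ \t]*#/ && $2 == mp { print $4 }' "$1")
    case ",${opts}," in
        *",$3,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example:
#   fstab_has_option /etc/fstab /u02 _netdev || echo "_netdev is missing for /u02"
```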
Now, let's make sure that the ocfs2.ko
kernel module is being loaded and that the file system will be mounted
during the boot process.
If you have been following along with the examples in
this article, the actions to load the kernel module and mount the OCFS2
file system should already be enabled. However, you should still check
those options by running the following on both Oracle RAC nodes in the
cluster as the root user account:
$ su -
# chkconfig --list o2cb
o2cb 0:off 1:off 2:on 3:on 4:on 5:on 6:off
The flags that I have marked in bold should be set
to "on".
Check Permissions on New OCFS2 Filesystem
Use the ls command to check
ownership. The permissions should be set to 0775 with owner "oracle"
and group "oinstall".
The following tasks only need to be executed on one of the nodes in the RAC cluster.
I will be executing all commands in this section from linux1 only.
Let's first check the permissions:
# ls -ld /u02
drwxr-xr-x 3 root root 4096 Jul 31 17:21 /u02
As you can see from the listing above, the oracle
user account (and the oinstall group) will not be able
to write to this directory. Let's fix that:
# chown oracle:oinstall /u02
# chmod 775 /u02
Let's now go back and re-check that the permissions are correct for
both Oracle RAC nodes in the cluster:
# ls -ld /u02
drwxrwxr-x 3 oracle oinstall 4096 Jul 31 17:21 /u02
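The re-check on both nodes can be reduced to a pass/fail test. This is a minimal sketch that reads the same ls -ld output used above; dir_mode_is_775 and dir_owner are hypothetical helper names.

```shell
# Hypothetical helper: succeed if the directory's mode string is
# drwxrwxr-x (0775).
dir_mode_is_775() {
    mode=$(ls -ld "$1" | awk '{ print $1 }')
    case "$mode" in
        drwxrwxr-x*) return 0 ;;   # a trailing "." or "+" may follow the mode
        *)           return 1 ;;
    esac
}

# Hypothetical helper: print the directory's owner and group as owner:group.
dir_owner() {
    ls -ld "$1" | awk '{ print $3 ":" $4 }'
}

# Example:
#   dir_mode_is_775 /u02 && [ "$(dir_owner /u02)" = "oracle:oinstall" ] && echo "/u02 OK"
```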
Create Directory for Oracle Clusterware Files
The last mandatory task is to create the appropriate directory
on the new OCFS2 file system that will be used for the Oracle Clusterware
shared files. We will also modify the permissions of this new directory
to allow the "oracle" owner and group "oinstall" read/write
access.
The following tasks only need to be executed on one of the nodes in the RAC cluster.
I will be executing all commands in this section from linux1 only.
# mkdir -p /u02/oradata/racdb
# chown -R oracle:oinstall /u02/oradata
# chmod -R 775 /u02/oradata
# ls -l /u02/oradata
total 4
drwxrwxr-x 2 oracle oinstall 4096 Jul 31 17:31 racdb
Reboot Both Nodes
Before starting the next section, this would be a good
place to reboot both of the nodes in the RAC cluster. When the
machines come up, ensure that the cluster stack services are being
loaded and the new OCFS2 file system is being mounted:
# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
configfs on /sys/kernel/config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)
cartman:SHARE2 on /cartman type nfs (rw,addr=192.168.1.120)
/dev/sdc1 on /u02 type ocfs2 (rw,_netdev,datavolume,nointr,heartbeat=local)
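Rather than scanning the mount listing by eye after every reboot, the check can be sketched as a small filter. The name has_ocfs2_mount is hypothetical, and the function assumes the "device on mountpoint type fstype (options)" line format shown above.

```shell
# has_ocfs2_mount MOUNTPOINT
# Reads `mount`-style lines on stdin and succeeds if MOUNTPOINT is listed
# with file system type ocfs2.
# (Hypothetical helper; assumes mount points without embedded spaces.)
has_ocfs2_mount() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $4 == "type" && $5 == "ocfs2" { found = 1 }
        END { exit !found }'
}
```

For example, `mount | has_ocfs2_mount /u02 || echo "/u02 is not mounted as ocfs2"`.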
If you modified the O2CB heartbeat threshold, you should verify that it is
set correctly:
# cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold
61
How to Determine OCFS2 Version
To determine which version of OCFS2 is running, use:
# cat /proc/fs/ocfs2/version
OCFS2 1.2.9 Wed Jan 21 21:32:59 EST 2009 (build 5e8325ec7f66b5189c65c7a8710fe8cb)
18.
Install & Configure Automatic Storage Management (ASMLib 2.0)
Most of the installation and configuration procedures in this section should be
performed on both of the Oracle RAC nodes in the cluster! Creating the ASM disks, however, will
only need to be performed on a single node within the cluster.
In this section, we will install and configure ASMLib 2.0 which will be used by
Automatic Storage Management (ASM). In this article, we will use ASM
as the shared file system and volume manager for
all Oracle physical database files (data, online redo logs, control files,
archived redo logs) as well as a Flash Recovery Area.
ASM was introduced in Oracle Database 10g Release 1 and is used to relieve the DBA of
having to manage individual files and drives. ASM is built into the Oracle kernel and
provides the DBA with a way to manage thousands of disk drives 24x7 for both
single and clustered instances of Oracle. All of the files and directories to be used
for Oracle will be contained in a disk group. ASM automatically performs
load balancing in parallel across all available disk drives to prevent hot spots and
maximize performance, even with rapidly changing data usage patterns.
There are two different methods to configure ASM on Linux:
- ASM with ASMLib I/O: This method creates all Oracle database files on raw block
devices managed by ASM using ASMLib calls. RAW devices are not required with this
method as ASMLib works with block devices.
- ASM with Standard Linux I/O: This method creates all Oracle database files on raw
character devices managed by ASM using standard Linux I/O system calls. You will be
required to create RAW devices for all disk partitions used by ASM.
In this article, I will be using the "ASM with ASMLib I/O" method. Oracle states
in Metalink Note 275315.1 that "ASMLib was provided to enable ASM I/O to Linux
disks without the limitations of the standard UNIX I/O API". I plan on performing
several tests in the future to identify the performance gains in using ASMLib. Those
performance metrics and testing details are out of scope of this article and therefore
will not be discussed.
If you would like to learn more about Oracle ASMLib 2.0, visit
http://www.oracle.com/technology/tech/linux/asmlib/
Install ASMLib 2.0 Packages
In previous editions of this article, this would be the time where you would need to download
the ASMLib 2.0 software from
Oracle ASMLib Downloads for Red Hat Enterprise Linux Server 5.
This is no longer necessary since the ASMLib software is included with Oracle Enterprise Linux
(with the exception of the Userspace Library which is a separate download).
The ASMLib 2.0
software stack includes the following packages:
32-bit (x86) Installations
- ASMLib Kernel Driver
- oracleasm-x.x.x-x.el5-x.x.x-x.el5.i686.rpm - (for default kernel)
- oracleasm-x.x.x-x.el5PAE-x.x.x-x.el5.i686.rpm - (for PAE kernel)
- oracleasm-x.x.x-x.el5xen-x.x.x-x.el5.i686.rpm - (for xen kernel)
- Userspace Library
- oracleasmlib-x.x.x-x.el5.i386.rpm
- Driver Support Files
- oracleasm-support-x.x.x-x.el5.i386.rpm
64-bit (x86_64) Installations
- ASMLib Kernel Driver
- oracleasm-x.x.x-x.el5-x.x.x-x.el5.x86_64.rpm - (for default kernel)
- oracleasm-x.x.x-x.el5PAE-x.x.x-x.el5.x86_64.rpm - (for PAE kernel)
- oracleasm-x.x.x-x.el5xen-x.x.x-x.el5.x86_64.rpm - (for xen kernel)
- Userspace Library
- oracleasmlib-x.x.x-x.el5.x86_64.rpm
- Driver Support Files
- oracleasm-support-x.x.x-x.el5.x86_64.rpm
With Oracle Enterprise Linux 5.3, the ASMLib 2.0 software packages do not get
installed by default. The ASMLib 2.0 Kernel Drivers and Driver Support Files
can be found on CD #3. The Userspace Library will need to be downloaded
as it is not included with Oracle Enterprise Linux. To determine if
the Oracle ASMLib packages are installed (which in most cases, they will not be),
perform the following on both Oracle RAC nodes:
# rpm -qa | grep oracleasm | sort
If the Oracle ASMLib 2.0 packages are not installed, load the Oracle Enterprise Linux CD #3
into each of the Oracle RAC nodes and perform the following:
From Oracle Enterprise Linux 5 - [CD #3]
# mount -r /dev/cdrom /media/cdrom
# cd /media/cdrom/Server
# rpm -Uvh oracleasm-support-2.1.2-1.el5.i386.rpm
# rpm -Uvh oracleasm-2.6.18-128.el5-2.0.5-1.el5.i686.rpm
# cd /
# eject
After installing the ASMLib packages, verify from both Oracle
RAC nodes that the software is installed:
# rpm -qa | grep oracleasm | sort
oracleasm-2.6.18-128.el5-2.0.5-1.el5
oracleasm-support-2.1.2-1.el5
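Note that the kernel driver package name embeds the kernel release it was built for (2.6.18-128.el5 in the listing above). As a rough sketch, assuming that naming pattern holds for your packages, the hypothetical helper below checks that an installed driver matches a given kernel release.

```shell
# asm_driver_for_kernel KERNEL_RELEASE
# Reads package names (one per line, as printed by `rpm -qa`) on stdin and
# succeeds if one of them starts with "oracleasm-<KERNEL_RELEASE>-".
# (Hypothetical helper; assumes the package naming shown in this article.)
asm_driver_for_kernel() {
    awk -v prefix="oracleasm-$1-" '
        index($0, prefix) == 1 { found = 1 }
        END { exit !found }'
}
```

For example, `rpm -qa | asm_driver_for_kernel "$(uname -r)" || echo "no ASMLib driver for the running kernel"`.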
Getting Oracle ASMLib
As mentioned in the previous section, the ASMLib 2.0 software is included with Oracle Enterprise Linux
with the exception of the Userspace Library (a.k.a. the ASMLib support library). The Userspace Library
is required and can be downloaded for free from the Oracle ASMLib Downloads for Red Hat Enterprise Linux Server 5 page.
After downloading the Userspace Library to both Oracle RAC nodes in the cluster, install it using
the following:
# rpm -Uvh oracleasmlib-2.0.4-1.el5.i386.rpm
Preparing... ########################################### [100%]
1:oracleasmlib ########################################### [100%]
For information on obtaining the ASMLib support library through the Unbreakable Linux Network
(which is not a requirement for this article), please
visit Getting Oracle ASMLib via the Unbreakable Linux Network.
Configuring and Loading the ASMLib 2.0 Packages
Now that you have installed the ASMLib
Packages for Linux, you need to configure and load the ASM kernel
module. This task needs to be run on both Oracle RAC nodes
as the root user account:
$ su -
# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: done
Initializing the Oracle ASMLib driver: [ OK ]
Scanning the system for Oracle ASMLib disks: [ OK ]
Create ASM Disks for Oracle
Creating the ASM disks only needs to be performed from one node in the
RAC cluster as the root user account. I will be running these commands on
linux1. On the other Oracle RAC node, you will need to
run oracleasm scandisks to recognize the new volumes. When
that is complete, you should then run the
oracleasm listdisks command on both Oracle RAC nodes
to verify that all ASM disks were created and are
available.
In the section
"Create Partitions on iSCSI Volumes",
we configured (partitioned) four iSCSI volumes to be used by ASM. ASM will be used
for storing Oracle database files like online redo logs, database files, control files,
archived redo log files, and the flash recovery area. Use the local device names that were created by
udev when configuring the four ASM volumes.
Note: If you are repeating this article using the same hardware (actually, the same shared logical drives),
you may get a failure when attempting to create the ASM disks. If you
do receive a failure, try listing all ASM disks that were used by the previous install using:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
As you can see, the results show that I have four ASM volumes
already defined. If you have the four volumes already defined
from a previous run, go ahead and remove them using the
following commands. After removing the previously created volumes, use the "oracleasm createdisk" commands (below)
to create the new volumes.
# /etc/init.d/oracleasm deletedisk VOL1
Removing ASM disk "VOL1" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL2
Removing ASM disk "VOL2" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL3
Removing ASM disk "VOL3" [ OK ]
# /etc/init.d/oracleasm deletedisk VOL4
Removing ASM disk "VOL4" [ OK ]
To create the ASM disks using the iSCSI target name to local device name
mappings, type the following:
$ su -
# /etc/init.d/oracleasm createdisk VOL1 /dev/iscsi/asm1/part1
Marking disk "/dev/iscsi/asm1/part1" as an ASM disk [ OK ]
# /etc/init.d/oracleasm createdisk VOL2 /dev/iscsi/asm2/part1
Marking disk "/dev/iscsi/asm2/part1" as an ASM disk [ OK ]
# /etc/init.d/oracleasm createdisk VOL3 /dev/iscsi/asm3/part1
Marking disk "/dev/iscsi/asm3/part1" as an ASM disk [ OK ]
# /etc/init.d/oracleasm createdisk VOL4 /dev/iscsi/asm4/part1
Marking disk "/dev/iscsi/asm4/part1" as an ASM disk [ OK ]
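Since the four commands above differ only by a number, they can be generated rather than typed. This is just a sketch: print_createdisk_cmds is a hypothetical name, and it assumes the VOLn to /dev/iscsi/asmN/part1 naming used in this article. It only prints the commands so you can review them first.

```shell
# print_createdisk_cmds COUNT
# Prints (does not run) one createdisk command per volume, following the
# VOLn -> /dev/iscsi/asmN/part1 naming used in this article.
# (Hypothetical helper; review the output before running it as root.)
print_createdisk_cmds() {
    local i=1
    while [ "$i" -le "$1" ]; do
        echo "/etc/init.d/oracleasm createdisk VOL$i /dev/iscsi/asm$i/part1"
        i=$((i + 1))
    done
}
```

`print_createdisk_cmds 4` prints the four createdisk commands shown above. Once you are happy with them, the output could be piped to sh as the root user account.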
On all other nodes in the RAC cluster, you must run
scandisks to recognize the new volumes:
# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]
We can now test that the ASM disks were successfully created by
using the following command on both nodes in the RAC cluster
as the root user account:
# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
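To make this verification repeatable on both nodes, the comparison against the expected volume names can be sketched as follows. The name missing_asm_disks is hypothetical; the function only reads text in the one-name-per-line format that listdisks prints above.

```shell
# missing_asm_disks NAME...
# Reads `oracleasm listdisks` output on stdin, prints each expected NAME that
# is absent, and returns non-zero if any were missing.
# (Hypothetical helper; it does not call oracleasm itself.)
missing_asm_disks() {
    local listed name missing=0
    listed=$(cat)
    for name in "$@"; do
        if ! printf '%s\n' "$listed" | grep -qxF -- "$name"; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `/etc/init.d/oracleasm listdisks | missing_asm_disks VOL1 VOL2 VOL3 VOL4` prints nothing and succeeds when all four volumes are visible on that node.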
19.
Download Oracle RAC 10g Software
The following download procedures only need to be performed on
one node in the cluster!
The next step is to download
and extract the required Oracle software packages from the
Oracle Technology Network (OTN):
Note: If you do not currently have an account with OTN, you
will need to create one. This is a FREE account!
Oracle offers a development and testing license free of charge. No support, however,
is provided and the license does not permit production use.
A full description of the license agreement is available on OTN.
32-bit (x86) Installations
http://www.oracle.com/technology/software/products/database/oracle10g/htdocs/10201linuxsoft.html
64-bit (x86_64) Installations
http://www.oracle.com/technology/software/products/database/oracle10g/htdocs/10201linx8664soft.html
You will be downloading and extracting the required
software from Oracle to only one of the Linux nodes in the
cluster—namely, linux1. You will
perform all Oracle software installs from this machine. The Oracle installer will copy
the required software packages to all other nodes in the RAC
configuration using the remote access method we
set up in Section 15 (Configure RAC Nodes for Remote Access using SSH).
Log in to the node from which you will be performing all of the Oracle installations (linux1)
as the "oracle" user account. In this example, you
will be downloading the required Oracle software to linux1
and saving it to /home/oracle/orainstall.
Downloading and Extracting the Software
First, download
the Oracle Clusterware Release 2 (10.2.0.1.0), Oracle Database 10g
Release 2 (10.2.0.1.0), and Oracle Database 10g
Companion CD Release 2 (10.2.0.1.0) software for either Linux x86 or Linux x86-64. All
downloads are available from the same page.
As the oracle user account,
extract the three packages you downloaded to a temporary directory. In
this example, we will use /home/oracle/orainstall.
Extract the Oracle Clusterware package as follows:
# su - oracle
$ mkdir -p /home/oracle/orainstall
$ cd /home/oracle/orainstall
$ unzip 10201_clusterware_linux32.zip
Then extract the Oracle Database Software:
$ cd /home/oracle/orainstall
$ unzip 10201_database_linux32.zip
Finally, extract the (optional) Oracle Companion CD Software:
$ cd /home/oracle/orainstall
$ unzip 10201_companion_linux32.zip
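Before extracting, it can help to confirm that all three downloads actually made it into the staging directory. A minimal sketch, where missing_archives is a hypothetical helper name and the file names are the 32-bit archives used in the commands above:

```shell
# missing_archives DIR FILE...
# Prints each FILE that does not exist in DIR and returns non-zero if any
# are missing.
# (Hypothetical helper; it only checks for presence, not for a complete download.)
missing_archives() {
    local dir=$1 f rc=0
    shift
    for f in "$@"; do
        [ -f "$dir/$f" ] || { echo "$f"; rc=1; }
    done
    return "$rc"
}
```

For example, `missing_archives /home/oracle/orainstall 10201_clusterware_linux32.zip 10201_database_linux32.zip 10201_companion_linux32.zip` prints the name of any archive that still needs to be downloaded.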
20.
Pre-Installation Tasks for Oracle Database 10g Release 2
Perform the following checks on both Oracle RAC nodes in the cluster!
Before installing the Oracle Clusterware and Oracle RAC software, it is
highly recommended to run the Cluster Verification Utility (CVU) to
verify the hardware and software configuration.
CVU is a command-line utility provided on the Oracle Clusterware installation media. It is
responsible for performing various system checks to assist you with
confirming the Oracle RAC nodes are properly
configured for Oracle Clusterware and Oracle Real Application Clusters
installation. The CVU only needs to be run
from the node you will be performing the Oracle installations from (linux1
in this article). Note that the CVU is also run automatically at the end of the Oracle Clusterware
installation as part of the Configuration Assistants process.
Prerequisites for Using Cluster Verification Utility
Install cvuqdisk RPM (Oracle Enterprise Linux and RHEL Users Only)
The first prerequisite for running the CVU pertains to users running
Oracle Enterprise Linux, Red Hat Linux, and SuSE.
If you are using any of the above listed operating systems, then you must
download and install the package cvuqdisk
to both of the Oracle RAC nodes in the cluster.
This means you will need to install the cvuqdisk RPM
to both linux1 and linux2. Without cvuqdisk, CVU will be
unable to discover shared disks and you will receive the error message
"Package cvuqdisk not installed" when you run CVU.
The cvuqdisk RPM can be found on the Oracle Clusterware installation
media in the rpm directory. For the purpose of this article, the Oracle Clusterware
media was extracted to the /home/oracle/orainstall/clusterware
directory on linux1.
Note that before installing the cvuqdisk RPM, we need to set
an environment variable named CVUQDISK_GRP to point to the group that will own
the cvuqdisk utility. The default group is oinstall which is the group
we are using for the oracle UNIX user account in this article.
Locate and copy the cvuqdisk RPM from
linux1 to linux2 as the "oracle" user account:
$ ssh linux2 "mkdir -p /home/oracle/orainstall/clusterware/rpm"
$ scp /home/oracle/orainstall/clusterware/rpm/cvuqdisk-1.0.1-1.rpm linux2:/home/oracle/orainstall/clusterware/rpm
Perform the following steps as the "root" user account on both
Oracle RAC nodes to install the cvuqdisk RPM:
$ su -
# cd /home/oracle/orainstall/clusterware/rpm
# CVUQDISK_GRP=oinstall; export CVUQDISK_GRP
# rpm -iv cvuqdisk-1.0.1-1.rpm
Preparing packages for installation...
cvuqdisk-1.0.1-1
# ls -l /usr/sbin/cvuqdisk
-rwsr-x--- 1 root oinstall 4168 Jun 2 2005 /usr/sbin/cvuqdisk
Verify Remote Access / User Equivalence
The CVU should be run from linux1, the node from which we will be performing
all of the Oracle installations. Before running CVU, log in as the
oracle user account and verify that remote access / user equivalence is configured
to all nodes in the cluster. When using the secure shell method, user equivalence
will need to be enabled for the terminal shell session before attempting to run the CVU.
To enable user equivalence for the current terminal shell session, perform the
following steps, remembering to enter the passphrase for each key that you generated
when prompted:
# su - oracle
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Enter passphrase for /home/oracle/.ssh/id_rsa: xxxxx
Identity added: /home/oracle/.ssh/id_rsa (/home/oracle/.ssh/id_rsa)
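Once the key has been added, one way to confirm that user equivalence really works is to run a harmless command on every node without allowing any prompt. The sketch below is an illustration only: check_user_equivalence and the SSH_CMD override are hypothetical names of my own, while BatchMode=yes is a standard OpenSSH option that makes ssh fail instead of asking for a password or passphrase.

```shell
# check_user_equivalence NODE...
# Runs `date` on each NODE over ssh with BatchMode=yes, so a node that would
# prompt for a password or passphrase is reported as FAILED instead.
# SSH_CMD is a hypothetical override hook that lets the function be exercised
# without real hosts; it is left unquoted on purpose so it splits into words.
check_user_equivalence() {
    local node rc=0
    for node in "$@"; do
        if ${SSH_CMD:-ssh -o BatchMode=yes} "$node" date >/dev/null 2>&1; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `check_user_equivalence linux1 linux2` run as the oracle user account from linux1 should report both nodes as ok before you start the CVU.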
Verifying Oracle Clusterware Requirements with CVU
Once all prerequisites for running the CVU have been met, we can now
check that all pre-installation tasks for Oracle Clusterware
are completed by executing the following command as the "oracle" UNIX user account
from linux1:
$ cd /home/oracle/orainstall/clusterware/cluvfy
$ mkdir -p jdk14
$ unzip jrepack.zip -d jdk14
$ CV_HOME=/home/oracle/orainstall/clusterware/cluvfy; export CV_HOME
$ CV_JDKHOME=/home/oracle/orainstall/clusterware/cluvfy/jdk14; export CV_JDKHOME
$ ./runcluvfy.sh stage -pre crsinst -n linux1,linux2 -verbose
Review the CVU report. Note that there are several errors you may
ignore in this report.
If your system only has 1GB of RAM, you may receive an error during
the "Total memory" check:
Check: Total memory
Node Name Available Required Comment
------------ ------------------------ ------------------------ ----------
linux2 1009.65MB (1033880KB) 1GB (1048576KB) failed
linux1 1009.65MB (1033880KB) 1GB (1048576KB) failed
Result: Total memory check failed.
As you can see from the output above, the
requirement is for 1GB of memory (1048576 KB). Although your system may have 1GB
of memory installed in each of the Oracle RAC nodes, the Linux kernel
reports only 1033880 KB, which is 14696 KB short of the requirement.
This is close enough that it is safe to continue with the installation.
As I mentioned earlier in this article, I highly recommend that both Oracle RAC nodes
have 2GB of RAM or more for performance reasons.
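The shortfall is simple subtraction: 1048576 KB required minus 1033880 KB reported leaves 14696 KB. As a minimal sketch, the hypothetical helper below (mem_shortfall_kb is my own name, not part of CVU) reproduces that arithmetic so you can compare your own nodes against the 1GB requirement.

```shell
# mem_shortfall_kb AVAILABLE_KB REQUIRED_KB
# Prints how many KB AVAILABLE_KB falls short of REQUIRED_KB, or 0 if the
# requirement is met.
# (Hypothetical helper; it reproduces the arithmetic only, not the CVU check.)
mem_shortfall_kb() {
    local diff=$(( $2 - $1 ))
    [ "$diff" -gt 0 ] || diff=0
    echo "$diff"
}
```

`mem_shortfall_kb 1033880 1048576` prints 14696, matching the figures in the report above. To test the local node, the first argument could be taken from the MemTotal line of /proc/meminfo, for example `mem_shortfall_kb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" 1048576`.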
The first error concerns finding a suitable set of interfaces for VIPs and
can be safely ignored. This is a bug documented in Metalink Note 338924.1:
Suitable interfaces for the private interconnect on subnet "192.168.2.0":
linux2 eth1:192.168.2.101
linux1 eth1:192.168.2.100
ERROR:
Could not find a suitable set of interfaces for VIPs.
Result: Node connectivity check failed.
As documented in the note, this error can be safely ignored.
The last set of errors that can be ignored deals with specific
RPM package versions that are not required with Oracle Enterprise Linux 5. For
example:
- compat-db-4.0.14-5
- compat-gcc-7.3-2.96.128
- compat-gcc-c++-7.3-2.96.128
- compat-libstdc++-7.3-2.96.128
- compat-libstdc++-devel-7.3-2.96.128
While these specific packages are listed as missing
in the CVU report, please ensure that the correct versions
of the compat-* packages are installed on both of
the Oracle RAC nodes in the cluster. For example, in Oracle Enterprise Linux 4 Update 5,
these would be:
- compat-gcc-32-3.2.3-47.3
- compat-gcc-32-c++-3.2.3-47.3
- compat-libstdc++-33-3.2.3-47.3
Checking the Hardware and Operating System Setup with CVU
The next CVU check to run will verify the hardware and operating system setup.
Again, run the following as the "oracle" UNIX user account from linux1:
$ cd /home/oracle/orainstall/clusterware/cluvfy
$ ./runcluvfy.sh stage -post hwos -n linux1,linux2 -verbose
Review the CVU report. As with the previous check
(pre-installation tasks for CRS),
the check for finding a suitable set of interfaces for VIPs will
fail and can be safely ignored.
Also note you may receive warnings in the
"Checking shared storage accessibility..." portion of the report:
Checking shared storage accessibility...
WARNING:
Unable to determine the sharedness of /dev/sde on nodes:
linux2,linux2,linux2,linux2,linux2,linux1,linux1,linux1,linux1,linux1
Shared storage check failed on nodes "linux2,linux1".
If this occurs, this too can be safely ignored.
While we know the disks are visible and shared from both of our Oracle RAC nodes in the
cluster, the check itself may fail. Several reasons for this have been
documented. The first came from Metalink indicating that
cluvfy currently does not work with devices other than SCSI
devices. This would include devices like EMC PowerPath and volume groups like
those from Openfiler. At the time of this writing, no workaround exists
other than to use manual methods for detecting shared devices. Another
reason for this error was documented by Bane Radulovic at Oracle.
His research shows that CVU calls smartctl on Linux, and the problem
is that smartctl does not return the serial number from our iSCSI devices.
For example, a check against /dev/sde shows:
# /usr/sbin/smartctl -i /dev/sde
smartctl version 5.33 [i686-redhat-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: Openfile Virtual disk Version: 0
Serial number:
Device type: disk
Local Time is: Mon Sep 3 02:02:53 2007 EDT
Device supports SMART and is Disabled
Temperature Warning Disabled or Not Supported
At the time of this writing, it is unknown if the Openfiler developers have
plans to fix this.
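If you want to see which of your own devices are affected, the empty "Serial number:" field can be detected with a small filter. This is a sketch only: smart_serial is a hypothetical name, and it assumes the smartctl -i output format shown above.

```shell
# smart_serial
# Reads `smartctl -i` output on stdin and prints the value of the
# "Serial number:" field (an empty line if the field has no value).
# (Hypothetical helper; assumes the field label shown in this article.)
smart_serial() {
    sed -n 's/^Serial [Nn]umber:[[:space:]]*//p'
}
```

For example, `[ -n "$(/usr/sbin/smartctl -i /dev/sde | smart_serial)" ] || echo "/dev/sde reports no serial number"` flags a device for which the CVU shared storage check is likely to produce the warning above.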