1. Introduction
One of the most
efficient ways to become familiar with Oracle Real Application Clusters
(RAC) 10g technology is to have access to an actual
Oracle RAC 10g cluster. There's no better way to
understand its benefits—including fault tolerance,
security, load balancing, and scalability—than to
experience them directly.
Unfortunately, for many
shops, the price of the hardware required for a typical production RAC
configuration makes this goal impossible. A small two-node cluster can
cost from US$10,000 to well over US$20,000. That cost would not even
include the heart of a production RAC environment—typically
a storage area network—which can start at US$10,000.
For those who want to become familiar
with Oracle RAC 10g without a major cash outlay,
this guide provides a low-cost alternative to configuring an Oracle RAC
10g Release 2 system using commercial off-the-shelf
components and downloadable software at an estimated cost of US$1,200
to US$1,800. The system involved comprises a dual-node cluster (each
with a single processor) running Linux (CentOS 4.2) with a shared disk storage based on IEEE1394 (FireWire)
drive technology.
This guide will mark
the last in a series to make use of FireWire technology as the shared
storage medium in order to build an inexpensive Oracle
RAC 10g system. The most current published version takes advantage of iSCSI;
more specifically, it explains how to build a network storage server using Openfiler. Powered
by rPath Linux,
Openfiler is a free browser-based network storage management utility
that delivers file-based Network Attached Storage (NAS) and block-based
Storage Area Networking (SAN) in a single framework. Openfiler supports
CIFS, NFS, HTTP/DAV, and FTP; however, I will only be making use of its
iSCSI capabilities to implement an inexpensive SAN for the shared
storage component required by Oracle RAC 10g.
Please note that this
is not the only way to build a low-cost Oracle RAC 10g system. I have
worked on other solutions that utilize an implementation based on SCSI
rather than FireWire for shared storage. In most cases, SCSI will cost
more than a FireWire solution; an inexpensive SCSI configuration
will consist of:
- SCSI Controller: Two SCSI controllers
priced from $20 (Adaptec AHA-2940UW) to $220 (Adaptec
39320A-R) each
- SCSI Enclosure: $70 - (Inclose
1 Bay 3.5" U320 SCSI Case)
- SCSI Hard Drive: $140 - (36GB
15K 68p U320 SCSI Hard Drive)
- SCSI Cables: Two SCSI cables priced at
$20 each - (3ft External HD68 to HD68 U320 Cable)
Keep in mind that some motherboards may already include
built-in SCSI controllers.
It is important to note
that this configuration should never
be run in a production environment and that it
is not supported by Oracle or any other
vendor. In a production environment, fibre
channel—the high-speed serial-transfer interface that can
connect
systems and storage devices in either point-to-point or switched
topologies—is the technology of choice. FireWire offers a
low-cost alternative to fibre channel for testing and development, but
it is not ready for production.
The Oracle9i
and Oracle 10g Release 1 guides used raw
partitions for storing files on shared storage, but here we will make
use of the Oracle Cluster File System Release 2 (OCFS2) and Oracle
Automatic Storage Management (ASM) feature. The two Linux servers will
be configured as follows:
| Oracle Database Files |
| RAC Node Name | Instance Name | Database Name | $ORACLE_BASE | File System / Volume Manager for DB Files |
| linux1 | orcl1 | orcl | /u01/app/oracle | ASM |
| linux2 | orcl2 | orcl | /u01/app/oracle | ASM |
| Oracle Clusterware Shared Files |
| File Type | File Name | Partition | Mount Point | File System |
| Oracle Cluster Registry | /u02/oradata/orcl/OCRFile | /dev/sda1 | /u02/oradata/orcl | OCFS2 |
| CRS Voting Disk | /u02/oradata/orcl/CSSFile | /dev/sda1 | /u02/oradata/orcl | OCFS2 |
Note that with Oracle
Database 10g Release 2 (10.2), Cluster Ready
Services, or CRS, is now called Oracle Clusterware.
Starting with Oracle Database 10g
Release 2 (10.2), Oracle Clusterware should be installed in a separate
Oracle Clusterware home directory which is non-release specific. This
is a change to the Optimal Flexible Architecture (OFA) rules. You
should not install Oracle Clusterware in a release-specific Oracle home
mount point (/u01/app/oracle/product/10.2.0/...,
for example), as succeeding versions of Oracle Clusterware will
overwrite the Oracle Clusterware installation in the same path. Also,
if Oracle Clusterware 10g Release 2 (10.2) detects
an existing Oracle Cluster Ready Services installation, then it
overwrites the existing installation in the same path.
The Oracle Clusterware software will
be installed to /u01/app/oracle/product/crs on
each of the nodes that make up the RAC cluster. However, the
Clusterware software requires that two of its files—the
Oracle Cluster Registry (OCR) file and the Voting Disk
file—be shared with all nodes in the cluster. These two files
will be installed on shared storage using OCFS2. It is possible (but
not recommended by Oracle) to use RAW devices for these files; however,
it is not possible to use ASM for these two Clusterware files.
The Oracle Database 10g
Release 2 software will be installed into a separate Oracle Home,
namely /u01/app/oracle/product/10.2.0/db_1, on
each of the nodes that make up the RAC cluster. All the Oracle physical
database files (data, online redo logs, control files, archived redo
logs) will be installed to different partitions of the shared drive
being managed by ASM. (The Oracle database files can just as easily be
stored on OCFS2. Using ASM, however, makes the article that much more
interesting!)
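The install locations described above can be summarized in a few shell variables. This is only a minimal sketch: the variable name ORA_CRS_HOME and the helper function are assumptions made for illustration, and nothing is created on disk.

```shell
# Sketch of the install locations described above (no directories are created).
ORACLE_BASE=/u01/app/oracle
# ORA_CRS_HOME is a variable name assumed for this sketch.
ORA_CRS_HOME=$ORACLE_BASE/product/crs
ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1

# Hypothetical helper: succeed when the given Clusterware home does NOT sit
# under a release-specific directory such as .../product/10.2.0/...
crs_home_is_release_neutral() {
    case "$1" in
        */product/[0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

if crs_home_is_release_neutral "$ORA_CRS_HOME"; then
    echo "OK: $ORA_CRS_HOME is not release specific"
else
    echo "WARNING: $ORA_CRS_HOME is release specific"
fi
```

The check is deliberately simple: it only looks for a version-like directory name directly under product, which is the layout this guide uses.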
Note: This article is only designed
to work as documented with absolutely no substitutions. Separate
versions of this guide cover Oracle RAC 10g Release 1 with RHEL 3,
RHEL 4 with iSCSI and Openfiler for shared storage, and
Oracle RAC 11g with iSCSI.
2. Oracle RAC 10g Overview
Oracle RAC, introduced
with Oracle9i, is the successor to Oracle Parallel
Server (OPS). RAC allows multiple instances to access the same database
(storage) simultaneously. It provides fault tolerance, load balancing,
and performance benefits by allowing the system to scale out, and at
the same time—because all nodes access the same
database—the failure of one instance will not cause the loss
of
access to the database.
At the heart of Oracle RAC is a
shared disk subsystem. All nodes in the cluster must be able to access
all of the data, redo log files, control files and parameter files for
all nodes in the cluster. The data disks must be globally available to
allow all nodes to access the database. Each node has its own redo log
files and undo tablespace, but the other nodes must be able to access them
(and the shared control files) in
order to recover that node in the event of a system failure.
One of the bigger differences between
Oracle RAC and OPS is the presence of Cache Fusion technology. In OPS,
a request for data between nodes required the data to be written to
disk first, and then the requesting node could read that data. With
Cache Fusion, data is passed along a high-speed interconnect using a
sophisticated locking algorithm.
Not all clustering solutions use
shared storage. Some vendors use an approach known as a federated
cluster, in which data is spread across several machines
rather than shared by all. With Oracle RAC 10g,
however, multiple nodes use the same set of disks for storing data.
With Oracle RAC, the data files, redo log files, control files, and
archived log files reside on shared storage on raw-disk devices, a NAS,
a SAN, ASM, or on a clustered file system. Oracle's approach to
clustering leverages the collective processing power of all the nodes
in the cluster and at the same time provides failover security.
Pre-configured Oracle RAC 10g
solutions are available from vendors such as Dell, IBM and HP for
production environments. This article, however, focuses on putting
together your own Oracle RAC 10g environment for
development and testing by using Linux servers and a low-cost shared
disk solution: FireWire.
For more background about Oracle RAC,
visit the Oracle RAC Product
Center on OTN.
3. Shared-Storage Overview
Fibre Channel is one of
the most popular solutions for shared storage. As I mentioned
previously, Fibre Channel is a high-speed serial-transfer interface
used to connect systems and storage devices in either point-to-point or
switched topologies. Protocols supported by Fibre Channel include SCSI
and IP.
Fibre Channel configurations can
support as many as 127 nodes and have a throughput of up to 2.12
gigabits per second. Fibre Channel, however, is very expensive; the
switch alone can start at US$1,000 and high-end drives can reach prices
of US$300. Overall, a typical Fibre Channel setup (including cards for
the servers) costs roughly US$10,000.
A less expensive alternative to Fibre
Channel is SCSI. SCSI technology provides acceptable performance for
shared storage, but for administrators and developers who are used to
GPL-based Linux prices, even SCSI can come in over budget at around
US$2,000 to US$5,000 for a two-node cluster.
Another popular solution is the Sun
NFS (Network File System) found on a NAS. It can be used for shared
storage but only if you are using a network appliance or something
similar. Specifically, you need servers that guarantee direct I/O over
NFS, TCP as the transport protocol, and read/write block sizes of 32K.
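As a rough illustration of the NFS requirements above, the following sketch prints an /etc/fstab-style line that requests TCP transport and 32K read/write sizes. The server name, export path, and helper function are hypothetical, and the option list is only an assumption about a reasonable starting point. Direct I/O support depends on the NFS server and is not expressed by these mount options.

```shell
# Print an /etc/fstab-style line for an NFS export using TCP and 32K block sizes.
# Arguments: <server:/export> <mount point>  (both hypothetical in the usage below)
nfs_fstab_line() {
    printf '%s %s nfs rw,hard,tcp,rsize=32768,wsize=32768 0 0\n' "$1" "$2"
}

# Example with a made-up NAS host and export path; nothing is mounted.
nfs_fstab_line nas1:/vol/oradata /u02/oradata
```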
4. FireWire Technology
Developed by Apple Computer and Texas Instruments, FireWire is a cross-platform
implementation of a high-speed serial data bus. With its high bandwidth, long
distances (up to 100 meters in length) and high-powered bus, FireWire is being
used in applications such as
digital video (DV), professional audio, hard drives, high-end digital still cameras
and home entertainment devices. Today, FireWire operates at transfer rates of up to
800 megabits per second, while next-generation FireWire calls for a theoretical
bit rate of 1600 Mbps and then up to a staggering 3200 Mbps. That's 3.2 gigabits per
second. This will make FireWire indispensable for transferring massive data files
and for even the most demanding video applications, such as working
with uncompressed high-definition (HD) video or multiple standard-definition (SD)
video streams.
The following chart shows speed comparisons of the various types of disk interfaces.
For each interface, I provide the maximum transfer rates in kilobits (Kb), kilobytes (KB),
megabits (Mb), megabytes (MB), gigabits (Gb), and gigabytes (GB) per second.
As you can see, the capabilities of IEEE1394 compare very favorably with
other disk interface and network technologies that are currently available.
| Disk Interface | Speed |
| | Kb | KB | Mb | MB | Gb | GB |
| Serial | 115 | 14.375 | 0.115 | 0.014 | | |
| Parallel (standard) | 920 | 115 | 0.92 | 0.115 | | |
| 10Base-T Ethernet | | | 10 | 1.25 | | |
| IEEE 802.11b wireless Wi-Fi (2.4 GHz band) | | | 11 | 1.375 | | |
| USB 1.1 | | | 12 | 1.5 | | |
| Parallel (ECP/EPP) | | | 24 | 3 | | |
| SCSI-1 | | | 40 | 5 | | |
| IEEE 802.11g wireless WLAN (2.4 GHz band) | | | 54 | 6.75 | | |
| SCSI-2 (Fast SCSI / Fast Narrow SCSI) | | | 80 | 10 | | |
| 100Base-T Ethernet (Fast Ethernet) | | | 100 | 12.5 | | |
| ATA/100 (parallel) | | | 100 | 12.5 | | |
| IDE | | | 133.6 | 16.7 | | |
| Fast Wide SCSI (Wide SCSI) | | | 160 | 20 | | |
| Ultra SCSI (SCSI-3 / Fast-20 / Ultra Narrow) | | | 160 | 20 | | |
| Ultra IDE | | | 264 | 33 | | |
| Wide Ultra SCSI (Fast Wide 20) | | | 320 | 40 | | |
| Ultra2 SCSI | | | 320 | 40 | | |
| FireWire 400 - (IEEE1394a) | | | 400 | 50 | | |
| USB 2.0 | | | 480 | 60 | | |
| Wide Ultra2 SCSI | | | 640 | 80 | | |
| Ultra3 SCSI | | | 640 | 80 | | |
| FireWire 800 - (IEEE1394b) | | | 800 | 100 | | |
| Gigabit Ethernet | | | 1000 | 125 | 1 | |
| PCI - (33 MHz / 32-bit) | | | 1064 | 133 | 1.064 | |
| Serial ATA I - (SATA I) | | | 1200 | 150 | 1.2 | |
| Wide Ultra3 SCSI | | | 1280 | 160 | 1.28 | |
| Ultra160 SCSI | | | 1280 | 160 | 1.28 | |
| PCI - (33 MHz / 64-bit) | | | 2128 | 266 | 2.128 | |
| PCI - (66 MHz / 32-bit) | | | 2128 | 266 | 2.128 | |
| AGP 1x - (66 MHz / 32-bit) | | | 2128 | 266 | 2.128 | |
| Serial ATA II - (SATA II) | | | 2400 | 300 | 2.4 | |
| Ultra320 SCSI | | | 2560 | 320 | 2.56 | |
| FC-AL Fibre Channel | | | 3200 | 400 | 3.2 | |
| PCI-Express x1 - (bidirectional) | | | 4000 | 500 | 4 | |
| PCI - (66 MHz / 64-bit) | | | 4256 | 532 | 4.256 | |
| AGP 2x - (133 MHz / 32-bit) | | | 4264 | 533 | 4.264 | |
| Serial ATA III - (SATA III) | | | 4800 | 600 | 4.8 | |
| PCI-X - (100 MHz / 64-bit) | | | 6400 | 800 | 6.4 | |
| PCI-X - (133 MHz / 64-bit) | | | | 1064 | 8.512 | 1 |
| AGP 4x - (266 MHz / 32-bit) | | | | 1066 | 8.528 | 1 |
| 10G Ethernet - (IEEE 802.3ae) | | | | 1250 | 10 | 1.25 |
| PCI-Express x4 - (bidirectional) | | | | 2000 | 16 | 2 |
| AGP 8x - (533 MHz / 32-bit) | | | | 2133 | 17.064 | 2.1 |
| PCI-Express x8 - (bidirectional) | | | | 4000 | 32 | 4 |
| PCI-Express x16 - (bidirectional) | | | | 8000 | 64 | 8 |
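The megabit and megabyte columns in the chart are related by a factor of eight (8 bits per byte). A small sketch of that conversion follows; the function name is made up for illustration, and it assumes an awk implementation is available on the system.

```shell
# Convert megabits per second to megabytes per second (8 bits per byte).
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 8 }'
}

mbit_to_mbyte 400    # FireWire 400     -> prints 50
mbit_to_mbyte 800    # FireWire 800     -> prints 100
mbit_to_mbyte 1000   # Gigabit Ethernet -> prints 125
```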
5. Hardware & Costs
The hardware used to build our
example Oracle10g RAC environment consists of two
Linux servers and components that can be purchased at any local
computer store or over the Internet.
| Server
1 - (linux1) |
Dimension 2400 Series
Intel Pentium 4 Processor at
2.80GHz
1GB DDR SDRAM (at 333MHz)
40GB 7200 RPM Internal Hard
Drive
Integrated Intel 3D AGP Graphics
Integrated 10/100 Ethernet
CDROM (48X Max Variable)
3.5" Floppy
No monitor (Already had one)
USB Mouse and Keyboard
|
US$620 |
| 1 - Ethernet LAN Cards
| |
Each Linux server should
contain two NIC adapters. The Dell Dimension includes an integrated
10/100 Ethernet adapter that will be used to connect to the public
network. The second NIC adapter will be used for the private
interconnect. |
|
US$20 |
| 1 - FireWire Card
| |
The following is a list
of FireWire I/O cards that contain the correct chipset, allow for
multiple logins, and should work with this article (no guarantees
however): |
FireWire 400
FireWire 800
|
Warning:
I was unable to obtain concurrent logins to the FireWire drive when
using the LaCie FireWire 800 PCI Card - (107755) in both nodes. To
resolve this, I used two different FireWire PCI cards: the LaCie
FireWire 800 PCI Card - (107755) in one node and the SIIG FireWire 800
PCI-32T host adapter - (NN-830112) in the second node. Please refer to
the section Troubleshooting
Concurrent Logins to the FireWire Drive for further details. |
|
FireWire I/O cards with
chipsets made by Texas Instruments (TI) or VIA Technologies (VIA) are
known to work. |
|
US$30 |
| Server
2 - (linux2) |
Dimension 2400 Series
Intel Pentium 4 Processor at
2.80GHz
1GB DDR SDRAM (at 333MHz)
40GB 7200 RPM Internal Hard
Drive
Integrated Intel 3D AGP
Graphics
Integrated 10/100 Ethernet
CDROM (48X Max Variable)
3.5" Floppy
No monitor (already had one)
USB Mouse and Keyboard
|
US$620 |
| 1 - Ethernet LAN Cards
| |
Each Linux server should
contain two NIC adapters. The Dell Dimension includes an integrated
10/100 Ethernet adapter that will be used to connect to the public
network. The second NIC adapter will be used for the private
interconnect. |
|
US$20 |
| 1 - FireWire Card
| |
The following is a list
of FireWire I/O cards that contain the correct chipset, allow for
multiple logins, and should work with this article (no guarantees
however): |
FireWire 400
FireWire 800
|
Warning:
I was unable to obtain concurrent logins to the FireWire drive when
using the LaCie FireWire 800 PCI Card - (107755) in both nodes. To
resolve this, I used two different FireWire PCI cards: the LaCie
FireWire 800 PCI Card - (107755) in one node and the SIIG FireWire 800
PCI-32T host adapter - (NN-830112) in the second node. Please refer to
the section Troubleshooting
Concurrent Logins to the FireWire Drive for further details. |
|
FireWire I/O cards with
chipsets made by Texas Instruments (TI) or VIA Technologies (VIA) are
known to work. |
|
US$30 |
| Miscellaneous Components |
FireWire Hard Drive
| |
The following is a list
of FireWire drives (and enclosures) that contain the correct chipset,
allow for multiple logins, and should work with this article (no
guarantees however): |
FireWire 400
FireWire 800
|
|
Ensure that the FireWire
drive that you purchase supports multiple logins. If the drive has a
chipset that does not allow for concurrent access from more than one
server, the disk and its partitions can only be seen by one server at a
time. Disks with the Oxford 911 chipset (FireWire 400), Oxford 912
chipset (FireWire 800), or Oxford 922 chipset (FireWire 800) are known
to work. Note that the Oxford 912 chipset is newer and faster than
Oxford 922. Here are the details about the disk that I purchased for
this test:
Vendor: Maxtor
Model: OneTouch II
Mfg. Part No. or KIT No.: E01G300
Capacity: 300 GB
Cache Buffer: 16 MB
Spin Rate: 7200 RPM
Interface Transfer Rate: 400 Mbits/s
"Combo" Interface: IEEE 1394 / USB 2.0 and USB 1.1 compatible |
|
US$280 |
| 1 - Extra FireWire Cable
Each node in the RAC configuration will need to connect to the shared
storage device (the FireWire hard drive). The FireWire hard drive will
come supplied with one FireWire cable. You will need to purchase one
additional FireWire cable to connect the second node to the shared
storage. Select the appropriate FireWire cable that is compatible with
the data transmission speed (FireWire 400 / FireWire 800) and the
desired cable length.
FireWire 400
FireWire 800
|
US$20 |
| 1 - Ethernet hub or switch
Used for the interconnect between linux1-priv and linux2-priv. A
question I often receive is about substituting the Ethernet switch
(used for interconnect linux1-priv / linux2-priv) with a crossover CAT5
cable. I would not recommend this. I have found that when using a
crossover CAT5 cable for the interconnect, whenever I took one of the
PCs down, the other PC would detect a "cable unplugged" error, and thus
the Cache Fusion network would become unavailable.
|
US$25 |
| 4 - Network Cables
|
US$5
US$5
US$5
US$5 |
| Total |
US$1,685 |
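For readers who substitute their own prices, the total above can be recomputed with a few lines of shell arithmetic. This is just a sketch using the prices listed in the table; the variable names are arbitrary.

```shell
# Per-node cost in US$: server + second NIC + FireWire card.
node_cost=$((620 + 20 + 30))
# Shared components in US$: FireWire drive + extra FireWire cable
# + Ethernet hub/switch + four network cables at US$5 each.
shared_cost=$((280 + 20 + 25 + 4 * 5))
# Two nodes plus the shared components.
total=$((2 * node_cost + shared_cost))
echo "Total: US\$$total"
```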
Note that the Maxtor OneTouch
external drive does have two IEEE1394
(FireWire) ports, although it may not appear so at first glance. This
is also true for the other external hard drives I have listed above.
Now that we know the hardware that
will be used in this example, let's take a conceptual look at what the
environment looks like:

Figure 1 Architecture
As we start to go into the details of
the installation, keep in mind
that most tasks will need to be performed on both servers. I will
indicate at the beginning of each section whether the task(s)
should be performed on both nodes.
6. Install the Linux Operating System
This section provides a
summary of the screens used to install the Linux operating system. This
guide is designed to work with the Red Hat Enterprise Linux 4 AS/ES
(RHEL4) operating environment. As an alternative (and what I used for
this article), you can use CentOS 4.2, a free and stable version of the RHEL4
operating environment.
For more detailed installation
instructions, it is possible to use the manuals
from Red Hat Linux. I would suggest, however, that the instructions I
have provided below be used for this configuration.
Before installing the Linux operating
system on both nodes, you should have the FireWire and two NIC
interfaces (cards) installed.
Also, before starting the
installation, ensure that the FireWire drive (our shared storage drive)
is NOT connected to either of the two servers. You
may also choose to connect both servers to the FireWire drive and
simply turn the power off to the drive. Although none of this is
mandatory, it is how I will be performing the installation and
configuration for this article.
Download the following ISO images for
CentOS 4.2:
After downloading
and burning the CentOS images (ISO files) to CD, insert CentOS Disk #1
into the first server (linux1 in this example),
power it on, and answer the installation screen prompts as noted below.
After completing the Linux installation on the first node, perform the
same Linux installation on the second node while substituting the node
name linux2 for linux1 and
the different IP addresses where appropriate.
Boot Screen
The first screen is the CentOS Enterprise Linux boot screen. At the
boot: prompt, hit [Enter] to start the installation process.
Media Test
When asked to test the CD media, tab over to [Skip] and hit [Enter]. If
there were any errors, the media burning software would have warned us.
After several seconds, the installer should then detect the video card,
monitor, and mouse. The installer then goes into GUI mode.
Welcome to CentOS Enterprise Linux
At the welcome screen, click [Next] to continue.
Language / Keyboard Selection
The next two screens prompt you for the Language and Keyboard settings.
Make the appropriate selections for your configuration.
Installation Type
Choose the [Custom] option and click [Next] to continue.
Disk Partitioning Setup
Select [Automatically partition] and click [Next] to continue.
If there were a previous installation
of Linux on this machine, the next screen will ask if you want to
"remove" or "keep" old partitions. Select the option to [Remove all
partitions on this system]. Also, ensure that the [hda] drive is
selected for this installation. I also keep the checkbox [Review (and
modify if needed) the partitions created] selected. Click [Next] to
continue.
You will then be prompted with a
dialog window asking if you really want to remove all partitions. Click
[Yes] to acknowledge this warning.
Partitioning
The installer will then allow you to view (and modify if needed) the
disk partitions it automatically selected. In almost all cases, the
installer will choose 100MB for /boot, double the amount of RAM for
swap, and the rest going to the root (/) partition. I like to have a
minimum of 1GB for swap. For the purpose of this install, I will accept
all automatically preferred sizes. (Including 2GB for swap since I have
1GB of RAM installed.)
Starting with RHEL 4, the installer
will create the same disk configuration as just noted but will create
it using the Logical Volume Manager (LVM). For example, it will
partition the first hard drive (/dev/hda for my configuration) into two
partitions: one for the /boot partition (/dev/hda1) and the
remainder of the disk dedicated to an LVM volume group named VolGroup00 (/dev/hda2).
The LVM volume group (VolGroup00) is then divided into two LVM
logical volumes: one for the root filesystem (/) and another for swap. I
basically check that it created at least 1GB of swap. Since I have 1GB
of RAM installed, the installer created 2GB of swap. That said, I
just accept the default disk layout.
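Once the system is installed, the "at least 1GB of swap" check mentioned above can be repeated from a shell. The following is a minimal sketch, assuming a Linux system that exposes SwapTotal in /proc/meminfo; the function name is made up for illustration.

```shell
# Succeed when the given swap size (in kB) is at least 1 GB (1048576 kB).
swap_at_least_1gb() {
    [ "$1" -ge 1048576 ]
}

# On Linux, SwapTotal in /proc/meminfo reports the configured swap in kB.
if [ -r /proc/meminfo ]; then
    swap_kb=$(awk '/^SwapTotal:/ { print $2 }' /proc/meminfo)
    if swap_at_least_1gb "${swap_kb:-0}"; then
        echo "swap OK: ${swap_kb} kB"
    else
        echo "swap below 1 GB: ${swap_kb:-0} kB"
    fi
fi
```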
Boot Loader Configuration
The installer will use the GRUB boot loader by default. To use the GRUB
boot loader, accept all default values and click [Next] to continue.
Network Configuration
I made sure to install both NIC interfaces (cards) in each of the Linux
machines before starting the operating system installation. This screen
should have successfully detected each of the network devices.
First, make sure that each of the
network devices is checked to [Active on boot]. The installer may
choose to not activate eth1.
Second, [Edit] both eth0 and eth1 as
follows. You may choose to use different IP addresses for both eth0 and
eth1 and that is OK. If possible, try to put eth1 (the interconnect) on
a different subnet than eth0 (the public network):
eth0:
- Uncheck the option [Configure using DHCP]
- Leave the [Activate on boot] checked
- IP Address: 192.168.1.100
- Netmask: 255.255.255.0
eth1:
- Uncheck the option [Configure using DHCP]
- Leave the [Activate on boot] checked
- IP Address: 192.168.2.100
- Netmask: 255.255.255.0
Continue by setting your hostname
manually. I used "linux1" for the first node and "linux2" for the
second one. Finish this dialog off by supplying your gateway and DNS
servers.
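The suggestion above to keep eth1 (the interconnect) on a different subnet than eth0 can be sanity-checked with a short script. This sketch assumes the 255.255.255.0 netmask used in this guide, so the network part is simply the first three octets; both function names are hypothetical.

```shell
# Print the network part of an IPv4 address for a 255.255.255.0 netmask.
net24() {
    echo "${1%.*}"
}

# Succeed when two addresses fall in different /24 networks.
different_subnet() {
    [ "$(net24 "$1")" != "$(net24 "$2")" ]
}

# Addresses used for linux1 in this guide: eth0 (public) and eth1 (interconnect).
if different_subnet 192.168.1.100 192.168.2.100; then
    echo "eth0 and eth1 are on different subnets"
fi
```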
Firewall
On this screen, make sure to select [No firewall] and click [Next] to
continue. You may be prompted with a warning dialog about not setting
the firewall. If this occurs, simply hit [Proceed] to continue.
Additional Language Support/Time Zone
The next two screens allow you to select additional language support
and time zone information. In almost all cases, you can accept the
defaults.
Set Root Password
Select a root password and click [Next] to continue.
Package Group Selection
Scroll down to the bottom of this screen and select [Everything] under
the "Miscellaneous" section. Click [Next] to continue.
Please note that the installation of
Oracle does not require
all Linux packages to be installed. My decision to install all packages
was for the sake of brevity. Please see Section 15
("Check RPM Packages for Oracle 10g Release 2") for
a more detailed look at the critical packages required for a successful
Oracle installation.
Note that with some RHEL4
distributions, you will not get the "Package Group Selection" screen by
default. There, you are asked to simply "Install default software
packages" or "Customize software packages to be installed". Select the
option to "Customize software packages to be installed" and click
[Next] to continue. This will then bring up the "Package Group
Selection" screen. Now, scroll down to the bottom of this screen and
select [Everything] under the "Miscellaneous" section. Click [Next] to
continue.
About to Install
This screen is basically a confirmation screen. Click [Next] to start
the installation. During the installation process, you will be asked to
switch disks to Disk #2, Disk #3, and then Disk #4. Click [Continue] to
start the installation process.
Note that with CentOS 4.2, the
installer will ask to switch to Disk #2, Disk #3, Disk #4, Disk #1, and
then back to Disk #4.
Graphical Interface (X) Configuration
With most RHEL4 distributions (not the case with CentOS 4.2), when the
installation is complete, the installer will attempt to detect your
video hardware. Ensure that the installer has detected and selected the
correct video hardware (graphics card and monitor) to properly use the
X Windows server. You will continue with the X configuration in the
next several screens.
Congratulations
And that's it. You have successfully installed CentOS Enterprise Linux
on the first node (linux1). The installer will eject the CD from the
CD-ROM drive. Take out the CD and click [Exit] to reboot the system.
When the system boots into Linux for
the first time, it will prompt you with another Welcome screen. The
following wizard allows you to configure the date and time, add any
additional users, test the sound card, and install any additional
CDs. The only screen I care about is the time and date (and if you are
using CentOS 4.x, the monitor/display settings). As for the others,
simply run through them as there is nothing additional that needs to be
installed (at this point anyways!). If everything was successful, you
should now be presented with the login screen.
Perform the same installation on the second node
After completing the Linux installation on the first node, repeat the
above steps for the second node (linux2). When configuring the machine
name and networking, be sure to configure the proper values. For my
installation, this is what I configured for linux2:
First, make sure that each of the
network devices is checked to [Active on boot]. The installer will
choose not to activate eth1.
Second, [Edit] both eth0 and eth1 as
follows:
eth0:
- Uncheck the option [Configure using DHCP]
- Leave the [Activate on boot] checked
- IP Address: 192.168.1.101
- Netmask: 255.255.255.0
eth1:
- Uncheck the option [Configure using DHCP]
- Leave the [Activate on boot] checked
- IP Address: 192.168.2.101
- Netmask: 255.255.255.0
Continue by setting your hostname
manually. I used "linux2" for the second node. Finish this dialog off
by supplying your gateway and DNS servers.
7. Network Configuration
Perform the following network
configuration on all nodes in the cluster!
Note:
Although we configured several of the network settings during the Linux
installation, it is important to not skip this
section as it contains critical steps that are required for the RAC
environment.
Introduction to Network Settings
During the Linux O/S install you
already configured the IP address and host name for each of the nodes.
You now need to configure the /etc/hosts file as
well as adjust several of the network settings for the interconnect.
Each node should have one static IP
address for the public network and one static IP address for the
private cluster interconnect. The private interconnect should only be
used by Oracle to transfer Cluster Manager and Cache Fusion related
data. Although it is possible to use the public network for the
interconnect, this is not recommended as it may cause degraded database
performance (reducing the amount of bandwidth for Cache Fusion and
Cluster Manager traffic). For a production RAC implementation, the
interconnect should be gigabit or faster and only be used by
Oracle.
Configuring Public and Private Network
In our two-node example, you need to configure the network on both
nodes for access to the public network as well as their private
interconnect.
The easiest way to configure network
settings in RHEL4 is with the Network Configuration program. This
application can be started from the command-line as the root user
account as follows:
# su -
# /usr/bin/system-config-network &
Do not use DHCP naming for the
public IP address or the interconnects; you need static IP addresses!
Using the Network Configuration
application, you need to configure both NIC devices as well as the /etc/hosts
file. Both of these tasks can be completed using the Network
Configuration GUI. Notice that the /etc/hosts
settings are the same for both nodes.
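On RHEL-family systems, static settings like these typically end up in per-device files such as /etc/sysconfig/network-scripts/ifcfg-eth0. The sketch below only prints what such a file might contain for a static address; the function name is made up, the exact set of keys written by the Network Configuration program may differ, and nothing is written to disk.

```shell
# Print static settings in the ifcfg-<device> key=value style used by
# RHEL-family systems. Arguments: <device> <ip address> <netmask>
ifcfg_static() {
    printf 'DEVICE=%s\nBOOTPROTO=static\nONBOOT=yes\nIPADDR=%s\nNETMASK=%s\n' "$1" "$2" "$3"
}

# Values used for linux1 in this guide.
ifcfg_static eth0 192.168.1.100 255.255.255.0
ifcfg_static eth1 192.168.2.100 255.255.255.0
```

Comparing this output with the files on each node is one way to confirm that neither interface is still set to use DHCP.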
Our example
configuration will use the following settings:
| Server 1 (linux1) |
| Device | IP Address | Subnet | Gateway | Purpose |
| eth0 | 192.168.1.100 | 255.255.255.0 | 192.168.1.1 | Connects linux1 to the public network |
| eth1 | 192.168.2.100 | 255.255.255.0 | | Connects linux1 (interconnect) to linux2 (linux2-priv) |
| /etc/hosts |
127.0.0.1 localhost loopback
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
# Public Virtual IP (VIP) addresses for - (eth0)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
| Server 2 (linux2) |
| Device | IP Address | Subnet | Gateway | Purpose |
| eth0 | 192.168.1.101 | 255.255.255.0 | 192.168.1.1 | Connects linux2 to the public network |
| eth1 | 192.168.2.101 | 255.255.255.0 | | Connects linux2 (interconnect) to linux1 (linux1-priv) |
| /etc/hosts |
127.0.0.1 localhost loopback
# Public Network - (eth0)
192.168.1.100 linux1
192.168.1.101 linux2
# Private Interconnect - (eth1)
192.168.2.100 linux1-priv
192.168.2.101 linux2-priv
# Public Virtual IP (VIP) addresses for - (eth0)
192.168.1.200 linux1-vip
192.168.1.201 linux2-vip
Note that the virtual IP addresses
only need to be defined in the /etc/hosts file
(or your DNS) for both nodes. The public virtual IP addresses will be
configured automatically by Oracle when you run the Oracle Universal
Installer, which starts Oracle's Virtual Internet Protocol
Configuration Assistant (VIPCA). All virtual IP addresses will be
activated when the srvctl start nodeapps -n
<node_name> command is run. This is the Host
Name/IP Address that will be configured in the client(s) tnsnames.ora
file (more details later).
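Because every node needs the same public, private, and VIP names, it can help to check each node's hosts file for all six entries. The following is a minimal sketch with a made-up function name; it only looks for the host names used in this guide and does not validate the IP addresses.

```shell
# Succeed when every cluster host name used in this guide appears as a
# host name on a non-comment line of the given hosts file.
hosts_file_complete() {
    for name in linux1 linux2 linux1-priv linux2-priv linux1-vip linux2-vip; do
        # Compare whole fields so that "linux1" does not match "linux1-priv".
        awk -v n="$name" '!/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                          END { exit found ? 0 : 1 }' "$1" || {
            echo "missing: $name"
            return 1
        }
    done
    return 0
}

if hosts_file_complete /etc/hosts; then
    echo "/etc/hosts contains all cluster names"
else
    echo "/etc/hosts is incomplete"
fi
```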
In the screenshots below, only node 1
(linux1) is shown. Be sure to apply the proper network settings on
both nodes.
Figure 2: Network Configuration Screen, Node 1 (linux1)
Figure 3: Ethernet Device Screen, eth0 (linux1)
Figure 4: Ethernet Device Screen, eth1 (linux1)
Figure 5: Network Configuration Screen, /etc/hosts (linux1)
When the network is configured, you
can use the ifconfig command to verify everything
is working. The following example is from linux1:
$ /sbin/ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:0D:56:FC:39:EC
          inet addr:192.168.1.100  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20d:56ff:fefc:39ec/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:835 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1983 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:705714 (689.1 KiB)  TX bytes:176892 (172.7 KiB)
          Interrupt:3

eth1      Link encap:Ethernet  HWaddr 00:0C:41:E8:05:37
          inet addr:192.168.2.100  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:41ff:fee8:537/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:546 (546.0 b)
          Interrupt:11 Base address:0xe400

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:5110 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5110 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:8276758 (7.8 MiB)  TX bytes:8276758 (7.8 MiB)

sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
About Virtual IP
Why is there a Virtual IP (VIP) in 10g?
Why does it just return a dead connection when its primary node fails?
It's all about availability of the
application. When a node fails, the VIP associated with it is supposed
to be automatically failed over to some other node. When this occurs,
two things happen.
- The new node re-arps the world
indicating a new MAC address for the address. For directly connected
clients, this usually causes them to see errors on their connections to
the old address.
- Subsequent packets sent to the VIP
go to the new node, which will send error RST packets back to the
clients. This results in the clients getting errors immediately.
This means that when the client
issues SQL to the node that is now down, or traverses the address list
while connecting, rather than waiting on a very long TCP/IP time-out
(~10 minutes), the client receives a TCP reset. In the case of SQL,
this is ORA-3113. In the case of connect, the
next address in tnsnames is used.
Going one step further is making use
of Transparent
Application Failover (TAF). With TAF successfully configured, it is
possible to avoid ORA-3113 errors
altogether! TAF will be discussed in more detail in Section
28 ("Transparent Application Failover - (TAF)").
Without using VIPs, clients connected
to a node that died will often wait a 10-minute TCP timeout period
before getting an error. As a result, you don't really have a good HA
solution without using VIPs (Source - Metalink Note 220970.1).
Confirm the RAC
Node Name is Not Listed in Loopback Address
Ensure that the node names (linux1
or linux2) are not included
for the loopback address in the /etc/hosts file.
If the machine name is listed in the loopback address entry as
below:
127.0.0.1 linux1 localhost.localdomain localhost
it will need to be removed as shown
below:
127.0.0.1 localhost.localdomain localhost
If the RAC node name is listed for
the loopback address, you will receive the following error during the
RAC installation:
ORA-00603: ORACLE server session terminated by fatal error
or
ORA-29702: error occurred in Cluster Group Service operation
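The loopback check described above can also be scripted. This is a minimal sketch under the assumption that the node is known by its short name (linux1 or linux2); the function name node_on_loopback is hypothetical.

```shell
# Succeed (exit 0) if the given node name appears on the 127.0.0.1 line
# of a hosts file, which is the situation that must be corrected.
# Usage: node_on_loopback /etc/hosts linux1
node_on_loopback() {
    hosts_file=$1
    node=$2
    awk -v n="$node" '
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) {
                if (substr($i, 1, 1) == "#") break   # ignore trailing comments
                if ($i == n) bad = 1
            }
        }
        END { exit !bad }' "$hosts_file"
}
```

For example, node_on_loopback /etc/hosts linux1 && echo "remove linux1 from the loopback entry" prints the warning only when the entry needs fixing.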
Adjusting
Network Settings
With Oracle 9.2.0.1 and later, Oracle
makes use of UDP as the default protocol on Linux for inter-process
communication (IPC), such as Cache Fusion and Cluster Manager buffer
transfers between instances within the RAC cluster.
Oracle strongly suggests adjusting
the default and maximum send buffer size (SO_SNDBUF
socket option) to 256KB, and the default and maximum receive buffer
size (SO_RCVBUF socket option) to 256KB.
The receive buffers are used by TCP
and UDP to hold received data until it is read by the application. With
TCP, the receive buffer cannot overflow because the peer is not allowed
to send data beyond the buffer size window. UDP has no such flow
control: datagrams that do not fit in the socket receive buffer are
discarded, so a fast sender can overwhelm the receiver.
The default and maximum window size
can be changed in the /proc file system without
reboot:
# su - root
# sysctl -w net.core.rmem_default=262144
net.core.rmem_default = 262144
# sysctl -w net.core.wmem_default=262144
net.core.wmem_default = 262144
# sysctl -w net.core.rmem_max=262144
net.core.rmem_max = 262144
# sysctl -w net.core.wmem_max=262144
net.core.wmem_max = 262144
The above commands made the changes
to the already running OS. You should now make the above changes
permanent (for each reboot) by adding the following lines to the /etc/sysctl.conf
file for each node in your RAC cluster:
# Default setting in bytes of the socket receive buffer
net.core.rmem_default=262144

# Default setting in bytes of the socket send buffer
net.core.wmem_default=262144

# Maximum socket receive buffer size which may be set by using
# the SO_RCVBUF socket option
net.core.rmem_max=262144

# Maximum socket send buffer size which may be set by using
# the SO_SNDBUF socket option
net.core.wmem_max=262144
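The value 262144 is simply 256KB expressed in bytes (256 x 1024). To confirm that all four parameters carry that value, a small filter such as the one below can help. It is a sketch only: the function name check_rac_buffers is hypothetical, and it assumes input in the "key = value" or "key=value" form used by sysctl and /etc/sysctl.conf.

```shell
# Read sysctl-style lines on stdin and print each of the four socket
# buffer keys that is missing or not set to 262144 (256 * 1024 bytes).
check_rac_buffers() {
    awk -F'[ \t]*=[ \t]*' '
        { seen[$1] = $2 }
        END {
            n = split("net.core.rmem_default net.core.wmem_default net.core.rmem_max net.core.wmem_max", keys, " ")
            for (i = 1; i <= n; i++)
                if (seen[keys[i]] != 262144)
                    print keys[i]
        }'
}
```

It can be fed the running values (sysctl net.core.rmem_default net.core.wmem_default net.core.rmem_max net.core.wmem_max | check_rac_buffers) or the persistent ones (check_rac_buffers < /etc/sysctl.conf). No output means all four settings match.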
Check and turn
off UDP ICMP rejections
During the Linux installation
process, I chose not to configure the firewall option. By default,
the option to configure a firewall is selected by the installer. This
has burned me several times, so I like to double-check that the
firewall option is not configured and that UDP ICMP filtering is
turned off.
If UDP ICMP is blocked or rejected by
the firewall, the
Oracle Clusterware software will crash after several minutes of
running. When the Oracle Clusterware process fails, you will have
something similar to the following in the <machine_name>_evmocr.log
file:
08/29/2005 22:17:19
oac_init:2: Could not connect to server, clsc retcode = 9
08/29/2005 22:17:19
a_init:12!: Client init unsuccessful : [32]
ibctx:1:ERROR: INVALID FORMAT
proprinit:problem reading the bootblock or superbloc 22
When you encounter this type of error, the solution is to remove the
UDP ICMP (iptables) rejection rule, or simply to turn the firewall
option off. The Oracle Clusterware software will then start to
operate normally and not crash. The following commands should be
executed as the root user account:
- Check to ensure that the firewall option is turned off. If
the firewall option is stopped (as it is in my example below), you do
not have to proceed with the following steps.
# /etc/rc.d/init.d/iptables status
Firewall is stopped.
- If the firewall option is operating, you will first need to
manually disable UDP ICMP rejections:
# /etc/rc.d/init.d/iptables stop
Flushing firewall rules: [ OK ]
Setting chains to policy ACCEPT: filter [ OK ]
Unloading iptables modules: [ OK ]
- Then, to keep UDP ICMP rejections turned off after the next server
reboot (they should always be turned off):
# chkconfig iptables off
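To confirm that the firewall will stay off across reboots, the output of chkconfig --list iptables can be inspected. The helper below is a sketch (the name enabled_runlevels is hypothetical); it assumes the usual one-line "service 0:off 1:off ... 6:off" format printed by chkconfig.

```shell
# Read one "chkconfig --list <service>" line on stdin and print the
# runlevels in which the service is still switched on (empty if none).
enabled_runlevels() {
    awk '{
        out = ""
        for (i = 2; i <= NF; i++) {
            split($i, kv, ":")
            if (kv[2] == "on") out = out (out == "" ? "" : " ") kv[1]
        }
        print out
    }'
}
```

Running chkconfig --list iptables | enabled_runlevels as root should print an empty line once chkconfig iptables off has been applied.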
8.
Obtain & Install FireWire Modules
Perform the following
FireWire module install and configuration on all nodes
in the cluster!
The next step is to obtain and
install the FireWire modules that support the use of IEEE1394 devices
with multiple logins.
In previous versions of this guide,
it was necessary to download and install
both a new Linux kernel (e.g. the OTN-supplied
2.6.9-11.0.0.10.3.EL #1 Linux kernel) and the supporting
FireWire modules. As of November 2005, oss.oracle.com
provides pre-compiled FireWire modules for the 2.6.9-22.EL
and 2.6.9-22.0.1.EL Linux kernels, so installing a new
Linux kernel is no longer required. We only need to install and
configure the supporting FireWire modules!
I am using the term "multiple logins"
a bit loosely in this article. Strictly speaking, "multiple logins" are
not allowed by the IEEE1394 specification, as it is only a
point-to-point protocol. The term is often confused
with "concurrent sessions", which the IEEE1394
specification does support and which simply means that the device
allows multiple outstanding requests simultaneously (similar to the
SCSI protocol). Multiple hosts (initiators) on a single bus are
therefore prohibited according to IEEE1394.
In a previous version of this guide,
I included the steps to download a patched version of the Linux kernel
(the C source code) and then compile it. Thanks to Oracle's Linux Projects
Development Team, this is no longer a requirement. Oracle
now provides a pre-compiled module that supports the sharing of
FireWire drives. The instructions for downloading and installing the
supporting FireWire modules are included in this section. Before going
into the details of how to perform these actions, however, let's take a
moment to discuss the changes that are required in the new FireWire
kernel driver.
While FireWire drivers already exist
for Linux, they often do not support shared
storage. Typically, when you log on to an OS, the OS associates the
driver with a specific drive for that machine alone. This implementation
simply will not work for our RAC configuration. The shared storage (our
FireWire hard drive) needs to be accessed by more than one node. You
need to enable the FireWire driver to provide nonexclusive access to
the drive so that multiple servers (the nodes that comprise
the cluster) will be able to access the same storage. This
goal is accomplished by removing the bit mask that identifies the
machine during login in the source code, resulting in nonexclusive
access to the FireWire hard drive. All other nodes in the cluster log in
to the same drive during their logon session, using the same modified
driver, so they too have nonexclusive access to the drive.
Our implementation describes a dual
node cluster (each with a single processor), each server running CentOS
4.2 Enterprise Linux. Keep in mind that the process of installing the
supporting FireWire modules will need to be performed on both Linux
nodes. CentOS Enterprise Linux 4.2 includes kernel 2.6.9-22.EL #1.
Knowing this, we now need to download the matching FireWire module
from: Oracle
Technet Supplied FireWire Modules
Download
one of the following files for the supporting FireWire Modules:
oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm
- (for single processor)
or
oracle-firewire-modules-2.6.9-22.ELsmp-1286-1.i686.rpm
- (for multiple processors)
Install the
supporting FireWire modules, as root:
Install the supporting FireWire
modules package by running either of the following:
# rpm -ivh oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm
- (for single processor)
- OR -
# rpm -ivh oracle-firewire-modules-2.6.9-22.ELsmp-1286-1.i686.rpm
- (for multiple processors)
Add module
options:
Add the following line to /etc/modprobe.conf:
options sbp2 exclusive_login=0
It is vital that the exclusive_login
parameter of the Serial Bus Protocol module (sbp2)
be set to zero to allow multiple hosts to log in to and access the
FireWire disk concurrently.
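Because a missing or mistyped option leaves the drive in exclusive mode, it may be worth verifying the entry on each node. The function below is a sketch with a hypothetical name (sbp2_shared_login_set); it only inspects the configuration file and does not check what the loaded module is actually using.

```shell
# Succeed (exit 0) only if the file contains an uncommented
# "options sbp2" line that includes exclusive_login=0.
# Usage: sbp2_shared_login_set /etc/modprobe.conf
sbp2_shared_login_set() {
    conf=$1
    grep -Eq '^[[:space:]]*options[[:space:]]+sbp2[[:space:]]+(.*[[:space:]])?exclusive_login=0([[:space:]]|$)' "$conf"
}
```

For example: sbp2_shared_login_set /etc/modprobe.conf || echo "exclusive_login=0 is not set for sbp2".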
Perform the
above tasks on the second Linux server:
With the supporting FireWire modules
installed on the first Linux server, move on to the second Linux server
and repeat the same tasks in this section on it.
Connect
FireWire drive to each machine and boot with the new FireWire modules
installed:
After performing the above tasks on
both nodes in the cluster, power down both Linux machines:
===============================
# hostname
linux1
# init 0
===============================
# hostname
linux2
# init 0
===============================
After both machines are powered
down, connect each of them to the back of the FireWire drive. Power on
the FireWire drive. Finally, power on each Linux server one at a time
and be sure to watch for the "Probing for New Hardware" section during
the boot process.
Note:
RHEL 4 users will be prompted during the boot process on both nodes at
the "Probing for New Hardware" section for your FireWire hard drive.
Simply select the option to "Configure" the device and continue the
boot process.
If you are not prompted during the
"Probing for New Hardware"
section for the new FireWire drive, you will need to run the following
commands and reboot the machine. Do not put these commands in a script
and attempt to run them - run them interactively at the command-line:
# rpm -e oracle-firewire-modules-2.6.9-22.EL-1286-1
# rpm -Uvh oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm
# modprobe -r sbp2
# modprobe -r sd_mod
# modprobe -r ohci1394
# modprobe ohci1394
# modprobe sd_mod
# modprobe sbp2
# /usr/sbin/kudzu
# init 6
After running /usr/sbin/kudzu
(above), you should be prompted to
"Configure" the new drive. There were times when this did not work the
first time. When that happened, I had to power down everything, power
it back up, and perform the modprobe tasks
(above) again.
Check and turn
off UDP ICMP rejections:
After rebooting each machine (above)
check to ensure that the firewall option is turned off (stopped):
# /etc/rc.d/init.d/iptables status
Firewall is stopped.
Loading the
FireWire stack:
In most cases, the loading of the
FireWire stack will already be configured in the /etc/rc.sysinit
file. The commands that are contained within this file that are
responsible for loading the FireWire stack are:
# modprobe sbp2
# modprobe ohci1394
In older versions of Red Hat, this
was not the case and these commands would have to be manually run or
put within a startup file. With Red Hat Enterprise Linux 3 and later,
these commands are already put within the /etc/rc.sysinit
file and run on each boot.
Check for SCSI
Device:
After each machine has rebooted, the
kernel should automatically detect the disk as a SCSI device (/dev/sdXX).
This section provides several commands that should be run on all
nodes in the cluster to verify that the FireWire drive was successfully
detected and is being shared by all nodes in the cluster.
For this configuration, I was
performing the above procedures on both nodes at the same time. The
following commands and results are from my linux2
machine. Again, make sure that you run the following commands on all
nodes to ensure both machines can log in to the shared drive.
Let's first check to see that the
FireWire adapter was successfully detected:
# lspci
00:00.0 Host bridge: Intel Corporation 82845G/GL[Brookdale-G]/GE/PE DRAM Controller/Host-Hub Interface (rev 01)
00:02.0 VGA compatible controller: Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 81)
00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
01:04.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)
01:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
01:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)
Second, let's check to see that the
modules are loaded:
# lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
sd_mod                 17217  0
ohci1394               35784  0
sbp2                   23948  0
scsi_mod              121293  2 sd_mod,sbp2
ieee1394              298228  2 ohci1394,sbp2
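When repeating this check on several nodes, it can be easier to list only what is missing. The following sketch (the function name missing_modules is hypothetical) reads lsmod output on stdin and prints each requested module that is not loaded.

```shell
# Read lsmod output on stdin; print every module named as an argument
# that does not appear in the first column.
missing_modules() {
    loaded=$(awk '{ print $1 }')
    for m in "$@"; do
        echo "$loaded" | grep -qxF "$m" || echo "$m"
    done
}
```

For example: lsmod | missing_modules ohci1394 sbp2 ieee1394 sd_mod scsi_mod. No output means all five modules are loaded.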
Third, let's make sure the disk was
detected and an entry was made by the kernel:
# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: Maxtor   Model: OneTouch II      Rev: 023g
  Type:   Direct-Access                    ANSI SCSI revision: 06
Now let's verify that the FireWire
drive is accessible for multiple logins and shows a valid login:
# dmesg | grep sbp2
sbp2: $Rev: 1265 $ Ben Collins <bcollins@debian.org>
ieee1394: sbp2: Maximum concurrent logins supported: 2
ieee1394: sbp2: Number of active logins: 1
ieee1394: sbp2: Logged into SBP-2 device
From the above output, you can see
that the FireWire drive I have can support concurrent logins by up to 2
servers. It is vital that you have a drive where the chipset supports
concurrent access for all nodes within the RAC cluster.
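The dmesg check above can be condensed into a one-line verdict per node. This is only a sketch: the function name sbp2_login_status is hypothetical, and it relies on the message wording shown in the sbp2 driver output in this article.

```shell
# Read "dmesg | grep sbp2" output on stdin and summarize the login state,
# together with the maximum number of concurrent logins the drive reports.
sbp2_login_status() {
    awk '
        /Maximum concurrent logins supported:/ { max = $NF }
        /Logged into SBP-2 device/             { ok = 1 }
        /login failed/                         { failed = 1 }
        END {
            if (ok && !failed) print "logged-in max=" max
            else if (failed)   print "login-failed max=" max
            else               print "no-sbp2-device"
        }'
}
```

Running dmesg | grep sbp2 | sbp2_login_status on each node should report logged-in, with max at least equal to the number of nodes in the cluster.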
One other test I like to perform is
to run a quick fdisk -l from each node in the
cluster to verify that it is really being picked up by the OS. Your
drive may show that the device does not contain a valid partition
table, but this is OK at this point of the RAC configuration.
# fdisk -l

Disk /dev/hda: 40.0 GB, 40000000000 bytes
255 heads, 63 sectors/track, 4863 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          13      104391   83  Linux
/dev/hda2              14        4863    38957625   8e  Linux LVM

Disk /dev/sda: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       36483   293049666    c  W95 FAT32 (LBA)
Rescan SCSI bus
no longer required:
In older versions of the kernel, I
would need to run the rescan-scsi-bus.sh script in order to detect the
FireWire drive. The purpose of this script was to create the SCSI entry
for the node by using the following command:
echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi
With RHEL3 and RHEL4, this step is
no longer required and the disk should be detected automatically.
Troubleshooting
SCSI Device Detection:
If you are having troubles with any
of the procedures (above) in detecting the SCSI device, you can try the
following:
# rpm -e oracle-firewire-modules-2.6.9-22.EL-1286-1
# rpm -Uvh oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm
# modprobe -r sbp2
# modprobe -r sd_mod
# modprobe -r ohci1394
# modprobe ohci1394
# modprobe sd_mod
# modprobe sbp2
# /usr/sbin/kudzu
You may also want to unplug any USB
devices connected to the server. The system may not be able to
recognize your FireWire drive if you have a USB device attached!
Troubleshooting Concurrent
Logins to the FireWire Drive:
One of the first things to verify is that you are using a FireWire
drive that contains the correct chipset and allows for multiple logins.
If the FireWire drive has a chipset that does not allow for concurrent
access from more than one server, the disk and its partitions can only
be seen by one server at a time. Disks with the Oxford 911 chipset
(FireWire 400), Oxford 912 chipset (FireWire 800), or Oxford 922
chipset (FireWire 800) are known to work. Note that the Oxford 912
chipset is newer and faster than Oxford 922. For a full list of
FireWire drives (and enclosures) I have tested, please see the section Verified
and Tested FireWire Hard Drives.
Although I have only run into this
situation once, there can be
problems with the FireWire cards (the IEEE1394 controller cards). For
example, in one of my tests (using FireWire 800) I was unable to obtain
concurrent logins to the FireWire drive when using the LaCie FireWire
800 PCI Card - (107755) in both nodes. While the first node was able to
log in to the FireWire drive, it was acquiring it for exclusive access
and causing the second node to fail its login process to the drive. For
example:
From linux1:
# dmesg | grep sbp2
sbp2: $Rev: 1265 $ Ben Collins
ieee1394: sbp2: Maximum concurrent logins supported: 2
ieee1394: sbp2: Number of active logins: 0
ieee1394: sbp2: Logged into SBP-2 device
From linux2:
# dmesg | grep sbp2
sbp2: $Rev: 1265 $ Ben Collins
ieee1394: sbp2: Maximum concurrent logins supported: 2
ieee1394: sbp2: Number of active logins: 1
ieee1394: sbp2: Error logging into SBP-2 device - login failed
sbp2: probe of 00d04b690809290b-0 failed with error -16
ieee1394: sbp2: Maximum concurrent logins supported: 2
ieee1394: sbp2: Number of active logins: 1
ieee1394: sbp2: Error logging into SBP-2 device - login failed
sbp2: probe of 00d04b690809290b-0 failed with error -16
I have seen postings that indicate
this can be resolved by using the
sbp2 option "serialize_io=1" defined in the /etc/modprobe.conf file.
For example, the entry in the /etc/modprobe.conf
file would be:
options sbp2 serialize_io=1 exclusive_login=0
Although this has been used to
resolve some of the cases with failed
concurrent logins, it did not resolve the problem I was having with
installing the LaCie FireWire 800 PCI Card - (107755) in both nodes.
Another way to resolve failed
concurrent logins is to use
different FireWire cards for each of the nodes. For example, one that
uses the Texas Instruments (TI) chipset and another that uses the VIA
Technologies (VIA) chipset. In my case, I was able to resolve this
by simply using two FireWire cards from different vendors.
I used the LaCie FireWire 800 PCI Card - (107755) in one
node and the SIIG FireWire 800 PCI-32T host adapter - (NN-830112) in
the second node. Although they both use the TI chipset, it was enough
to resolve the problem I was having with failed concurrent logins.
After this, both nodes were able to successfully log in to the FireWire
drive.
9.
Create "oracle" User and Directories (both nodes)
Perform the following tasks
on all nodes in the cluster!
You will be using OCFS2 to store the
files required to be shared for the Oracle Clusterware software. When
using OCFS2, the UID of the UNIX user oracle and
GID of the UNIX group dba should be identical on
all machines in the cluster. If either the UID or GID is different,
the files on the OCFS2 file system may show up as "unowned" or may even
be owned by a different user. For this article, I will use 175 for the oracle
UID and 115 for the dba GID.
Create Group
and User for Oracle
Let's continue our example by
creating the Unix dba group and oracle
user account along with all appropriate directories.
# mkdir -p /u01/app
# groupadd -g 115 dba
# useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" -p oracle oracle
# chown -R oracle:dba /u01
# passwd oracle
# su - oracle
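Since the UID and GID must match on every node, it helps to print them in a form that is easy to compare. The sketch below uses a hypothetical function name (passwd_ids) and reads standard passwd-format lines, such as those returned by getent passwd.

```shell
# Read passwd-format lines (name:passwd:uid:gid:...) on stdin and print
# "uid:gid" for the user given as the first argument.
passwd_ids() {
    awk -F: -v u="$1" '$1 == u { print $3 ":" $4 }'
}
```

Running getent passwd oracle | passwd_ids oracle on linux1 and linux2 should print the same value on both; with the IDs used in this article that value is 175:115.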
Note: When you
are setting the Oracle environment variables for each RAC node, be sure
to assign each RAC node a unique Oracle SID! For this example, I used:
- linux1: ORACLE_SID=orcl1
- linux2: ORACLE_SID=orcl2
After creating the "oracle"
UNIX userid on both nodes, ensure that the environment is set up
correctly by using the following .bash_profile:
....................................
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
      . ~/.bashrc
fi

alias ls="ls -FA"

# User specific environment and startup programs
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/product/crs
export ORACLE_PATH=$ORACLE_BASE/common/oracle/sql:.:$ORACLE_HOME/rdbms/admin

# Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
export ORACLE_SID=orcl1

export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
export ORACLE_TERM=xterm
export TNS_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS10=$ORACLE_HOME/nls/data
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
export CLASSPATH=$ORACLE_HOME/JRE
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
export THREADS_FLAG=native
export TEMP=/tmp
export TMPDIR=/tmp
....................................
Create Mount Point for OCFS2 /
Clusterware
Finally, create the
mount point for the OCFS2 filesystem that will be used to store the two
Oracle Clusterware shared files. These commands will need to be run as
the "root" user account:
$ su -
# mkdir -p /u02/oradata/orcl
# chown -R oracle:dba /u02
Ensure Adequate
temp Space for OUI
Note:
The Oracle Universal Installer (OUI) requires up to 400MB of free
space in the /tmp directory.
You can check the available space in /tmp
by running the following command:
# df -k /tmp
The following commands report the configured swap space rather than
the free space in /tmp:
# cat /proc/swaps
Filename Type Size Used Priority
/dev/mapper/VolGroup00-LogVol01 partition 2031608 0 -1
-OR-
# cat /proc/meminfo | grep SwapTotal
SwapTotal: 2031608 kB
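The 400MB requirement for /tmp can also be tested from a script using df. The following is a sketch; the function name has_free_mb is hypothetical, and it relies on the POSIX output format of df -Pk, where the fourth column of the second line is the available space in kilobytes.

```shell
# Succeed (exit 0) if the file system holding the directory has at least
# the requested number of megabytes available.
# Usage: has_free_mb /tmp 400
has_free_mb() {
    dir=$1
    need_mb=$2
    # -P forces one line per file system; -k reports sizes in 1K blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_mb * 1024)) ]
}
```

For example: has_free_mb /tmp 400 || echo "/tmp has less than 400MB free".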
If for some reason you do not have
enough space in /tmp, you can temporarily create
space in another file system and point your TEMP
and TMPDIR to it for the duration of the install.
Here are the steps to do this:
# su -
# mkdir /<AnotherFilesystem>/tmp
# chown root.root /<AnotherFilesystem>/tmp
# chmod 1777 /<AnotherFilesystem>/tmp
# export TEMP=/<AnotherFilesystem>/tmp      # used by Oracle
# export TMPDIR=/<AnotherFilesystem>/tmp    # used by Linux programs
                                            #   like the linker "ld"
When the installation of Oracle is
complete, you can remove the temporary directory using the following:
# su -
# rmdir /<AnotherFilesystem>/tmp
# unset TEMP
# unset TMPDIR
10.
Create Partitions on the Shared FireWire Storage Device
Create the
following partitions on only one node in
the cluster!
The next step is to create the
required partitions on the FireWire (shared) drive. As I mentioned
previously, you will use OCFS2 to store the two files to be shared for
Oracle's Clusterware software. You will then create three ASM volumes;
two for all physical database files (data/index files, online redo log
files, and control files) and one for the Flash Recovery Area (RMAN
backups and archived redo log files).
The following table lists the
individual partitions that will be created on the FireWire (shared)
drive and what files will be contained on them.
| Oracle Shared Drive Configuration |
| File System Type | Partition | Size | Mount Point | ASM Diskgroup Name | File Types |
| OCFS2 | /dev/sda1 | 1GB | /u02/oradata/orcl | | Oracle Cluster Registry File - (~100MB); CRS Voting Disk - (~20MB) |
| ASM | /dev/sda2 | 50GB | ORCL:VOL1 | +ORCL_DATA1 | Oracle Database Files |
| ASM | /dev/sda3 | 50GB | ORCL:VOL2 | +ORCL_DATA1 | Oracle Database Files |
| ASM | /dev/sda4 | 100GB | ORCL:VOL3 | +FLASH_RECOVERY_AREA | Oracle Flash Recovery Area |
| Total | | 201GB | | | |
Create All
Partitions on FireWire Shared Storage
As shown in the table above, my
FireWire drive shows up as the SCSI device /dev/sda.
The fdisk command is used in Linux for creating
(and removing) partitions. For this configuration, we will be creating
four partitions: one for Oracle's Clusterware shared files and the
other three for ASM (to store all Oracle database files and the Flash
Recovery Area). Before creating the new partitions, it is important to
remove any existing partitions (if they exist) on the FireWire drive:
# fdisk /dev/sda
Command (m for help): p

Disk /dev/sda: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       36483   293049666    c  W95 FAT32 (LBA)

Command (m for help): d
Selected partition 1

Command (m for help): p

Disk /dev/sda: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-36483, default 1): 1
Last cylinder or +size or +sizeM or +sizeK (1-36483, default 36483): +1G

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (124-36483, default 124): 124
Last cylinder or +size or +sizeM or +sizeK (124-36483, default 36483): +50G

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (6204-36483, default 6204): 6204
Last cylinder or +size or +sizeM or +sizeK (6204-36483, default 36483): +50G

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Selected partition 4
First cylinder (12284-36483, default 12284): 12284
Last cylinder or +size or +sizeM or +sizeK (12284-36483, default 36483): +100G

Command (m for help): p

Disk /dev/sda: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         123      987966   83  Linux
/dev/sda2             124        6203    48837600   83  Linux
/dev/sda3            6204       12283    48837600   83  Linux
/dev/sda4           12284       24442    97667167+  83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
After creating all required
partitions, you should now inform the kernel of the partition changes
using the following syntax as the root user
account:
# partprobe
# fdisk -l /dev/sda

Disk /dev/sda: 300.0 GB, 300090728448 bytes
255 heads, 63 sectors/track, 36483 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         123      987966   83  Linux
/dev/sda2             124        6203    48837600   83  Linux
/dev/sda3            6204       12283    48837600   83  Linux
/dev/sda4           12284       24442    97667167+  83  Linux
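As a sanity check, the Blocks column can be reproduced from the cylinder ranges: each cylinder holds 16065 sectors of 512 bytes, and fdisk reports 1K blocks. The sketch below (the function name cyl_blocks is hypothetical) assumes this drive's geometry of 255 heads and 63 sectors per track, and that a partition starting at cylinder 1 skips the first track, as the figures above suggest.

```shell
# Print the size in 1K blocks of a partition spanning cylinders START..END
# on a disk with 16065 sectors (255 heads * 63 sectors) per cylinder.
cyl_blocks() {
    start=$1
    end=$2
    sectors=$(( (end - start + 1) * 16065 ))
    # The first partition begins after the first track (63 sectors).
    if [ "$start" -eq 1 ]; then
        sectors=$(( sectors - 63 ))
    fi
    # Two 512-byte sectors per 1K block; a leftover sector is what
    # fdisk marks with a trailing "+".
    echo $(( sectors / 2 ))
}
```

For example, cyl_blocks 124 6203 prints 48837600, matching /dev/sda2, and cyl_blocks 12284 24442 prints 97667167, matching /dev/sda4.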
(Note: The FireWire drive and
partitions created will be exposed as a SCSI device.)
|