Build Your Own Oracle RAC 10g Release 2 Cluster on Linux and FireWire

by Jeffrey Hunter

Learn how to set up and configure an Oracle RAC 10g Release 2 development cluster for less than US$1,800.

For development and testing only; production deployments will not be supported!

Updated March 2007

Contents

  1. Introduction
  2. Oracle RAC 10g Overview
  3. Shared-Storage Overview
  4. FireWire Technology
  5. Hardware & Costs
  6. Install the Linux Operating System
  7. Network Configuration
  8. Obtain & Install FireWire Modules
  9. Create "oracle" User and Directories
  10. Create Partitions on the Shared FireWire Storage Device
  11. Configure the Linux Servers for Oracle
  12. Configure the hangcheck-timer Kernel Module
  13. Configure RAC Nodes for Remote Access
  14. All Startup Commands for Each RAC Node
  15. Check RPM Packages for Oracle 10g Release 2
  16. Install & Configure Oracle Cluster File System (OCFS2)
  17. Install & Configure Automatic Storage Management (ASMLib 2.0)
  18. Download Oracle 10g RAC Software
  19. Install Oracle 10g Clusterware Software
  20. Install Oracle 10g Database Software
  21. Install Oracle 10g Companion CD Software
  22. Create TNS Listener Process
  23. Create the Oracle Cluster Database
  24. Verify TNS Networking Files
  25. Create / Alter Tablespaces
  26. Verify the RAC Cluster & Database Configuration
  27. Starting / Stopping the Cluster
  28. Transparent Application Failover - (TAF)
  29. Conclusion
  30. Acknowledgements

Downloads for this guide:
CentOS Enterprise Linux 4.2
Oracle Cluster File System Release 2 - (1.2.3-1) - Single Processor / SMP / Hugemem
Oracle Cluster File System Release 2 Tools - (1.2.1-1) - Tools / Console
Oracle Database 10g Release 2 EE, Clusterware, Companion CD - (10.2.0.1.0)
Precompiled RHEL4 FireWire Modules - (2.6.9-22.EL)
ASMLib 2.0 Driver - (2.6.9-22.EL / 2.0.3-1) - Single Processor / SMP / Hugemem
ASMLib 2.0 Library and Tools - (2.0.3-1) - Driver Support Files / Userspace Library


1. Introduction

One of the most efficient ways to become familiar with Oracle Real Application Clusters (RAC) 10g technology is to have access to an actual Oracle RAC 10g cluster. There's no better way to understand its benefits—including fault tolerance, security, load balancing, and scalability—than to experience them directly.

Unfortunately, for many shops, the price of the hardware required for a typical production RAC configuration makes this goal impossible. A small two-node cluster can cost from US$10,000 to well over US$20,000. That cost would not even include the heart of a production RAC environment—typically a storage area network—which can start at US$10,000.

For those who want to become familiar with Oracle RAC 10g without a major cash outlay, this guide provides a low-cost alternative: configuring an Oracle RAC 10g Release 2 system using commercial off-the-shelf components and downloadable software at an estimated cost of US$1,200 to US$1,800. The system comprises a dual-node cluster (each node with a single processor) running Linux (CentOS 4.2) with shared disk storage based on IEEE1394 (FireWire) drive technology.

This guide marks the last in a series to use FireWire technology as the shared storage medium for building an inexpensive Oracle RAC 10g system. The most current published version takes advantage of iSCSI; more specifically, it explains how to build a network storage server using Openfiler. Powered by rPath Linux, Openfiler is a free browser-based network storage management utility that delivers file-based Network Attached Storage (NAS) and block-based Storage Area Networking (SAN) in a single framework. Openfiler supports CIFS, NFS, HTTP/DAV, and FTP; however, in that guide I use only its iSCSI capabilities to implement an inexpensive SAN for the shared storage component required by Oracle RAC 10g.

Please note that this is not the only way to build a low-cost Oracle RAC 10g system. I have worked on other solutions that use SCSI rather than FireWire for shared storage. In most cases, SCSI will cost more than a FireWire solution. An inexpensive SCSI configuration will consist of:

  • SCSI Controller: Two SCSI controllers priced from $20 (Adaptec AHA-2940UW) to $220 (Adaptec 39320A-R) each
  • SCSI Enclosure: $70 - (Inclose 1 Bay 3.5" U320 SCSI Case)
  • SCSI Hard Drive: $140 - (36GB 15K 68p U320 SCSI Hard Drive)
  • SCSI Cables: Two SCSI cables priced at $20 each - (3ft External HD68 to HD68 U320 Cable)

Keep in mind that some motherboards may already include built-in SCSI controllers.

It is important to note that this configuration should never be run in a production environment and that it is not supported by Oracle or any other vendor. In a production environment, fibre channel—the high-speed serial-transfer interface that can connect systems and storage devices in either point-to-point or switched topologies—is the technology of choice. FireWire offers a low-cost alternative to fibre channel for testing and development, but it is not ready for production.

The Oracle9i and Oracle 10g Release 1 guides used raw partitions for storing files on shared storage, but here we will make use of the Oracle Cluster File System Release 2 (OCFS2) and Oracle Automatic Storage Management (ASM) feature. The two Linux servers will be configured as follows:

Oracle Database Files

RAC Node Name | Instance Name | Database Name | $ORACLE_BASE    | File System / Volume Manager for DB Files
linux1        | orcl1         | orcl          | /u01/app/oracle | ASM
linux2        | orcl2         | orcl          | /u01/app/oracle | ASM

Oracle Clusterware Shared Files

File Type               | File Name                 | Partition | Mount Point       | File System
Oracle Cluster Registry | /u02/oradata/orcl/OCRFile | /dev/sda1 | /u02/oradata/orcl | OCFS2
CRS Voting Disk         | /u02/oradata/orcl/CSSFile | /dev/sda1 | /u02/oradata/orcl | OCFS2

Note that with Oracle Database 10g Release 2 (10.2), Cluster Ready Services, or CRS, is now called Oracle Clusterware.

Starting with Oracle Database 10g Release 2 (10.2), Oracle Clusterware should be installed in a separate Oracle Clusterware home directory that is not release specific. This is a change to the Optimal Flexible Architecture (OFA) rules. You should not install Oracle Clusterware in a release-specific Oracle home mount point (/u01/app/oracle/product/10.2.0/... for example), as succeeding versions of Oracle Clusterware will overwrite the Oracle Clusterware installation in the same path. Also, if Oracle Clusterware 10g Release 2 (10.2) detects an existing Oracle Cluster Ready Services installation, it overwrites the existing installation in the same path.

The Oracle Clusterware software will be installed to /u01/app/oracle/product/crs on each of the nodes that make up the RAC cluster. However, the Clusterware software requires that two of its files—the Oracle Cluster Registry (OCR) file and the Voting Disk file—be shared with all nodes in the cluster. These two files will be installed on shared storage using OCFS2. It is possible (but not recommended by Oracle) to use RAW devices for these files; however, it is not possible to use ASM for these two Clusterware files.

The Oracle Database 10g Release 2 software will be installed into a separate Oracle Home, namely /u01/app/oracle/product/10.2.0/db_1, on each of the nodes that make up the RAC cluster. All the Oracle physical database files (data, online redo logs, control files, archived redo logs) will be installed to different partitions of the shared drive being managed by ASM. (The Oracle database files can just as easily be stored on OCFS2. Using ASM, however, makes the article that much more interesting!)

Note: This article is only designed to work as documented with absolutely no substitutions. Separate versions of this guide cover Oracle RAC 10g Release 1 with RHEL 3, RHEL 4 with iSCSI and Openfiler for shared storage, and Oracle RAC 11g with iSCSI.


2. Oracle RAC 10g Overview

Oracle RAC, introduced with Oracle9i, is the successor to Oracle Parallel Server (OPS). RAC allows multiple instances to access the same database (storage) simultaneously. It provides fault tolerance, load balancing, and performance benefits by allowing the system to scale out, and at the same time—because all nodes access the same database—the failure of one instance will not cause the loss of access to the database.

At the heart of Oracle RAC is a shared disk subsystem. All nodes in the cluster must be able to access all of the data, redo log files, control files, and parameter files for all nodes in the cluster. The data disks must be globally available to allow all nodes to access the database. Each instance has its own redo log files, but the control files are shared, and the other nodes must be able to access a failed node's redo logs in order to recover that node in the event of a system failure.

One of the bigger differences between Oracle RAC and OPS is the presence of Cache Fusion technology. In OPS, a request for data between nodes required the data to be written to disk first, and then the requesting node could read that data. With Cache Fusion, data is passed along a high-speed interconnect using a sophisticated locking algorithm.

Not all clustering solutions use shared storage. Some vendors use an approach known as a federated cluster, in which data is spread across several machines rather than shared by all. With Oracle RAC 10g, however, multiple nodes use the same set of disks for storing data. With Oracle RAC, the data files, redo log files, control files, and archived log files reside on shared storage on raw-disk devices, a NAS, a SAN, ASM, or on a clustered file system. Oracle's approach to clustering leverages the collective processing power of all the nodes in the cluster and at the same time provides failover security.

Pre-configured Oracle RAC 10g solutions are available from vendors such as Dell, IBM, and HP for production environments. This article, however, focuses on putting together your own Oracle RAC 10g environment for development and testing by using Linux servers and a low-cost shared disk solution: FireWire.

For more background about Oracle RAC, visit the Oracle RAC Product Center on OTN.


3. Shared-Storage Overview

Fibre Channel is one of the most popular solutions for shared storage. As I mentioned previously, Fibre Channel is a high-speed serial-transfer interface used to connect systems and storage devices in either point-to-point or switched topologies. Protocols supported by Fibre Channel include SCSI and IP.

Fibre Channel configurations can support as many as 127 nodes and have a throughput of up to 2.12 gigabits per second. Fibre Channel, however, is very expensive; the switch alone can start at US$1,000 and high-end drives can reach prices of US$300. Overall, a typical Fibre Channel setup (including cards for the servers) costs roughly US$10,000.

A less expensive alternative to Fibre Channel is SCSI. SCSI technology provides acceptable performance for shared storage, but for administrators and developers who are used to GPL-based Linux prices, even SCSI can come in over budget at around US$2,000 to US$5,000 for a two-node cluster.

Another popular solution is the Sun NFS (Network File System) found on a NAS. It can be used for shared storage but only if you are using a network appliance or something similar. Specifically, you need servers that guarantee direct I/O over NFS, TCP as the transport protocol, and read/write block sizes of 32K.


4. FireWire Technology

Developed by Apple Computer and Texas Instruments, FireWire is a cross-platform implementation of a high-speed serial data bus. With its high bandwidth, long distances (up to 100 meters in length), and high-powered bus, FireWire is used in applications such as digital video (DV), professional audio, hard drives, high-end digital still cameras, and home entertainment devices. Today, FireWire operates at transfer rates of up to 800 megabits per second, while next-generation FireWire calls for a theoretical bit rate of 1600 Mbps and then up to a staggering 3200 Mbps. That's 3.2 gigabits per second. This will make FireWire indispensable for transferring massive data files and for even the most demanding video applications, such as working with uncompressed high-definition (HD) video or multiple standard-definition (SD) video streams.

The following chart compares the speeds of various types of disk interfaces. For each interface, I provide the maximum transfer rate in kilobits (Kb), kilobytes (KB), megabits (Mb), megabytes (MB), gigabits (Gb), and gigabytes (GB) per second. As you can see, the capabilities of IEEE1394 compare very favorably with other disk interface and network technologies available today.

Disk Interface Speed

Interface                                    | Kb  | KB     | Mb    | MB    | Gb     | GB
Serial                                       | 115 | 14.375 | 0.115 | 0.014 |        |
Parallel (standard)                          | 920 | 115    | 0.92  | 0.115 |        |
10Base-T Ethernet                            |     |        | 10    | 1.25  |        |
IEEE 802.11b wireless Wi-Fi (2.4 GHz band)   |     |        | 11    | 1.375 |        |
USB 1.1                                      |     |        | 12    | 1.5   |        |
Parallel (ECP/EPP)                           |     |        | 24    | 3     |        |
SCSI-1                                       |     |        | 40    | 5     |        |
IEEE 802.11g wireless WLAN (2.4 GHz band)    |     |        | 54    | 6.75  |        |
SCSI-2 (Fast SCSI / Fast Narrow SCSI)        |     |        | 80    | 10    |        |
100Base-T Ethernet (Fast Ethernet)           |     |        | 100   | 12.5  |        |
ATA/100 (parallel)                           |     |        | 100   | 12.5  |        |
IDE                                          |     |        | 133.6 | 16.7  |        |
Fast Wide SCSI (Wide SCSI)                   |     |        | 160   | 20    |        |
Ultra SCSI (SCSI-3 / Fast-20 / Ultra Narrow) |     |        | 160   | 20    |        |
Ultra IDE                                    |     |        | 264   | 33    |        |
Wide Ultra SCSI (Fast Wide 20)               |     |        | 320   | 40    |        |
Ultra2 SCSI                                  |     |        | 320   | 40    |        |
FireWire 400 - (IEEE1394a)                   |     |        | 400   | 50    |        |
USB 2.0                                      |     |        | 480   | 60    |        |
Wide Ultra2 SCSI                             |     |        | 640   | 80    |        |
Ultra3 SCSI                                  |     |        | 640   | 80    |        |
FireWire 800 - (IEEE1394b)                   |     |        | 800   | 100   |        |
Gigabit Ethernet                             |     |        | 1000  | 125   | 1      |
PCI - (33 MHz / 32-bit)                      |     |        | 1064  | 133   | 1.064  |
Serial ATA I - (SATA I)                      |     |        | 1200  | 150   | 1.2    |
Wide Ultra3 SCSI                             |     |        | 1280  | 160   | 1.28   |
Ultra160 SCSI                                |     |        | 1280  | 160   | 1.28   |
PCI - (33 MHz / 64-bit)                      |     |        | 2128  | 266   | 2.128  |
PCI - (66 MHz / 32-bit)                      |     |        | 2128  | 266   | 2.128  |
AGP 1x - (66 MHz / 32-bit)                   |     |        | 2128  | 266   | 2.128  |
Serial ATA II - (SATA II)                    |     |        | 2400  | 300   | 2.4    |
Ultra320 SCSI                                |     |        | 2560  | 320   | 2.56   |
FC-AL Fibre Channel                          |     |        | 3200  | 400   | 3.2    |
PCI-Express x1 - (bidirectional)             |     |        | 4000  | 500   | 4      |
PCI - (66 MHz / 64-bit)                      |     |        | 4256  | 532   | 4.256  |
AGP 2x - (133 MHz / 32-bit)                  |     |        | 4264  | 533   | 4.264  |
Serial ATA III - (SATA III)                  |     |        | 4800  | 600   | 4.8    |
PCI-X - (100 MHz / 64-bit)                   |     |        | 6400  | 800   | 6.4    |
PCI-X - (133 MHz / 64-bit)                   |     |        |       | 1064  | 8.512  | 1
AGP 4x - (266 MHz / 32-bit)                  |     |        |       | 1066  | 8.528  | 1
10G Ethernet - (IEEE 802.3ae)                |     |        |       | 1250  | 10     | 1.25
PCI-Express x4 - (bidirectional)             |     |        |       | 2000  | 16     | 2
AGP 8x - (533 MHz / 32-bit)                  |     |        |       | 2133  | 17.064 | 2.1
PCI-Express x8 - (bidirectional)             |     |        |       | 4000  | 32     | 4
PCI-Express x16 - (bidirectional)            |     |        |       | 8000  | 64     | 8
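
The columns in the chart are related by simple unit conversions: one byte is eight bits, and the chart uses decimal prefixes (1 Gb = 1000 Mb). As a minimal sketch, the conversions can be reproduced from the shell. The function names mbit_to_mbyte and mbit_to_gbit are made up for this illustration.

```shell
#!/bin/sh
# Convert a transfer rate from megabits per second (Mb/s) to megabytes
# per second (MB/s). One byte is 8 bits, so MB/s = Mb/s / 8.
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 8 }'
}

# Convert megabits per second to gigabits per second
# (decimal prefixes, as in the chart: 1 Gb = 1000 Mb).
mbit_to_gbit() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 1000 }'
}

mbit_to_mbyte 400     # FireWire 400
mbit_to_mbyte 800     # FireWire 800
mbit_to_gbit 3200     # FC-AL Fibre Channel
```

These are theoretical interface maximums; sustained throughput to a real disk will be lower.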



5. Hardware & Costs

The hardware used to build our example Oracle RAC 10g environment consists of two Linux servers and components that can be purchased at any local computer store or over the Internet.

Server 1 - (linux1)
Dimension 2400 Series
  • Intel Pentium 4 Processor at 2.80GHz
  • 1GB DDR SDRAM (at 333MHz)
  • 40GB 7200 RPM Internal Hard Drive
  • Integrated Intel 3D AGP Graphics
  • Integrated 10/100 Ethernet
  • CDROM (48X Max Variable)
  • 3.5" Floppy
  • No monitor (already had one)
  • USB Mouse and Keyboard
  • US$620
    1 - Ethernet LAN Card

      Each Linux server should contain two NIC adapters. The Dell Dimension includes an integrated 10/100 Ethernet adapter that will be used to connect to the public network. The second NIC adapter will be used for the private interconnect.
    US$20
    1 - FireWire Card

      The following is a list of FireWire I/O cards that contain the correct chipset, allow for multiple logins, and should work with this article (no guarantees however):

         FireWire 400

         FireWire 800

    Warning: I was unable to obtain concurrent logins to the FireWire drive when using the LaCie FireWire 800 PCI Card - (107755) in both nodes. To resolve this, I used two different FireWire PCI cards: the LaCie FireWire 800 PCI Card - (107755) in one node and the SIIG FireWire 800 PCI-32T host adapter - (NN-830112) in the second node. Please refer to the section Troubleshooting Concurrent Logins to the FireWire Drive for further details.

    FireWire I/O cards with chipsets made by Texas Instruments (TI) or VIA Technologies (VIA) are known to work.
    US$30
    Server 2 - (linux2)
    Dimension 2400 Series
  • Intel Pentium 4 Processor at 2.80GHz
  • 1GB DDR SDRAM (at 333MHz)
  • 40GB 7200 RPM Internal Hard Drive
  • Integrated Intel 3D AGP Graphics
  • Integrated 10/100 Ethernet
  • CDROM (48X Max Variable)
  • 3.5" Floppy
  • No monitor (already had one)
  • USB Mouse and Keyboard
  • US$620
    1 - Ethernet LAN Card

      As with linux1, the integrated 10/100 Ethernet adapter connects to the public network and the second NIC adapter is used for the private interconnect.
    US$20
    1 - FireWire Card

      The same FireWire card requirements, concurrent-login warning, and chipset notes listed for linux1 apply here.
    US$30
    Miscellaneous Components
    FireWire Hard Drive
      The following is a list of FireWire drives (and enclosures) that contain the correct chipset, allow for multiple logins, and should work with this article (no guarantees however):
    FireWire 400

         FireWire 800

         Ensure that the FireWire drive that you purchase supports multiple logins. If the drive has a chipset that does not allow for concurrent access from more than one server, the disk and its partitions can only be seen by one server at a time. Disks with the Oxford 911 chipset (FireWire 400), Oxford 912 chipset (FireWire 800), or Oxford 922 chipset (FireWire 800) are known to work. Note that the Oxford 912 chipset is newer and faster than Oxford 922. Here are the details about the disk that I purchased for this test:
    Vendor: Maxtor
    Model: OneTouch II
    Mfg. Part No. or KIT No.: E01G300
    Capacity: 300 GB
    Cache Buffer: 16 MB
    Spin Rate: 7200 RPM
    Interface Transfer Rate: 400 Mbits/s
    "Combo" Interface: IEEE 1394 / USB 2.0 and USB 1.1 compatible
    US$280
    1 - Extra FireWire Cable

    Each node in the RAC configuration will need to connect to the shared storage device (the FireWire hard drive). The FireWire hard drive will come supplied with one FireWire cable. You will need to purchase one additional FireWire cable to connect the second node to the shared storage. Select the appropriate FireWire cable that is compatible with the data transmission speed (FireWire 400 / FireWire 800) and the desired cable length.

         FireWire 400

         FireWire 800

    US$20
    1 - Ethernet hub or switch

    Used for the interconnect between linux1-priv and linux2-priv. A question I often receive is about substituting the Ethernet switch (used for interconnect linux1-priv / linux2-priv) with a crossover CAT5 cable. I would not recommend this. I have found that when using a crossover CAT5 cable for the interconnect, whenever I took one of the PCs down, the other PC would detect a "cable unplugged" error, and thus the Cache Fusion network would become unavailable.
    US$25
    4 - Network Cables (US$5 each)
    US$20
    Total     US$1,685  
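
The total can be checked with a little shell arithmetic. This is only a sketch of the sum using the prices listed above; the variable names are made up for the example.

```shell
#!/bin/sh
# Per-node costs (US$): Dell Dimension 2400, second NIC, FireWire card.
server=620
nic=20
fw_card=30
per_node=$(( server + nic + fw_card ))

# Shared components (US$): FireWire drive, extra FireWire cable,
# Ethernet hub or switch, and four network cables at US$5 each.
fw_drive=280
fw_cable=20
eth_switch=25
net_cables=$(( 4 * 5 ))

total=$(( 2 * per_node + fw_drive + fw_cable + eth_switch + net_cables ))
echo "Per node: US\$${per_node}"
echo "Total:    US\$${total}"
```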

    Note that the Maxtor OneTouch external drive does have two IEEE1394 (FireWire) ports, although it may not appear so at first glance. This is also true for the other external hard drives I have listed above.

    Now that we know the hardware that will be used in this example, let's take a conceptual look at what the environment looks like:



    Figure 1 Architecture

    As we start to go into the details of the installation, keep in mind that most tasks will need to be performed on both servers. I will indicate at the beginning of each section whether the task(s) should be performed on both nodes.


6. Install the Linux Operating System

    This section provides a summary of the screens used to install the Linux operating system. This guide is designed to work with the Red Hat Enterprise Linux 4 AS/ES (RHEL4) operating environment. An alternative, and what I used for this article, is CentOS 4.2, a free and stable version of the RHEL4 operating environment.

    For more detailed installation instructions, it is possible to use the manuals from Red Hat Linux. I would suggest, however, that the instructions I have provided below be used for this configuration.

    Before installing the Linux operating system on both nodes, you should have the FireWire card and the two NIC interfaces (cards) installed.

    Also, before starting the installation, ensure that the FireWire drive (our shared storage drive) is NOT connected to either of the two servers. You may also choose to connect both servers to the FireWire drive and simply turn the power off to the drive. Although none of this is mandatory, it is how I will be performing the installation and configuration for this article.

    Download the following ISO images for CentOS 4.2:

    After downloading and burning the CentOS images (ISO files) to CD, insert CentOS Disk #1 into the first server (linux1 in this example), power it on, and answer the installation screen prompts as noted below. After completing the Linux installation on the first node, perform the same Linux installation on the second node, substituting the node name linux2 for linux1 and using the different IP addresses where appropriate.

    Boot Screen
    The first screen is the CentOS Enterprise Linux boot screen. At the boot: prompt, hit [Enter] to start the installation process.

    Media Test
    When asked to test the CD media, tab over to [Skip] and hit [Enter]. If there were any errors, the media burning software would have warned us. After several seconds, the installer should then detect the video card, monitor, and mouse. The installer then goes into GUI mode.

    Welcome to CentOS Enterprise Linux
    At the welcome screen, click [Next] to continue.

    Language / Keyboard Selection
    The next two screens prompt you for the Language and Keyboard settings. Make the appropriate selections for your configuration.

    Installation Type
    Choose the [Custom] option and click [Next] to continue.

    Disk Partitioning Setup
    Select [Automatically partition] and click [Next] to continue.

    If there were a previous installation of Linux on this machine, the next screen will ask if you want to "remove" or "keep" old partitions. Select the option to [Remove all partitions on this system]. Also, ensure that the [hda] drive is selected for this installation. I also keep the checkbox [Review (and modify if needed) the partitions created] selected. Click [Next] to continue.

    You will then be prompted with a dialog window asking if you really want to remove all partitions. Click [Yes] to acknowledge this warning.

    Partitioning
    The installer will then allow you to view (and modify if needed) the disk partitions it automatically selected. In almost all cases, the installer will choose 100MB for /boot, double the amount of RAM for swap, and the rest for the root (/) partition. I like to have a minimum of 1GB for swap. For the purpose of this install, I will accept all automatically preferred sizes (including 2GB for swap, since I have 1GB of RAM installed).

    Starting with RHEL 4, the installer will create the same disk configuration as just noted but will create it using the Logical Volume Manager (LVM). For example, it will partition the first hard drive (/dev/hda for my configuration) into two partitions: one for the /boot partition (/dev/hda1) and the remainder of the disk (/dev/hda2) dedicated to an LVM volume group named VolGroup00. The LVM volume group (VolGroup00) is then divided into two logical volumes: one for the root filesystem (/) and another for swap. I basically check that it created at least 1GB of swap. Since I have 1GB of RAM installed, the installer created 2GB of swap. That said, I just accept the default disk layout.
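
Once the system is installed and booted, the swap size can be confirmed from the command line. The following is a minimal sketch, assuming a Linux system that exposes /proc/meminfo. The function name swap_mb and the 1024 MB threshold (my 1GB minimum) are illustrative choices, not part of the installer.

```shell
#!/bin/sh
# Print the configured swap size in MB. SwapTotal in /proc/meminfo is
# reported in kB. An alternate file may be passed as $1 for testing.
swap_mb() {
    awk '/^SwapTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

swap=$(swap_mb 2>/dev/null)
if [ "${swap:-0}" -ge 1024 ]; then
    echo "swap OK: ${swap} MB"
else
    echo "swap below 1GB: ${swap:-0} MB"
fi
```

Because the kernel reports slightly less than the nominal size, a 2GB swap volume may show up as a little under 2048 MB.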

    Boot Loader Configuration
    The installer will use the GRUB boot loader by default. To use the GRUB boot loader, accept all default values and click [Next] to continue.

    Network Configuration
    I made sure to install both NIC interfaces (cards) in each of the Linux machines before starting the operating system installation. This screen should have successfully detected each of the network devices.

    First, make sure that each of the network devices is checked [Active on boot]. The installer may choose not to activate eth1.

    Second, [Edit] both eth0 and eth1 as follows. You may choose to use different IP addresses for both eth0 and eth1 and that is OK. If possible, try to put eth1 (the interconnect) on a different subnet than eth0 (the public network):

    eth0:
    - Uncheck the option [Configure using DHCP]
    - Leave the [Activate on boot] checked
    - IP Address: 192.168.1.100
    - Netmask: 255.255.255.0

    eth1:
    - Uncheck the option [Configure using DHCP]
    - Leave the [Activate on boot] checked
    - IP Address: 192.168.2.100
    - Netmask: 255.255.255.0

    Continue by setting your hostname manually. I used "linux1" for the first node and "linux2" for the second one. Finish this dialog off by supplying your gateway and DNS servers.
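
On RHEL4-family systems these settings end up in per-interface files under /etc/sysconfig/network-scripts (ifcfg-eth0 and ifcfg-eth1). As a rough sketch of what a static configuration looks like for linux1, the helper below prints the relevant keys. The function name ifcfg_static is hypothetical, and the installer may write additional keys (such as HWADDR) that are omitted here.

```shell
#!/bin/sh
# Print an ifcfg-style static configuration for one interface.
# Arguments: device name, IP address, netmask.
ifcfg_static() {
    cat <<EOF
DEVICE=$1
BOOTPROTO=static
IPADDR=$2
NETMASK=$3
ONBOOT=yes
EOF
}

# Values used for linux1 in this article.
ifcfg_static eth0 192.168.1.100 255.255.255.0   # public network
ifcfg_static eth1 192.168.2.100 255.255.255.0   # private interconnect
```

The output is only printed to the terminal; compare it against the files the installer created rather than overwriting them.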

    Firewall
    On this screen, make sure to select [No firewall] and click [Next] to continue. You may be prompted with a warning dialog about not setting the firewall. If this occurs, simply hit [Proceed] to continue.

    Additional Language Support/Time Zone
    The next two screens allow you to select additional language support and time zone information. In almost all cases, you can accept the defaults.

    Set Root Password
    Select a root password and click [Next] to continue.

    Package Group Selection
    Scroll down to the bottom of this screen and select [Everything] under the "Miscellaneous" section. Click [Next] to continue.

    Please note that the installation of Oracle does not require all Linux packages to be installed. My decision to install all packages was for the sake of brevity. Please see Section 15 ("Check RPM Packages for Oracle 10g Release 2") for a more detailed look at the critical packages required for a successful Oracle installation.

    Note that with some RHEL4 distributions, you will not get the "Package Group Selection" screen by default. There, you are asked to simply "Install default software packages" or "Customize software packages to be installed". Select the option to "Customize software packages to be installed" and click [Next] to continue. This will then bring up the "Package Group Selection" screen. Now, scroll down to the bottom of this screen and select [Everything] under the "Miscellaneous" section. Click [Next] to continue.

    About to Install
    This screen is basically a confirmation screen. Click [Next] to start the installation. During the installation process, you will be asked to switch disks to Disk #2, Disk #3, and then Disk #4. Click [Continue] to start the installation process.

    Note that with CentOS 4.2, the installer will ask to switch to Disk #2, Disk #3, Disk #4, Disk #1, and then back to Disk #4.

    Graphical Interface (X) Configuration
    With most RHEL4 distributions (not the case with CentOS 4.2), when the installation is complete, the installer will attempt to detect your video hardware. Ensure that the installer has detected and selected the correct video hardware (graphics card and monitor) to properly use the X Windows server. You will continue with the X configuration in the next several screens.

    Congratulations
    And that's it. You have successfully installed CentOS Enterprise Linux on the first node (linux1). The installer will eject the CD from the CD-ROM drive. Take out the CD and click [Exit] to reboot the system.

    When the system boots into Linux for the first time, it will prompt you with another Welcome screen. The wizard that follows allows you to configure the date and time, add any additional users, test the sound card, and install any additional CDs. The only screen I care about is the time and date (and if you are using CentOS 4.x, the monitor/display settings). As for the others, simply run through them as there is nothing additional that needs to be installed (at this point anyway!). If everything was successful, you should now be presented with the login screen.

    Perform the same installation on the second node
    After completing the Linux installation on the first node, repeat the above steps for the second node (linux2). When configuring the machine name and networking, be sure to configure the proper values. For my installation, this is what I configured for linux2:

    First, make sure that each of the network devices is checked [Active on boot]. The installer may choose not to activate eth1.

    Second, [Edit] both eth0 and eth1 as follows:

    eth0:
    - Uncheck the option [Configure using DHCP]
    - Leave the [Activate on boot] checked
    - IP Address: 192.168.1.101
    - Netmask: 255.255.255.0

    eth1:
    - Uncheck the option [Configure using DHCP]
    - Leave the [Activate on boot] checked
    - IP Address: 192.168.2.101
    - Netmask: 255.255.255.0

    Continue by setting your hostname manually. I used "linux2" for the second node. Finish this dialog off by supplying your gateway and DNS servers.
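
With both nodes configured, it is worth confirming that the public and private addresses really do fall on different subnets, as suggested earlier in this section. The sketch below computes the network address by ANDing each octet of the IP address with the netmask. The function name network_of is made up for this example.

```shell
#!/bin/sh
# Print the network address for a dotted-quad IP address and netmask
# by ANDing each octet of the address with the matching mask octet.
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both arguments into eight octets
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Addresses used for linux2 in this article.
public=$(network_of 192.168.1.101 255.255.255.0)
private=$(network_of 192.168.2.101 255.255.255.0)

if [ "$public" != "$private" ]; then
    echo "eth0 ($public) and eth1 ($private) are on different subnets"
else
    echo "eth0 and eth1 share subnet $public"
fi
```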


7. Network Configuration

    Perform the following network configuration on all nodes in the cluster!

    Note: Although we configured several of the network settings during the Linux installation, it is important to not skip this section as it contains critical steps that are required for the RAC environment.

    Introduction to Network Settings

    During the Linux O/S install you already configured the IP address and host name for each of the nodes. You now need to configure the /etc/hosts file as well as adjust several of the network settings for the interconnect.

    Each node should have one static IP address for the public network and one static IP address for the private cluster interconnect. The private interconnect should only be used by Oracle to transfer Cluster Manager and Cache Fusion related data. Although it is possible to use the public network for the interconnect, this is not recommended as it may cause degraded database performance (reducing the amount of bandwidth for Cache Fusion and Cluster Manager traffic). For a production RAC implementation, the interconnect should be at least Gigabit Ethernet and be used only by Oracle.

    Configuring Public and Private Network

    In our two-node example, you need to configure the network on both nodes for access to the public network as well as their private interconnect.

    The easiest way to configure network settings in RHEL4 is with the Network Configuration program. This application can be started from the command-line as the root user account as follows:

    # su -
    # /usr/bin/system-config-network &
    Do not use DHCP naming for the public IP address or the interconnects; you need static IP addresses!

    Using the Network Configuration application, you need to configure both NIC devices as well as the /etc/hosts file. Both of these tasks can be completed using the Network Configuration GUI. Notice that the /etc/hosts settings are the same for both nodes.

    Our example configuration will use the following settings:

    Server 1 (linux1)
    Device  IP Address     Subnet         Gateway      Purpose
    eth0    192.168.1.100  255.255.255.0  192.168.1.1  Connects linux1 to the public network
    eth1    192.168.2.100  255.255.255.0               Connects linux1 (interconnect) to linux2 (linux2-priv)
    /etc/hosts
    127.0.0.1 localhost loopback

    # Public Network - (eth0)
    192.168.1.100 linux1
    192.168.1.101 linux2

    # Private Interconnect - (eth1)
    192.168.2.100 linux1-priv
    192.168.2.101 linux2-priv

    # Public Virtual IP (VIP) addresses for - (eth0)
    192.168.1.200 linux1-vip
    192.168.1.201 linux2-vip

    Server 2 (linux2)
    Device  IP Address     Subnet         Gateway      Purpose
    eth0    192.168.1.101  255.255.255.0  192.168.1.1  Connects linux2 to the public network
    eth1    192.168.2.101  255.255.255.0               Connects linux2 (interconnect) to linux1 (linux1-priv)
    /etc/hosts
    127.0.0.1 localhost loopback

    # Public Network - (eth0)
    192.168.1.100 linux1
    192.168.1.101 linux2

    # Private Interconnect - (eth1)
    192.168.2.100 linux1-priv
    192.168.2.101 linux2-priv

    # Public Virtual IP (VIP) addresses for - (eth0)
    192.168.1.200 linux1-vip
    192.168.1.201 linux2-vip

    Note that the virtual IP addresses only need to be defined in the /etc/hosts file (or your DNS) for both nodes. The public virtual IP addresses will be configured automatically by Oracle when you run the Oracle Universal Installer, which starts Oracle's Virtual Internet Protocol Configuration Assistant (VIPCA). All virtual IP addresses will be activated when the srvctl start nodeapps -n <node_name> command is run. This is the Host Name/IP Address that will be configured in the client(s) tnsnames.ora file (more details later).
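
    As a quick sanity check on the /etc/hosts settings listed above, the following sketch reports any expected name that is missing from a hosts file. The function name missing_hosts is my own and not part of any Oracle or Red Hat tool; it only inspects the file and does not test name resolution through DNS.

```shell
# Sketch: report which expected host names are missing from a hosts file.
# Usage: missing_hosts FILE NAME...   (prints one missing name per line)
# "missing_hosts" is a hypothetical helper name used only for this example.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Drop comments, then look for an exact match in any column after the IP address.
        if ! sed 's/#.*//' "$hosts_file" |
             awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                               END { exit !found }'; then
            echo "$name"
        fi
    done
}

# Example for the two-node configuration in this article:
# missing_hosts /etc/hosts linux1 linux2 linux1-priv linux2-priv linux1-vip linux2-vip
```

    No output means every name was found. Run it on both nodes, since the /etc/hosts settings are expected to be the same on each.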

    In the screenshots below, only node 1 (linux1) is shown. Be sure to make all the proper network settings to both nodes.



    Figure 2 Network Configuration Screen, Node 1 (linux1)



    Figure 3 Ethernet Device Screen, eth0 (linux1)



    Figure 4 Ethernet Device Screen, eth1 (linux1)



    Figure 5: Network Configuration Screen, /etc/hosts (linux1)

    When the network is configured, you can use the ifconfig command to verify everything is working. The following example is from linux1:

    $ /sbin/ifconfig -a
    eth0 Link encap:Ethernet HWaddr 00:0D:56:FC:39:EC
    inet addr:192.168.1.100 Bcast:192.168.1.255 Mask:255.255.255.0
    inet6 addr: fe80::20d:56ff:fefc:39ec/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:835 errors:0 dropped:0 overruns:0 frame:0
    TX packets:1983 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:705714 (689.1 KiB) TX bytes:176892 (172.7 KiB)
    Interrupt:3
    eth1 Link encap:Ethernet HWaddr 00:0C:41:E8:05:37
    inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
    inet6 addr: fe80::20c:41ff:fee8:537/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:0 (0.0 b) TX bytes:546 (546.0 b)
    Interrupt:11 Base address:0xe400
    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:5110 errors:0 dropped:0 overruns:0 frame:0
    TX packets:5110 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:8276758 (7.8 MiB) TX bytes:8276758 (7.8 MiB)
    sit0 Link encap:IPv6-in-IPv4
    NOARP MTU:1480 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)

    About Virtual IP

    Why is there a Virtual IP (VIP) in 10g? Why does it just return a dead connection when its primary node fails?

    It's all about availability of the application. When a node fails, the VIP associated with it is supposed to be automatically failed over to some other node. When this occurs, two things happen.

    1. The new node re-arps the world indicating a new MAC address for the address. For directly connected clients, this usually causes them to see errors on their connections to the old address.
    2. Subsequent packets sent to the VIP go to the new node, which will send error RST packets back to the clients. This results in the clients getting errors immediately.

    This means that when the client issues SQL to the node that is now down, or traverses the address list while connecting, rather than waiting on a very long TCP/IP time-out (~10 minutes), the client receives a TCP reset. In the case of SQL, this is ORA-3113. In the case of connect, the next address in tnsnames is used.

    Going one step further is making use of Transparent Application Failover (TAF). With TAF successfully configured, it is possible to avoid ORA-3113 errors altogether! TAF will be discussed in more detail in Section 28 ("Transparent Application Failover - (TAF)").

    Without using VIPs, clients connected to a node that died will often wait a 10-minute TCP timeout period before getting an error. As a result, you don't really have a good HA solution without using VIPs (Source - Metalink Note 220970.1).

    Confirm the RAC Node Name is Not Listed in Loopback Address

    Ensure that the node names (linux1 or linux2) are not included for the loopback address in the /etc/hosts file. If the machine name is listed in the loopback address entry as below:

    127.0.0.1 linux1 localhost.localdomain localhost
    it will need to be removed as shown below:
    127.0.0.1 localhost.localdomain localhost

    If the RAC node name is listed for the loopback address, you will receive the following error during the RAC installation:

    ORA-00603: ORACLE server session terminated by fatal error
    or
    ORA-29702: error occurred in Cluster Group Service operation
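
    The loopback check described above can be scripted. This is a minimal sketch, assuming the node name appears in /etc/hosts exactly as passed in (short name, not fully qualified); node_on_loopback is a hypothetical helper name.

```shell
# Sketch: succeed (exit 0) if NODE appears on a 127.0.0.1 line of a hosts file.
# Usage: node_on_loopback FILE NODE
# "node_on_loopback" is a hypothetical helper name used only for this example.
node_on_loopback() {
    sed 's/#.*//' "$1" |
        awk -v n="$2" '$1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == n) bad = 1 }
                       END { exit !bad }'
}

# Example:
# if node_on_loopback /etc/hosts "$(hostname -s)"; then
#     echo "Remove the node name from the 127.0.0.1 entry" >&2
# fi
```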

    Adjusting Network Settings

    With Oracle 9.2.0.1 and later, Oracle makes use of UDP as the default protocol on Linux for inter-process communication (IPC), such as Cache Fusion and Cluster Manager buffer transfers between instances within the RAC cluster.

    Oracle strongly suggests adjusting the default and maximum send buffer size (SO_SNDBUF socket option) to 256KB, and the default and maximum receive buffer size (SO_RCVBUF socket option) to 256KB.

    The receive buffers are used by TCP and UDP to hold received data until it is read by the application. With TCP, the receive buffer cannot overflow because the peer is not allowed to send data beyond the buffer size window. UDP has no such flow control: datagrams that do not fit in the socket receive buffer are discarded, so a fast sender can overwhelm the receiver.

    The default and maximum window size can be changed in the /proc file system without reboot:

    # su - root
    # sysctl -w net.core.rmem_default=262144
    net.core.rmem_default = 262144
    # sysctl -w net.core.wmem_default=262144
    net.core.wmem_default = 262144
    # sysctl -w net.core.rmem_max=262144
    net.core.rmem_max = 262144
    # sysctl -w net.core.wmem_max=262144
    net.core.wmem_max = 262144

    The above commands made the changes to the already running OS. You should now make the above changes permanent (for each reboot) by adding the following lines to the /etc/sysctl.conf file for each node in your RAC cluster:

    # Default setting in bytes of the socket receive buffer
    net.core.rmem_default=262144

    # Default setting in bytes of the socket send buffer
    net.core.wmem_default=262144

    # Maximum socket receive buffer size which may be set by using
    # the SO_RCVBUF socket option
    net.core.rmem_max=262144

    # Maximum socket send buffer size which may be set by using
    # the SO_SNDBUF socket option
    net.core.wmem_max=262144
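
    To confirm that the four entries above made it into /etc/sysctl.conf on each node, something like the following sketch can be used. The function name check_buffer_settings is an assumption of mine; it only reads the file and does not query the running kernel (use sysctl for that).

```shell
# Sketch: verify that a sysctl.conf-style file sets the four socket buffer
# parameters from this section to the expected value (262144 by default).
# Usage: check_buffer_settings FILE [VALUE]
# "check_buffer_settings" is a hypothetical helper name used only for this example.
check_buffer_settings() {
    conf=$1
    want=${2:-262144}
    status=0
    for key in net.core.rmem_default net.core.wmem_default \
               net.core.rmem_max net.core.wmem_max; do
        # Strip comments and whitespace, then keep the last assignment for the key.
        got=$(sed 's/#.*//' "$conf" | tr -d ' \t' |
              awk -F= -v k="$key" '$1 == k { v = $2 } END { print v }')
        if [ "$got" != "$want" ]; then
            echo "$key: expected $want, found '${got:-unset}'"
            status=1
        fi
    done
    return $status
}

# Example:
# check_buffer_settings /etc/sysctl.conf && echo "socket buffer settings OK"
```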

    Check and turn off UDP ICMP rejections

    During the Linux installation process, I chose not to configure the firewall, an option the installer selects by default. This has burned me several times, so I like to double-check that the firewall is not configured and that UDP ICMP filtering is turned off.

    If UDP ICMP is blocked or rejected by the firewall, the Oracle Clusterware software will crash after several minutes of running. When the Oracle Clusterware process fails, you will have something similar to the following in the <machine_name>_evmocr.log file:

    08/29/2005 22:17:19
    oac_init:2: Could not connect to server, clsc retcode = 9
    08/29/2005 22:17:19
    a_init:12!: Client init unsuccessful : [32]
    ibctx:1:ERROR: INVALID FORMAT
    proprinit:problem reading the bootblock or superbloc 22
    When experiencing this type of error, the solution was to remove the udp ICMP (iptables) rejection rule - or to simply have the firewall option turned off. The Oracle Clusterware software will then start to operate normally and not crash. The following commands should be executed as the root user account:

    1. Check to ensure that the firewall option is turned off. If the firewall option is stopped (like it is in my example below) you do not have to proceed with the following steps.
      # /etc/rc.d/init.d/iptables status
      Firewall is stopped.

    2. If the firewall option is operating you will need to first manually disable UDP ICMP rejections:
      # /etc/rc.d/init.d/iptables stop
      Flushing firewall rules: [ OK ]
      Setting chains to policy ACCEPT: filter [ OK ]
      Unloading iptables modules: [ OK ]

    3. Then, to turn UDP ICMP rejections off for next server reboot (which should always be turned off):
      # chkconfig iptables off
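
    To double-check that step 3 took effect, the output of chkconfig --list iptables can be scanned for any runlevel still marked "on". The sketch below assumes the usual RHEL4 output layout (service name followed by runlevel:state pairs); enabled_runlevels is a hypothetical helper name.

```shell
# Sketch: read `chkconfig --list iptables` output on stdin and print the
# runlevels in which the service is still enabled (nothing if fully off).
# "enabled_runlevels" is a hypothetical helper name used only for this example.
enabled_runlevels() {
    awk '{ for (i = 2; i <= NF; i++) {
               split($i, f, ":")
               if (f[2] == "on") printf "%s ", f[1]
           } }
         END { print "" }' | sed 's/ *$//'
}

# Example (as root):
# /sbin/chkconfig --list iptables | enabled_runlevels
```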


    8. Obtain & Install FireWire Modules

    Perform the following FireWire module install and configuration on all nodes in the cluster!

    The next step is to obtain and install the FireWire modules that support the use of IEEE1394 devices with multiple logins.

    In previous versions of this guide, it was required to download and install both a new Linux kernel, (e.g. the OTN-supplied 2.6.9-11.0.0.10.3.EL #1 Linux kernel), and the supporting FireWire modules. As of November 2005, oss.oracle.com now provides pre-compiled FireWire modules for the 2.6.9-22.EL and 2.6.9-22.0.1.EL Linux kernels. Installing a new Linux kernel is no longer required. We will only need to install and configure the supporting FireWire modules!

    I am using the term "multiple logins" a bit loosely in this article. The concept of "multiple logins" is strictly not allowed in the IEEE1394 specification, as it is only a point to point protocol. The term "multiple logins", is often confused with "concurrent sessions", which is supported in the IEEE1394 specification. It simply means that the device allows multiple outstanding requests simultaneously (similar to the SCSI protocol). Therefore multiple hosts (initiators) on a single bus are prohibited according to IEEE1394.

    In a previous version of this guide, I included the steps to download a patched version of the Linux kernel (the C source code) and then compile it. Thanks to Oracle's Linux Projects Development Team, this is no longer a requirement. Oracle now provides a pre-compiled module that supports the sharing of FireWire drives. The instructions for downloading and installing the supporting FireWire modules are included in this section. Before going into the details of how to perform these actions, however, let's take a moment to discuss the changes that are required in the new FireWire kernel driver.

    While FireWire drivers already exist for Linux, they often do not support shared storage. Typically when you log on to an OS, the OS associates the driver to a specific drive for that machine alone. This implementation simply will not work for our RAC configuration. The shared storage (our FireWire hard drive) needs to be accessed by more than one node. You need to enable the FireWire driver to provide nonexclusive access to the drive so that multiple servers (the nodes that comprise the cluster) will be able to access the same storage. This goal is accomplished by removing the bit mask that identifies the machine during login in the source code, resulting in nonexclusive access to the FireWire hard drive. All other nodes in the cluster log in to the same drive during their logon session, using the same modified driver, so they too have nonexclusive access to the drive.

    Our implementation describes a dual node cluster (each with a single processor), each server running CentOS 4.2 Enterprise Linux. Keep in mind that the process of installing the supporting FireWire modules will need to be performed on both Linux nodes. CentOS Enterprise Linux 4.2 includes kernel 2.6.9-22.EL #1. Knowing this, we now need to download the matching FireWire module from: Oracle Technet Supplied FireWire Modules

    Download one of the following files for the supporting FireWire Modules:

    oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm - (for single processor)

    or

    oracle-firewire-modules-2.6.9-22.ELsmp-1286-1.i686.rpm - (for multiple processors)

    Install the supporting FireWire modules, as root:

    Install the supporting FireWire modules package by running either of the following:

    # rpm -ivh oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm - (for single processor)
    - OR -
    # rpm -ivh oracle-firewire-modules-2.6.9-22.ELsmp-1286-1.i686.rpm - (for multiple processors)

    Add module options:

    Add the following line to /etc/modprobe.conf:

    options sbp2 exclusive_login=0
    It is vital that the parameter sbp2 exclusive_login of the Serial Bus Protocol module (sbp2) be set to zero to allow multiple hosts to login to and access the FireWire disk concurrently.
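
    Because a missing or mistyped option here only shows up later as a failed login on the second node, it may be worth verifying the file on each node. A minimal sketch, with sbp2_shared_login_enabled as a hypothetical helper name:

```shell
# Sketch: succeed if a modprobe.conf-style file has an "options sbp2" line
# that sets exclusive_login=0.
# Usage: sbp2_shared_login_enabled FILE
# "sbp2_shared_login_enabled" is a hypothetical helper name used only for this example.
sbp2_shared_login_enabled() {
    sed 's/#.*//' "$1" |
        awk '$1 == "options" && $2 == "sbp2" {
                 for (i = 3; i <= NF; i++) if ($i == "exclusive_login=0") ok = 1
             }
             END { exit !ok }'
}

# Example:
# sbp2_shared_login_enabled /etc/modprobe.conf || echo "exclusive_login=0 is not set" >&2
```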

    Perform the above tasks on the second Linux server:

    With the supporting FireWire modules installed on the first Linux server, move on to the second Linux server and repeat the same tasks in this section on it.

    Connect FireWire drive to each machine and boot with the new FireWire modules installed:

    After performing the above tasks on both nodes in the cluster, power down both Linux machines:

    ===============================

    # hostname
    linux1
    # init 0
    ===============================
    # hostname
    linux2
    # init 0
    ===============================
    After both machines are powered down, connect each of them to the back of the FireWire drive. Power on the FireWire drive. Finally, power on each Linux server one at a time and be sure to watch for the "Probing for New Hardware" section during the boot process.

    Note: RHEL 4 users will be prompted during the boot process on both nodes at the "Probing for New Hardware" section for your FireWire hard drive. Simply select the option to "Configure" the device and continue the boot process.

    If you are not prompted during the "Probing for New Hardware" section for the new FireWire drive, you will need to run the following commands and reboot the machine. Do not put these commands in a script and attempt to run them - run them interactively at the command-line:

    # rpm -e oracle-firewire-modules-2.6.9-22.EL-1286-1
    # rpm -Uvh oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm
    # modprobe -r sbp2
    # modprobe -r sd_mod
    # modprobe -r ohci1394
    # modprobe ohci1394
    # modprobe sd_mod
    # modprobe sbp2
    # /usr/sbin/kudzu
    # init 6

    After running /usr/sbin/kudzu (above), you should be prompted to "Configure" the new drive. There are times when this didn't work the first time. If it didn't work, I had to power down everything, power them back up and perform the modprobe tasks (above) again.

    Check and turn off UDP ICMP rejections:

    After rebooting each machine (above) check to ensure that the firewall option is turned off (stopped):

    # /etc/rc.d/init.d/iptables status
    Firewall is stopped.

    Loading the FireWire stack:

    In most cases, the loading of the FireWire stack will already be configured in the /etc/rc.sysinit file. The commands that are contained within this file that are responsible for loading the FireWire stack are:

    # modprobe sbp2
    # modprobe ohci1394
    In older versions of Red Hat, this was not the case and these commands would have to be manually run or put within a startup file. With Red Hat Enterprise Linux 3 and later, these commands are already put within the /etc/rc.sysinit file and run on each boot.

    Check for SCSI Device:

    After each machine has rebooted, the kernel should automatically detect the disk as a SCSI device (/dev/sdXX). This section will provide several commands that should be run on all nodes in the cluster to verify the FireWire drive was successfully detected and being shared by all nodes in the cluster.

    For this configuration, I was performing the above procedures on both nodes at the same time. The following commands and results are from my linux2 machine. Again, make sure that you run the following commands on all nodes to ensure both machines can log in to the shared drive.

    Let's first check to see that the FireWire adapter was successfully detected:

    # lspci
    00:00.0 Host bridge: Intel Corporation 82845G/GL[Brookdale-G]/GE/PE DRAM Controller/Host-Hub Interface (rev 01)
    00:02.0 VGA compatible controller: Intel Corporation 82845G/GL[Brookdale-G]/GE Chipset Integrated Graphics Device (rev 01)
    00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01)
    00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01)
    00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01)
    00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01)
    00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 81)
    00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface Bridge (rev 01)
    00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 01)
    00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
    00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
    01:04.0 Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 11)
    01:06.0 FireWire (IEEE 1394): Texas Instruments TSB43AB23 IEEE-1394a-2000 Controller (PHY/Link)
    01:09.0 Ethernet controller: Broadcom Corporation BCM4401 100Base-T (rev 01)
    Second, let's check to see that the modules are loaded:
    # lsmod |egrep "ohci1394|sbp2|ieee1394|sd_mod|scsi_mod"
    sd_mod 17217 0
    ohci1394 35784 0
    sbp2 23948 0
    scsi_mod 121293 2 sd_mod,sbp2
    ieee1394 298228 2 ohci1394,sbp2
    Third, let's make sure the disk was detected and an entry was made by the kernel:
    # cat /proc/scsi/scsi
    Attached devices:
    Host: scsi0 Channel: 00 Id: 01 Lun: 00
    Vendor: Maxtor Model: OneTouch II Rev: 023g
    Type: Direct-Access ANSI SCSI revision: 06
    Now let's verify that the FireWire drive is accessible for multiple logins and shows a valid login:
    # dmesg | grep sbp2
    sbp2: $Rev: 1265 $ Ben Collins <bcollins@debian.org>
    ieee1394: sbp2: Maximum concurrent logins supported: 2
    ieee1394: sbp2: Number of active logins: 1
    ieee1394: sbp2: Logged into SBP-2 device
    From the above output, you can see that the FireWire drive I have can support concurrent logins by up to 2 servers. It is vital that you have a drive where the chipset supports concurrent access for all nodes within the RAC cluster.
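
    When checking several nodes, it can help to reduce the sbp2 messages to a one-line summary. The sketch below assumes the message wording shown in the output above; sbp2_login_summary is a hypothetical helper name, and other driver revisions may word these messages differently.

```shell
# Sketch: summarize sbp2 kernel messages read from stdin.
# Prints "max_logins=N logged_in=yes|no"; exits non-zero if no login succeeded.
# "sbp2_login_summary" is a hypothetical helper name used only for this example.
sbp2_login_summary() {
    awk '/Maximum concurrent logins supported:/ { max = $NF }
         /Logged into SBP-2 device/            { ok = 1 }
         END {
             printf "max_logins=%s logged_in=%s\n", (max == "" ? "unknown" : max), (ok ? "yes" : "no")
             exit !ok
         }'
}

# Example:
# dmesg | grep sbp2 | sbp2_login_summary
```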

    One other test I like to perform is to run a quick fdisk -l from each node in the cluster to verify that it is really being picked up by the OS. Your drive may show that the device does not contain a valid partition table, but this is OK at this point of the RAC configuration.

    # fdisk -l
    Disk /dev/hda: 40.0 GB, 40000000000 bytes
    255 heads, 63 sectors/track, 4863 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/hda1 * 1 13 104391 83 Linux
    /dev/hda2 14 4863 38957625 8e Linux LVM
    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sda1 1 36483 293049666 c W95 FAT32 (LBA)

    Rescan SCSI bus no longer required:

    In older versions of the kernel, I would need to run the rescan-scsi-bus.sh script in order to detect the FireWire drive. The purpose of this script was to create the SCSI entry for the node by using the following command:

    echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi
    With RHEL3 and RHEL4, this step is no longer required and the disk should be detected automatically.

    Troubleshooting SCSI Device Detection:

    If you are having troubles with any of the procedures (above) in detecting the SCSI device, you can try the following:

    # rpm -e oracle-firewire-modules-2.6.9-22.EL-1286-1
    # rpm -Uvh oracle-firewire-modules-2.6.9-22.EL-1286-1.i686.rpm
    # modprobe -r sbp2
    # modprobe -r sd_mod
    # modprobe -r ohci1394
    # modprobe ohci1394
    # modprobe sd_mod
    # modprobe sbp2
    # /usr/sbin/kudzu
    You may also want to unplug any USB devices connected to the server. The system may not be able to recognize your FireWire drive if you have a USB device attached!

    Troubleshooting Concurrent Logins to the FireWire Drive:

    One of the first things to verify is that you are using a FireWire drive that contains the correct chipset and allows for multiple logins. If the FireWire drive has a chipset that does not allow for concurrent access from more than one server, the disk and its partitions can only be seen by one server at a time. Disks with the Oxford 911 chipset (FireWire 400), Oxford 912 chipset (FireWire 800), or Oxford 922 chipset (FireWire 800) are known to work. Note that the Oxford 912 chipset is newer and faster than Oxford 922. For a full list of FireWire drives (and enclosures) I have tested, please see the section Verified and Tested FireWire Hard Drives.

    Although I have only run into this situation once, there can be problems with the FireWire cards (the IEEE1394 controller cards). For example, in one of my tests (using FireWire 800) I was unable to obtain concurrent logins to the FireWire drive when using the LaCie FireWire 800 PCI Card - (107755) in both nodes. While the first node was able to login to the FireWire drive, it was acquiring it for exclusive access and causing the second node to fail its login process to the drive. For example:

    From linux1:
    # dmesg | grep sbp2
    sbp2: $Rev: 1265 $ Ben Collins <bcollins@debian.org>
    ieee1394: sbp2: Maximum concurrent logins supported: 2
    ieee1394: sbp2: Number of active logins: 0
    ieee1394: sbp2: Logged into SBP-2 device
    From linux2:
    # dmesg | grep sbp2
    sbp2: $Rev: 1265 $ Ben Collins <bcollins@debian.org>
    ieee1394: sbp2: Maximum concurrent logins supported: 2
    ieee1394: sbp2: Number of active logins: 1
    ieee1394: sbp2: Error logging into SBP-2 device - login failed
    sbp2: probe of 00d04b690809290b-0 failed with error -16
    ieee1394: sbp2: Maximum concurrent logins supported: 2
    ieee1394: sbp2: Number of active logins: 1
    ieee1394: sbp2: Error logging into SBP-2 device - login failed
    sbp2: probe of 00d04b690809290b-0 failed with error -16

    I have seen postings that indicate this can be resolved by using the sbp2 option "serialize_io=1" defined in the /etc/modprobe.conf file. For example, the entry in the /etc/modprobe.conf file would be:

    options sbp2 serialize_io=1 exclusive_login=0

    Although this has been used to resolve some of the cases with failed concurrent logins, it did not resolve the problem I was having with installing the LaCie FireWire 800 PCI Card - (107755) in both nodes.

    Another solution to resolve failed concurrent logins is to use different FireWire cards for each of the nodes. For example, one that uses the Texas Instruments (TI) chipset and another that uses the VIA Technologies (VIA) chipset. Actually for me, I was able to resolve this by simply using two different FireWire cards from different vendors. For example, I used the LaCie FireWire 800 PCI Card - (107755) in one node and the SIIG FireWire 800 PCI-32T host adapter - (NN-830112) in the second node. Although they both use the TI chipset, it was enough to resolve the problem I was having with failed concurrent logins. After this, both nodes were able to successfully log in to the FireWire drive.


    9. Create "oracle" User and Directories (both nodes)

    Perform the following tasks on all nodes in the cluster!

    You will be using OCFS2 to store the files required to be shared for the Oracle Clusterware software. When using OCFS2, the UID of the UNIX user oracle and GID of the UNIX group dba should be identical on all machines in the cluster. If either the UID or GID differs, the files on the OCFS2 file system may show up as "unowned" or may even be owned by a different user. For this article, I will use 175 for the oracle UID and 115 for the dba GID.

    Create Group and User for Oracle

    Let's continue our example by creating the Unix dba group and oracle user account along with all appropriate directories.

    # mkdir -p /u01/app
    # groupadd -g 115 dba
    # useradd -u 175 -g 115 -d /u01/app/oracle -s /bin/bash -c "Oracle Software Owner" -p oracle oracle
    # chown -R oracle:dba /u01
    # passwd oracle
    # su - oracle

    Note: When you are setting the Oracle environment variables for each RAC node, be sure to assign each RAC node a unique Oracle SID! For this example, I used:

    • linux1 : ORACLE_SID=orcl1
    • linux2 : ORACLE_SID=orcl2
    After creating the "oracle" UNIX userid on both nodes, ensure that the environment is set up correctly by using the following .bash_profile:

    ....................................
    # .bash_profile
    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
    . ~/.bashrc
    fi
    alias ls="ls -FA"
    # User specific environment and startup programs
    export ORACLE_BASE=/u01/app/oracle
    export ORACLE_HOME=$ORACLE_BASE/product/10.2.0/db_1
    export ORA_CRS_HOME=$ORACLE_BASE/product/crs
    export ORACLE_PATH=$ORACLE_BASE/common/oracle/sql:.:$ORACLE_HOME/rdbms/admin
    # Each RAC node must have a unique ORACLE_SID. (i.e. orcl1, orcl2,...)
    export ORACLE_SID=orcl1
    export PATH=.:${PATH}:$HOME/bin:$ORACLE_HOME/bin
    export PATH=${PATH}:/usr/bin:/bin:/usr/bin/X11:/usr/local/bin
    export PATH=${PATH}:$ORACLE_BASE/common/oracle/bin
    export ORACLE_TERM=xterm
    export TNS_ADMIN=$ORACLE_HOME/network/admin
    export ORA_NLS10=$ORACLE_HOME/nls/data
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$ORACLE_HOME/oracm/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
    export CLASSPATH=$ORACLE_HOME/JRE
    export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/jlib
    export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/rdbms/jlib
    export CLASSPATH=${CLASSPATH}:$ORACLE_HOME/network/jlib
    export THREADS_FLAG=native
    export TEMP=/tmp
    export TMPDIR=/tmp
    ....................................
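
    The profile above hard-codes ORACLE_SID, so it has to be edited by hand on each node. One possible alternative, shown here only as a sketch, is to derive the SID from the host name. It assumes node names end in the instance number, as linux1 and linux2 do; sid_for_node is a hypothetical helper name and "orcl" is the database name used in this article.

```shell
# Sketch: derive the instance name from the node name, assuming node names
# end in the instance number (linux1 -> orcl1, linux2 -> orcl2).
# Usage: sid_for_node NODE [DB_NAME]
# "sid_for_node" is a hypothetical helper name used only for this example.
sid_for_node() {
    node=${1%%.*}                                      # drop any domain suffix
    num=$(printf '%s\n' "$node" | sed 's/.*[^0-9]//')  # keep only the trailing digits
    [ -n "$num" ] || return 1                          # no trailing number: give up
    echo "${2:-orcl}$num"
}

# In .bash_profile this could replace the hard-coded value:
# export ORACLE_SID=$(sid_for_node "$(hostname)")
```

    With this approach the same .bash_profile can be copied to both nodes unchanged, at the cost of tying the SID to the host naming scheme.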

    Create Mount Point for OCFS2 / Clusterware

    Finally, create the mount point for the OCFS2 filesystem that will be used to store the two Oracle Clusterware shared files. These commands will need to be run as the "root" user account:

    $ su -
    # mkdir -p /u02/oradata/orcl
    # chown -R oracle:dba /u02

    Ensure Adequate temp Space for OUI

    Note: The Oracle Universal Installer (OUI) requires at least 400MB of free space in the /tmp directory.

    The commands below report the swap space configured on the node, not the free space in /tmp; to check /tmp itself, use df -k /tmp:

    # cat /proc/swaps
    Filename Type Size Used Priority
    /dev/mapper/VolGroup00-LogVol01 partition 2031608 0 -1

    -OR-

    # cat /proc/meminfo | grep SwapTotal
    SwapTotal: 2031608 kB

    If for some reason you do not have enough space in /tmp, you can temporarily create space in another file system and point your TEMP and TMPDIR to it for the duration of the install. Here are the steps to do this:

    # su -
    # mkdir /<AnotherFilesystem>/tmp
    # chown root.root /<AnotherFilesystem>/tmp
    # chmod 1777 /<AnotherFilesystem>/tmp
    # export TEMP=/<AnotherFilesystem>/tmp # used by Oracle
    # export TMPDIR=/<AnotherFilesystem>/tmp # used by Linux programs
    # like the linker "ld"
    When the installation of Oracle is complete, you can remove the temporary directory using the following:
    # su -
    # rmdir /<AnotherFilesystem>/tmp
    # unset TEMP
    # unset TMPDIR
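
    The free-space requirement for /tmp (or for the alternate directory created above) can be tested before launching the installer. This is a minimal sketch built on df -Pk; has_free_mb is a hypothetical helper name.

```shell
# Sketch: succeed if DIR has at least MIN_MB megabytes available.
# Usage: has_free_mb DIR MIN_MB
# "has_free_mb" is a hypothetical helper name used only for this example.
has_free_mb() {
    # df -P prints one POSIX-format line per file system; column 4 is available KB.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}

# Example:
# has_free_mb /tmp 400 || echo "/tmp has less than 400MB free" >&2
```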

    10. Create Partitions on the Shared FireWire Storage Device

    Create the following partitions on only one node in the cluster!

    The next step is to create the required partitions on the FireWire (shared) drive. As I mentioned previously, you will use OCFS2 to store the two files to be shared for Oracle's Clusterware software. You will then create three ASM volumes; two for all physical database files (data/index files, online redo log files, and control files) and one for the Flash Recovery Area (RMAN backups and archived redo log files).

    The following table lists the individual partitions that will be created on the FireWire (shared) drive and what files will be contained on them.

    Oracle Shared Drive Configuration
    File System Type  Partition  Size   Mount Point        ASM Diskgroup Name    File Types
    OCFS2             /dev/sda1  1GB    /u02/oradata/orcl                        Oracle Cluster Registry File - (~100MB)
                                                                                 CRS Voting Disk - (~20MB)
    ASM               /dev/sda2  50GB   ORCL:VOL1          +ORCL_DATA1           Oracle Database Files
    ASM               /dev/sda3  50GB   ORCL:VOL2          +ORCL_DATA1           Oracle Database Files
    ASM               /dev/sda4  100GB  ORCL:VOL3          +FLASH_RECOVERY_AREA  Oracle Flash Recovery Area
    Total                        201GB

    Create All Partitions on FireWire Shared Storage

    As shown in the table above, my FireWire drive shows up as the SCSI device /dev/sda. The fdisk command is used in Linux for creating (and removing) partitions. For this configuration, we will be creating four partitions: one for Oracle's Clusterware shared files and the other three for ASM (to store all Oracle database files and the Flash Recovery Area). Before creating the new partitions, it is important to remove any existing partitions (if they exist) on the FireWire drive:

    # fdisk /dev/sda
    Command (m for help): p
    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    /dev/sda1 1 36483 293049666 c W95 FAT32 (LBA)
    Command (m for help): d
    Selected partition 1
    Command (m for help): p
    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device Boot Start End Blocks Id System
    Command (m for help): n
    Command action
    e extended
    p primary partition (1-4)
    p
    Partition number (1-4): 1
    First cylinder (1-36483, default 1): 1
    Last cylinder or +size or +sizeM or +sizeK (1-36483, default 36483): +1G
    Command (m for help): n
    Command action
    e extended
    p primary partition (1-4)
    p
    Partition number (1-4): 2
    First cylinder (124-36483, default 124): 124
    Last cylinder or +size or +sizeM or +sizeK (124-36483, default 36483): +50G
    Command (m for help): n
    Command action
    e extended
    p primary partition (1-4)
    p
    Partition number (1-4): 3
    First cylinder (6204-36483, default 6204): 6204
    Last cylinder or +size or +sizeM or +sizeK (6204-36483, default 36483): +50G
    Command (m for help): n
    Command action
    e extended
    p primary partition (1-4)
    p
    Selected partition 4
    First cylinder (12284-36483, default 12284): 12284
    Last cylinder or +size or +sizeM or +sizeK (12284-36483, default 36483): +100G
    Command (m for help): p
    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device     Boot  Start    End     Blocks  Id  System
    /dev/sda1            1    123     987966  83  Linux
    /dev/sda2          124   6203   48837600  83  Linux
    /dev/sda3         6204  12283   48837600  83  Linux
    /dev/sda4        12284  24442  97667167+  83  Linux
    Command (m for help): w
    The partition table has been altered!
    Calling ioctl() to re-read partition table.
    Syncing disks.
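
    The Blocks column in the fdisk listing follows directly from the cylinder ranges and the reported geometry (16065 sectors of 512 bytes per cylinder). The sketch below reproduces that arithmetic. The helper name cyl_blocks is hypothetical, and the script assumes the traditional DOS-compatible layout in which a partition starting at cylinder 1 skips the first track (63 sectors); it is an illustration of the numbers shown above, not a description of fdisk internals.

```shell
#!/bin/sh
# Geometry reported by fdisk: 255 heads * 63 sectors/track = 16065 sectors per cylinder.
SECTORS_PER_CYL=16065
SECTORS_PER_TRACK=63

# cyl_blocks START END (hypothetical helper): print the 1 KB block count
# for a partition spanning cylinders START..END, inclusive.
cyl_blocks() {
    start=$1
    end=$2
    sectors=$(( (end - start + 1) * SECTORS_PER_CYL ))
    # Assumption: a partition starting at cylinder 1 begins after the first
    # track, which is left for the partition table.
    if [ "$start" -eq 1 ]; then
        sectors=$((sectors - SECTORS_PER_TRACK))
    fi
    # Two 512-byte sectors per 1 KB block; integer division drops the odd
    # half block that fdisk marks with a trailing "+".
    echo $((sectors / 2))
}

cyl_blocks 1 123        # /dev/sda1
cyl_blocks 124 6203     # /dev/sda2
cyl_blocks 6204 12283   # /dev/sda3
cyl_blocks 12284 24442  # /dev/sda4
```

    The four results are 987966, 48837600, 48837600, and 97667167, which match the Blocks column in the listing above.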

    After creating all required partitions, inform the kernel of the partition changes by running the following commands as the root user:

    # partprobe
    # fdisk -l /dev/sda
    Disk /dev/sda: 300.0 GB, 300090728448 bytes
    255 heads, 63 sectors/track, 36483 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Device     Boot  Start    End     Blocks  Id  System
    /dev/sda1            1    123     987966  83  Linux
    /dev/sda2          124   6203   48837600  83  Linux
    /dev/sda3         6204  12283   48837600  83  Linux
    /dev/sda4        12284  24442  97667167+  83  Linux
    (Note: The FireWire drive and partitions created will be exposed as a SCSI device.)
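
    One way to confirm that the kernel has picked up the new partition table is to look for the partition names in /proc/partitions. The following is a minimal sketch under stated assumptions: the function name missing_partitions is hypothetical, and the device names assume the shared drive is /dev/sda, as in this guide.

```shell
#!/bin/sh
# missing_partitions FILE NAME... (hypothetical helper): print each NAME that
# does not appear in the last column of FILE, which is expected to be
# formatted like /proc/partitions (major, minor, #blocks, name).
missing_partitions() {
    file=$1
    shift
    for name in "$@"; do
        # awk exits 0 only if some line's last field equals the name.
        if ! awk -v n="$name" '$NF == n { found = 1 } END { exit !found }' "$file"; then
            echo "$name"
        fi
    done
}

# Check the running kernel's view; prints nothing when all four are present.
if [ -r /proc/partitions ]; then
    missing_partitions /proc/partitions sda1 sda2 sda3 sda4
fi
```

    If any names are printed, the kernel has not yet seen those partitions; re-running partprobe as root is the first thing to try.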