Pre-Installation Tasks:
Task List:
Minimum Hardware Required
Technical Architecture of 2-node RAC
Download Oracle 10g RDBMS software from OTN
Required Red Hat Software Packages
Memory and Swap Space
Setting up Kernel Parameters
Configuring the Public, Private and Virtual Hosts / Network
Creating the oracle User Account
Creating the Required Directories for Oracle 10g R2 RAC Software and Setting up Correct Permissions
Setting up Shell Limits for the oracle User
Enabling SSH User Equivalency for the oracle User on All the Cluster Nodes
Configuring the System for the Firewire Shared Disk
Partitioning the Shared Disk
Installing and Configuring OCFS (Oracle Cluster File System)
Creating ASM Disks Using oracleasm (ASMLib IO) for the Clustered Database
Checking the Configuration of the hangcheck-timer Module
Required Hardware:
To build a 2-node RAC, you need two machines, each with the following hardware installed.
Per Node:
1 GB RAM, at least 8 GB of hard drive, 1 GHz CPU,
1 Firewire Controller, 1 Firewire Cable
2 NIC Ethernet card (one for public and another for private / interconnect network)
Per Cluster:
1 Shared Hard Drive
1 Firewire Hub + 1 Firewire cable (for a cluster with more than 2 nodes)
1 Network Hub + 1 network cable (for a cluster with more than 2 nodes)
1 crossover network cable (for a cluster with 2 nodes)
n network cables for the private network for internode communication (for a cluster with n nodes, where n >= 3)
n network cables for the public network (for a cluster with n nodes, where n >= 3)
I used the hardware below to build my 2-node RAC.
Server 1: Dell Intel PIII 1.3 GHz, 256 MB RAM, 20 GB HD (used) - $200
Server 2: Dell Intel PIII 1.3 GHz, 256 MB RAM, 20 GB HD (used) - $200
Memory upgrade to 512 MB: 256 MB x 2 for both servers - $110
Firewire hard drive: LaCie Firewire Hard Drive 120 GB - $160
Firewire controllers: Adaptec AFW-4300 x 2, one per server (Texas Instruments chipset) - $98
Firewire hub: Belkin 6-Port Firewire Hub - $55
Firewire cables: 1 extra Firewire cable for the other node - $15
NICs: D-Link Ethernet card x 2 - $30
Network hub: "NETWORK Everywhere" 10/100 5-Port Hub - $30
Crossover cable - $15
Total Cost: $913.00
Clustered Database Name: RacDB
Node1:
SID: RacDB1
Public Network name (hostname): node1-pub, eth0
Private network Name (for Interconnect): node1-prv, eth1
ORACLE_BASE: /u01/app/oracle
DB file location: +ASM/{DB_NAME}/
CRS file Location: /u02/oradata/ocr mounted on /dev/sda1 (ocfs)
Node2:
SID: RacDB2
Public Network name (hostname): node2-pub, eth0
Private network Name (for Interconnect): node2-prv, eth1
ORACLE_BASE: /u01/app/oracle
DB file location: +ASM/{DB_NAME}/
CRS file Location: /u02/oradata/ocr mounted on /dev/sda1 (ocfs)
Go to otn.oracle.com and download the appropriate Oracle 10g software into /tmp. Make sure you have enough space under this mount point; you can check this using the df command. I downloaded the 10201_database_linux32.zip (668,734,007 bytes) (cksum - 2737423041) file for my 32-bit Linux box. As I am going to create a multi-instance database (RAC), I also needed to download the 10g clusterware, 10201_clusterware_linux32.zip (228,239,016 bytes) (cksum - 2639036338). These files come with a .zip extension and need to be unzipped using the unzip utility, which is installed as part of CentOS. In case you do not have it, you can get it from here. After unzipping these files, you can optionally write them to CD. I generally prefer the cdrecord command.
I used CD media of 700 MB capacity to copy 10g (10.2.0.1) onto it.
[root@localhost root]# unzip /tmp/10201_database_linux32.zip
[root@localhost root]# mkisofs -r /tmp/databases | cdrecord -v dev=1,1,0 speed=20 -
If you are installing the software from disc, mount the first disc if it is not already mounted. Some platforms automatically mount the disc when you insert the disc into the drive.
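If the disc does not mount automatically, it can be mounted manually as root; a minimal example, assuming the standard /mnt/cdrom mount point used later in this document:
[root@node1-pub root]# mount -t iso9660 /dev/cdrom /mnt/cdrom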
After you install the Linux system and before you start installing the Oracle 10g software, make sure that you have the below packages installed on your Linux box, or you will get errors during the installation process.
make-3.79.1
gcc-3.2.3-34
glibc-2.3.2-95.20
compat-db-4.0.14-5
compat-gcc-7.3-2.96.128
compat-gcc-c++-7.3-2.96.128
compat-libstdc++-7.3-2.96.128
compat-libstdc++-devel-7.3-2.96.128
openmotif21-2.1.30-8
setarch-1.3-1
libaio-0.3.103-3
Execute the below command as root to make sure that you have these RPMs installed. If any are not installed, download them from the appropriate Linux site.
rpm -q make gcc glibc compat-db compat-gcc compat-gcc-c++ compat-libstdc++ \
compat-libstdc++-devel openmotif21 setarch libaio libaio-devel
Perform this step on all the nodes.
Oracle 10g RAC requires 1 GB of RAM on each node for a successful installation. Well, I somehow managed to install it with 512 MB of RAM. You will get a warning during the prerequisite check step of the installation, which you can ignore. Please go to Adding an Extra Swapspace if you want to add extra swap space.
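To verify the available memory and swap space on each node, standard Linux commands such as the following can be used (not specific to this setup):
[root@node1-pub root]# grep MemTotal /proc/meminfo
[root@node1-pub root]# grep SwapTotal /proc/meminfo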
Kernel Parameters:
Please go to Setting Up kernel Parameter to set the kernel parameters.
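The linked page has the exact steps; as a rough sketch, the kernel parameters commonly documented for a 10g R2 install are added to /etc/sysctl.conf roughly as below (treat the values as assumptions to verify against the linked page) and applied with sysctl -p:
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
[root@node1-pub root]# sysctl -p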
Each node in the cluster must have 2 network adapters (eth0, eth1): one for the public network and another for the private network interface (internode communication, interconnect). Make sure that if you configure eth1 as the private interface on node1, then eth1 is also configured as the private interface on node2.
Follow the below steps to configure these networks:
(1) Change the hostname value by executing the below command:
For Node node1-pub:
[root@localhost root]# hostname node1-pub
For Node node2-pub:
[root@localhost root]# hostname node2-pub
(2) Edit the /etc/hosts file as shown below
[root@localhost root]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# Hostname for the Public Nodes in the RAC (eth0)
216.160.37.154 node1-pub.oracledba.org node1-pub
216.160.37.156 node2-pub.oracledba.org node2-pub
# Hostname for the Private Nodes in the RAC (eth1)
192.168.203.1 node1-prv.oracledba.org node1-prv
192.168.203.2 node2-prv.oracledba.org node2-prv
# Hostname for the Virtual IP in the RAC (eth0)
192.168.203.11 node1-vip.oracledba.org node1-vip
192.168.203.22 node2-vip.oracledba.org node2-vip
[root@node2-pub root]#
(3) Edit OR create the /etc/sysconfig/network-scripts/ifcfg-eth0 as shown below:
If your public IPs are assigned by DHCP, create the same file on both nodes as shown below.
[root@localhost root]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
If you have static IPs, add entries like the below into /etc/sysconfig/network-scripts/ifcfg-eth0 (use the IPADDR line appropriate for each node):
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
IPADDR=192.168.10.1 -- For node1-pub Node
IPADDR=192.168.10.2 -- For node2-pub Node
(4) Edit OR create the /etc/sysconfig/network-scripts/ifcfg-eth1 as shown below:
For Node node1-pub:
[root@localhost root]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
PEERDNS=no
IPADDR=192.168.203.1
For Node node2-pub:
[root@localhost root]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
PEERDNS=no
IPADDR=192.168.203.2
(5) Edit the /etc/sysconfig/network file with the below contents:
For Node node1-pub:
[root@localhost root]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node1-pub
For Node node2-pub:
[root@localhost root]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node2-pub
(6) Restart the network service OR reboot the nodes:
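To restart the network service without a full reboot, the standard RHEL 3 init script can be used on each node:
[root@node1-pub root]# service network restart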
After I rebooted both the nodes, I verified the network interface configurations by running the ifconfig command as shown below.
[root@node2-pub root]# ifconfig
You need an OS "oracle" user account created, which owns the Oracle software. The Oracle software installation must be performed by this account. The Oracle software installation (without the Companion CD) requires 6 GB of free space for the ORACLE_BASE directory. Make sure that the mount point where you plan to install the software has the required free space available. You can use "df -k" to check this.
[root@node2-pub root]# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/hda2 18113556 3923072 13270364 23% /
/dev/hda1 101089 14036 81834 15% /boot
none 126080 0 126080 0% /dev/shm
I had about 13 GB of free space available on the "/" mount point, so I decided to install Oracle under this mount point. RAC requires the oracle user account to be created on all the nodes with the same user id and group id, so create the oracle user account with this property by executing the below series of commands on all the RAC nodes.
groupadd -g 900 dba
groupadd -g 901 oinstall
useradd -u 900 -g oinstall -G dba oracle
passwd oracle
Verify that the oracle user has the same gid and uid on all the RAC nodes by executing this command:
[oracle@node2-pub oracle]$ id
uid=900(oracle) gid=901(oinstall) groups=901(oinstall),900(dba)
[oracle@node1-pub oracle]$ id
uid=900(oracle) gid=901(oinstall) groups=901(oinstall),900(dba)
Creating Oracle Software Directories:
Perform the below steps on all the nodes in cluster.
[root@node2-pub root]# mkdir -p /u01/app/oracle
[root@node2-pub root]# mkdir -p /u02/oradata/ocr -- mountpoint for the OCR files
[root@node2-pub root]# chown -R oracle:oinstall /u01
[root@node2-pub root]# chown -R oracle:oinstall /u02
[root@node2-pub root]# chmod -R 775 /u01/app/oracle
[root@node2-pub root]# chmod -R 775 /u02
Setting Shell Limits for the Oracle User:
Please go to Setting up shell limits for the oracle user to set the shell limits for the oracle user.
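The linked page has the exact steps; as a rough sketch, the limits commonly documented for the oracle user are set through entries like these in /etc/security/limits.conf on each node (values shown are assumptions to verify against the linked page):
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536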
To configure SSH user equivalency, you must create RSA and DSA keys on each cluster node and copy these keys from all the cluster node members into an authorized key file on each node. Follow the below steps to achieve this task.
su - oracle
mkdir ~/.ssh
chmod 700 ~/.ssh
(A) Generate the RSA and DSA keys on all the RAC Nodes:
/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa
(B) Add the keys to the authorized key file and then send the same file to every node in the cluster:
touch ~/.ssh/authorized_keys
cd ~/.ssh
(1)
ssh node1-pub cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node1-pub cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
ssh node2-pub cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node2-pub cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
(2) Copy the authorized_keys file to every node. For example, from node2-pub, I used the below command to copy node2-pub's authorized_keys file to the node1-pub node.
[oracle@node2-pub .ssh]$ scp authorized_keys node1-pub:/home/oracle/.ssh/
(C) Repeat Step B on each node one by one.
(D) Change the Permission of authorized_keys file (on each node)
[oracle@node2-pub .ssh]$ chmod 600 ~/.ssh/authorized_keys
While executing step B - (1), you may be prompted as shown below. Enter "yes" and continue.
[oracle@node2-pub .ssh]$ ssh node1-pub cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
The authenticity of host 'node1-pub (216.160.37.154)' can't be established.
RSA key fingerprint is <**********>.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node1-pub,216.160.37.154' (RSA) to the list of known hosts.
Warning: No xauth data; using fake authentication data for X11 forwarding.
Now, try executing the date command (or any other command) on all the nodes to make sure that oracle is not asked for a password. You should not receive any error messages while you execute these commands on all the nodes. If you get any errors, fix them before going further.
ssh node1-prv date
ssh node2-prv date
ssh node1-pub date
ssh node2-pub date
Errors / Warnings during the network configurations:
I got the below warning when I tried below command.
[oracle@node2-pub .ssh]$ ssh node1-pub date
Warning: No xauth data; using fake authentication data for X11 forwarding.
Sun Dec 18 02:04:52 CST 2005
To fix the above warning, create the /home/oracle/.ssh/config file (logged in as oracle user) and make the below entry in it. Then run the same command again and the above warning would not show up.
[oracle@node2-pub oracle]$ cat .ssh/config
Host *
ForwardX11 no
When you execute the below command for the very first time, you will be prompted to enter 'yes' or 'no'. Simply enter yes and continue. Afterwards, when oracle connects to the remote node, it is not asked for the password. This is shown below, where oracle received the message when it tried to run the date command on the remote node using ssh for the very first time; afterwards, it does not get any message like this.
[oracle@node2-pub oracle]$ ssh node1-prv date
The authenticity of host 'node1-prv (192.168.203.1)' can't be established.
RSA key fingerprint is <********************************************>
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node1-prv,192.168.203.1' (RSA) to the list of known hosts.
Sun Dec 18 20:01:09 CST 2005
[oracle@node2-pub oracle]$ ssh node1-prv date
Sun Dec 18 20:01:13 CST 2005
[oracle@node2-pub oracle]$
[oracle@node2-pub oracle]$ ssh node2-prv date
Warning: Permanently added the RSA host key for IP address '192.168.203.2' to the list of known hosts.
Sun Dec 18 20:14:16 CST 2005
[oracle@node2-pub oracle]$ ssh node2-pub date
Sun Dec 18 20:15:05 CST 2005
If you get the below error message when trying to connect to a remote node, make sure that the firewall is disabled on the remote node.
[root@node2-pub root]# telnet node1-prv
Trying 192.168.203.1...
telnet: Unable to connect to remote host: No route to host
Configuring System for Shared Disk Storage Device (Firewire):
Every node in the cluster must have access to the shared disk, so the shared disk must support concurrent access from all nodes in the cluster in order to successfully build 10g RAC. I chose a Firewire disk as the shared storage media because it is a cost-effective solution if you just want hands-on practice with 10g RAC without investing more money. After you install the Red Hat Linux AS 3 system on both nodes, go to http://oss.oracle.com/projects/firewire/files and download the appropriate Firewire kernel to support the Firewire HD. I downloaded and installed the below rpm.
[root@localhost root]# uname -r
2.4.21-37.EL
[root@localhost root]# rpm -ivh --force kernel-2.4.21-27.0.2.ELorafw1.i686.rpm
This will also update the /etc/grub.conf file with an added entry for the new Firewire kernel. In the file below, default was originally set to 1, which means that the system boots the original kernel by default. To make the newly added Firewire kernel the default, simply change default=1 to default=0. Setting this kernel as the default is required so that if this node is restarted by the hangcheck-timer or for any other reason, it reboots with the right kernel.
[root@node2-pub root]# cat /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE: You have a /boot partition. This means that
# all kernel and initrd paths are relative to /boot/, eg.
# root (hd0,0)
# kernel /vmlinuz-version ro root=/dev/hda2
# initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title CentOS (2.4.21-27.0.2.ELorafw1)
root (hd0,0)
kernel /vmlinuz-2.4.21-27.0.2.ELorafw1 ro root=LABEL=/
initrd /initrd-2.4.21-27.0.2.ELorafw1.img
title CentOS-3 (2.4.21-37.EL)
root (hd0,0)
kernel /vmlinuz-2.4.21-37.EL ro root=LABEL=/
initrd /initrd-2.4.21-37.EL.img
Also update the /etc/modules.conf file and add the below lines at the end of file on BOTH THE NODES. This will load the Firewire kernel modules and drivers at reboot.
alias ieee1394-controller ohci1394
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-install sbp2 insmod ohci1394
post-remove sbp2 rmmod sd_mod
Now, shut down both nodes and connect the Firewire shared disk to them. Power on the Firewire disk and then restart both nodes using the new Firewire kernel 2.4.21-27.0.2.ELorafw1, one by one. Confirm that the Firewire disk is visible from both nodes by running the below command as root on each node.
[root@localhost root]# dmesg | grep ieee1394
ieee1394: Host added: Node[00:1023] GUID[0000d1008016f8e8] [Linux OHCI-1394]
ieee1394: Device added: Node[01:1023] GUID[00d04b3b1905e049] [LaCie Group SA ]
ieee1394: sbp2: Query logins to SBP-2 device successful
ieee1394: sbp2: Maximum concurrent logins supported: 4
ieee1394: sbp2: Number of active logins: 0
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]
ieee1394: Device added: Node[00:1023] GUID[00309500a0042ef9] [Linux OHCI-1394]
ieee1394: Node 00:1023 changed to 01:1023
ieee1394: Node 01:1023 changed to 02:1023
ieee1394: sbp2: Reconnected to SBP-2 device
ieee1394: sbp2: Node[02:1023]: Max speed [S400] - Max payload [2048]
[root@localhost root]# dmesg | grep sda
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)
sda:
Partitioning the Shared disk:
You need at least two partitions if you want to go with ASM as the storage option. One or more partitions will be used as ASM disk(s), and one partition is required to store Oracle's CRS (Cluster Ready Services) files. You cannot create these files under ASM; they need to be created either on a raw partition (device) or on ocfs. This document covers all the options for storing the database files on the shared disk. If you want to use the entire disk as ocfs or as a raw device (volume), then there is no need to create a separate partition for the CRS files. I partitioned the disk as shown below, connected to any one of the nodes.
As I am going to use ASM for the database files, I will use ocfs for the CRS files.
[root@node2-pub root]# fdisk /dev/sda
The number of cylinders for this disk is set to 24792.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4):
Value out of range.
Partition number (1-4): 1
First cylinder (1-24792, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-24792, default 24792): +300M
Command (m for help): p
Disk /dev/sda: 203.9 GB, 203928109056 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 37 297171 83 Linux
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (38-24792, default 38):
Using default value 38
Last cylinder or +size or +sizeM or +sizeK (38-24792, default 24792): +70000M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (8549-24792, default 8549):
Using default value 8549
Last cylinder or +size or +sizeM or +sizeK (8549-24792, default 24792): +70000M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Selected partition 4
First cylinder (17060-24792, default 17060):
Using default value 17060
Last cylinder or +size or +sizeM or +sizeK (17060-24792, default 24792):
Using default value 24792
Command (m for help): p
Disk /dev/sda: 203.9 GB, 203928109056 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 1 37 297171 83 Linux -- will be used by CRS files (ocfs)
/dev/sda2 38 8548 68364607+ 83 Linux -- will be used for ASM DSK1
/dev/sda3 8549 17059 68364607+ 83 Linux -- reserved for another Clustered database on ocfs
/dev/sda4 17060 24792 62115322+ 83 Linux -- reserved for another Clustered database on raw device
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@node2-pub root]# partprobe -- [ Perform this step on all the nodes in cluster]
[root@node2-pub root]#
We have 3 storage options to store the clustered database file on the shared disk.
(1) Traditional raw device option (9i).
(2) ocfs (also available in 9i).
(3) ASM (only in 10g and above).
I prefer ASM over the ocfs file system. I will only use an ocfs partition to store the OCR file, as this file needs to be on the shared disk and cannot be stored in ASM. You can also use the raw device option to store it.
Download and Install the required rpms:
Please download the below rpms from Oracle's website and install them as shown.
ocfs-2.4.21-EL-1.0.14-1.i686.rpm (For UniProcessor)
ocfs-2.4.21-EL-smp-1.0.14-1.i686.rpm (For SMPs)
ocfs-tools-1.0.10-1.i386.rpm
ocfs-support-1.1.5-1.i386.rpm
[root@node2-pub root]# rpm -Uvh /rpms/ocfs-2.4.21-EL-1.0.14-1.i686.rpm \
> /rpms/ocfs-tools-1.0.10-1.i386.rpm \
> /rpms/ocfs-support-1.1.5-1.i386.rpm
Preparing... ########################################### [100%]
1:ocfs-support ########################################### [ 33%]
2:ocfs-2.4.21-EL ########################################### [ 67%]
Linking OCFS module into the module path [ OK ]
3:ocfs-tools ########################################### [100%]
[root@node2-pub root]#
[root@node2-pub root]# cat /etc/ocfs.conf
#
# ocfs config
# Ensure this file exists in /etc
#
node_name = node2-prv
ip_address = 192.168.203.2
ip_port = 7000
comm_voting = 1
guid = 238426EC6845F952C83A00065BAEAE7F
Loading OCFS Module:
[root@node2-pub root]# load_ocfs
/sbin/modprobe ocfs node_name=node2-pub ip_address=192.168.203.2 cs=1843 guid=238426EC6845F952C83A00065BAEAE7F ip_port=7000 comm_voting=1
modprobe: Can't locate module ocfs
load_ocfs: insmod failed
If you get the above error follow the below steps to fix this:
Verify that you have ocfs.o module under /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ directory.
[root@node2-pub root]# ls -l /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o lrwxrwxrwx 1 root root 38 Dec 19 23:14 /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o -> /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
IF THIS FILE EXISTS THEN:
open the /sbin/load_ocfs file using vi or another editor and change the below line as shown.
(Line Number 93)
# If you must hardcode an absolute module path for testing, do it HERE.
# MODULE=/path/to/test/module/ocfsX.o
Change to:
# If you must hardcode an absolute module path for testing, do it HERE.
MODULE=/lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o
IF THIS FILE DOES NOT EXIST THEN:
Create a symbolic link as shown below.
mkdir /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/
ln -s /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o
Now try again to load the same module
[root@node2-pub root]# load_ocfs
If you get the error again then modify the /sbin/load_ocfs file as shown in the above step after creating the symbolic link
[root@node2-pub root]# load_ocfs
/sbin/insmod /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o node_name=node2-prv ip_address=192.168.203.2 cs=1843 guid=238426EC6845F952C83A00065BAEAE7F ip_port=7000 comm_voting=1
Warning: kernel-module version mismatch
/lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o was compiled for kernel version 2.4.21-27.EL
while this kernel is version 2.4.21-27.0.2.ELorafw1
Warning: loading /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o will taint the kernel: forced load
See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module ocfs loaded, with warnings
You may get the above warning but this may be OK. Verify that the ocfs module is loaded by executing the below command.
[root@node2-pub root]# lsmod | grep ocfs
ocfs 299104 0 (unused)
Creating and Mounting OCFS (Oracle Cluster File System):
Create the file system using mkfs:
Execute the below series of command from any one node to format the /dev/sda1 partition with the ocfs.
[root@node2-pub root]# mkfs.ocfs -F -b 128 -L /u02/oradata/ocr -m /u02/oradata/ocr -u 900 -g 901 -p 0755 /dev/sda1
Where b= blocksize
m= mountpoint
u= UID of oracle user
g=GID of oinstall group
p=permission
Mounting OCFS (Oracle Cluster File System): (Do this on both the node)
[root@node2-pub root]# mount -t ocfs /dev/sda1 /u02/oradata/ocr
Add the below line into the /etc/fstab file to mount the ocfs automatically on every reboot of system.
/dev/sda1 /u02/oradata/ocr ocfs _netdev 0 0
[root@node2-pub root]# service ocfs start
Loading OCFS: [ OK ]
[root@node2-pub root]# chkconfig ocfs on
Create the file system using "ocfstool" command line utility:
Please follow the GUI screenshots of creating and mounting ocfs file system. run the ocfstool from command as shown below:
Perform this step on Both the Nodes.
root@node1-pub root]# ocfstool
Check on the "Tasks" Button --> Select "Generate Config".
---> Select interface = eth1
port = 7000 and
Node Name = node1-prv (For node2, it would be node2-prv)
Confirm the changes by looking at the /etc/ocfs.conf file. The contents of this file should look like the example shown earlier.
Now, click on the "Tasks" button and select "Format". You will see a screen like the one below. Select the appropriate values and click the OK button. You need to perform this step from one node only.
Now /dev/sda1 is formatted with ocfs, and it is time to mount this file system. Perform this step on both the nodes. Click on the "Mount" button. You should see that /dev/sda1 is mounted under the /u02/oradata/ocr mountpoint. Also confirm that you see both nodes in the "Configured Nodes" section.
Add the below line into the /etc/fstab file on both the nodes to mount the ocfs automatically on every reboot.
/dev/sda1 /u02/oradata/ocr ocfs _netdev 0 0
I will show you how to create the ASM disks (stamping the disks as ASM) on the Firewire device. I am going to use one partition, /dev/sda2, for the ASM disks. I am going to use ASMLib IO for the ASM disks, and for that I need the oracleasm kernel driver as well as the binaries and support packages downloaded from Oracle's site. Please go to Creating and Configuring ASM instance and Database for detailed information on how to create an ASM instance and diskgroups and how to use them with an existing or new database. I downloaded the below rpms and installed them as the root user on both the nodes.
[root@node2-pub rpms]# rpm -Uvh oracleasm-support-2.0.1-1.i386.rpm \
> oracleasm-2.4.21-27.0.2.ELorafw1-1.0.4-1.i686.rpm \
> oracleasmlib-2.0.1-1.i386.rpm
Preparing... ########################################### [100%]
1:oracleasm-support ########################################### [ 33%]
2:oracleasm-2.4.21-27.0.2########################################### [ 67%]
3:oracleasmlib ########################################### [100%]
[root@node2-pub rpms]#
Enter the following command to run oracleasm init script with configure option on both the nodes.
[root@node2-pub root]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.
Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]:
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver file system: [ OK ]
Scanning system for ASM disks: [ OK ]
[root@node2-pub root]#
ONLY on one NODE:
[root@node2-pub root]# /etc/init.d/oracleasm createdisk DSK1 /dev/sda2
Marking disk "/dev/sda2" as an ASM disk: [ OK ]
[root@node2-pub root]# /etc/init.d/oracleasm listdisks
DSK1
Marking disk "/dev/sda2" as an ASM disk: [ OK ]
[root@node2-pub root]# /etc/init.d/oracleasm listdisks
DSK1
On the Remaining Nodes:
You only need to execute the below command for these disks to show up there.
[root@node1-pub root]# /etc/init.d/oracleasm scandisks
[root@node1-pub root]# /etc/init.d/oracleasm listdisks
DSK1
Binding the partitions with the raw devices (Both the RAC Nodes):
Add the below lines into /etc/sysconfig/rawdevices and restart the rawdevices service (on both the nodes).
[root@node2-pub root]# cat /etc/sysconfig/rawdevices
# raw device bindings
# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw2 /dev/sda2
[root@shree ~]# service rawdevices restart
Also, you need to change the ownership of these devices to oracle user.
[root@node2-pub root]# chown oracle.dba /dev/raw/raw2
[root@node2-pub root]# chmod 660 /dev/raw/raw2
Please add the below lines to the /etc/rc.local so that these are set back at reboot.
for i in `seq 2 2`
do
chown oracle.dba /dev/raw/raw$i
chmod 660 /dev/raw/raw$i
done
Before installing Oracle Real Application Clusters, we need to verify that the hangcheck-timer module is loaded and configured correctly. The hangcheck-timer module monitors the Linux kernel for extended operating system hangs that could affect the reliability of a RAC node and cause database corruption. If a hang occurs, the module restarts the node within seconds. The hangcheck_tick and hangcheck_margin parameters govern the behavior of the module:
The hangcheck_tick parameter defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default value is 60 seconds.
The hangcheck_margin parameter defines how long the hangcheck-timer waits, in seconds, for a response from Kernel. The Default value is 180 seconds.
If the kernel fails to respond within (hangcheck_tick + hangcheck_margin) seconds, the hangcheck-timer module restarts the system. For example, with hangcheck_tick=30 and hangcheck_margin=180 (as set below), an unresponsive node is restarted after 210 seconds.
Verify that the hangcheck-timer module is running:
(1) Enter the below command on each node.
[root@node2-pub root]# lsmod | grep hangcheck-timer
hangcheck-timer 2648 0 (unused)
hangcheck-timer 2648 0 (unused)
(2) If the module is not listed by the above command, then enter the below command to load the module on all the nodes.
[root@node2-pub root]# insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
(3) Also add the below line at the end of /etc/rc.local file to ensure that this module is loaded at every reboot.
insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Alternatively, you could also add the same into the /etc/modules.conf file as shown below.
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
This document explains the step by step process of installing Oracle 10g R2 (10.2.0.1) Clusterware Software using OUI.
Installing Oracle10g (10.2.0.1) Clusterware Software:
Task List:
If you are installing Oracle Clusterware on a node that already has a single-instance Oracle Database 10g installation, then stop the existing ASM instances. After Oracle Clusterware is installed, start up the ASM instances again. When you restart the single-instance Oracle database and then the ASM instances, the ASM instances use the Cluster Synchronization Services Daemon (CSSD) instead of the daemon for the single-instance Oracle database. You can upgrade some or all nodes of an existing Cluster Ready Services installation. For example, if you have a six-node cluster, then you can upgrade two nodes each in three upgrade sessions. Base the number of nodes that you upgrade in each session on the load the remaining nodes can handle. This is called a "rolling upgrade".
If a Global Services Daemon (GSD) from Oracle9i Release 9.2 or earlier is running, then stop it before installing Oracle Database 10g Oracle Clusterware by running the following command:
ORACLE_HOME/bin/gsdctl stop
Caution:
If you have an existing Oracle9i Release 2 (9.2) Oracle Cluster Manager (Oracle CM) installation, then do not shut down the Oracle CM service. Doing so prevents the Oracle Clusterware 10g Release 2 (10.2) software from detecting the Oracle9i Release 2 node list, and causes failure of the Oracle Clusterware installation.
If you have already installed Oracle software on your system, then OUI detects the existing Oracle Inventory directory from the
/etc/oraInst.loc file, and uses this location.
If you are installing Oracle software for the first time on your system, and your system does not have an Oracle inventory, then you are asked to provide a path for the Oracle inventory, and you are also asked the name of the Oracle Inventory group (typically,
oinstall).
Setting Up Oracle Environment:
Add the below line to the .bash_profile under the oracle home directory (usually /home/oracle) so that ORACLE_BASE is set in the session.
export ORACLE_BASE=/u01/app/oracle
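If you also want the instance SID set in the same session, a minimal sketch based on the SID names from the architecture section above (RacDB1 on node1-pub, RacDB2 on node2-pub):
export ORACLE_SID=RacDB1 # use RacDB2 on node2-pub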
Running OUI (Oracle Universal Installer) to install Oracle Clusterware:
Complete the following steps to install Oracle Clusterware on your cluster. You need to run the runInstaller from ONLY ONE node (any single node in the cluster).
Start the runInstaller command as the oracle user from any one node. When OUI displays the Welcome page, click Next.
Xlib: connection to ":0.0" refused by server
Xlib: No protocol specified
Can't connect to X11 window server using ':0.0' as the value of the DISPLAY variable.
If you get the above error, please execute the below command as root and then start the runInstaller by connecting as oracle.
[root@node1-pub root]# xhost +
access control disabled, clients can connect from any host
[root@node1-pub root]# su - oracle
[oracle@node1-pub oracle]$ /mnt/cdrom/runInstaller
CLICK Next
CLICK Next
CLICK Next
At this step, you should not receive any errors if you have completed the Pre-Installation steps correctly. I got one warning here complaining about having less memory than required: I had only 512 MB of RAM while the required memory is 1 GB, but I would not worry about this warning, so I checked the status box and continued.
CLICK Next
CLICK Next
Check whether the interface has the correct subnet mask and type associated with it. If you have configured the network for all the nodes correctly as explained in the Pre-Installation task, then you will not get any error message at this step.
CLICK Next
Enter the filename and location (mount point) for the OCR file. In the Pre-Installation steps, I configured ocfs for storing this file and used the same mount point (/u02/oradata/ocr) to store it. I chose External redundancy just for experimental purposes. On a production server, make sure that you have one extra mountpoint created on a separate physical device to store the mirror copy and avoid a single point of failure (SPOF).
CLICK Next
Use the same mount point as the OCR file and enter the filename you want for the Voting Disk file. If you choose External Redundancy, then you need to specify only one location.
CLICK Next
CLICK Next
At this step, you may get an error message complaining about a timestamp mismatch among the nodes. Make sure that all the nodes have timestamps as close as possible (try to match at least to the hh24:mi level).
When you execute the above scripts on all the nodes, you should get the below output.
NOTE: The node from which you install the Clusterware software is the node that gets registered as node 1 in the Cluster Registry. Here, I installed the clusterware from the machine called "node2-pub", so it became node 1 to the Clusterware. This does not change the behavior of RAC.
[root@node1-pub root]# /u01/app/oracle/oracle/product/10.2.0/crs/root.sh
WARNING: directory '/u01/app/oracle/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle/oracle' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
Checking to see if Oracle CRS stack is already configured
/etc/oracle does not exist. Creating it now.
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/u01/app/oracle/oracle/product/10.2.0' is not owned by root
WARNING: directory '/u01/app/oracle/oracle/product' is not owned by root
WARNING: directory '/u01/app/oracle/oracle' is not owned by root
WARNING: directory '/u01/app/oracle' is not owned by root
WARNING: directory '/u01/app' is not owned by root
WARNING: directory '/u01' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
assigning default hostname node1-pub for node 1.
assigning default hostname node2-pub for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: node1-pub node1-prv node1-pub
node 2: node2-pub node2-prv node2-pub
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
node1-pub
node2-pub
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Creating VIP application resource on (2) nodes...
Creating GSD application resource on (2) nodes...
Creating ONS application resource on (2) nodes...
Starting VIP application resource on (2) nodes...
Starting GSD application resource on (2) nodes...
Starting ONS application resource on (2) nodes...
Done.
[root@node1-pub root]#
CLICK OK Button
CLICK Exit
Verifying Virtual IP Network config:
Now, verify that the virtual IP is configured on eth0 by executing the below command.
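A simple check is to run ifconfig as root and look for the node's VIP address bound as an alias of eth0 (typically shown as eth0:1; the exact alias name may vary):
[root@node1-pub root]# ifconfig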
The following processes must be running in your environment after the Oracle Clusterware installation for Oracle Clusterware to function:
evmd: Event manager daemon that starts the racgevt process to manage callouts.
ocssd: Manages cluster node membership and runs as the oracle user; failure of this process results in a node restart.
crsd: Performs high availability recovery and management operations such as maintaining the OCR; it also manages application resources, runs as the root user, and restarts automatically upon failure.
This document explains the step by step process of installing the Oracle 10g Real Application Cluster (RAC) software with OUI.
Installing Oracle 10g R2 (10.2.0.1) Real Application Cluster (RAC) Software 32-bit on RHEL 3 / CentOS 3:
You MUST install this software from one node only. I usually prefer to install the Oracle software without creating the starter database. So, here I have selected the "Software Only" option.
Start the runInstaller command as the oracle user from any one node. When OUI displays the Welcome page, click Next.
[oracle@node2-pub oracle]$ /mnt/cdrom/runInstaller
Xlib: connection to ":0.0" refused by server
Xlib: No protocol specified
Can't connect to X11 window server using ':0.0' as the value of the DISPLAY variable.
If you get the above error, please execute the below command as root and then start the runInstaller by connecting as oracle.
[root@node2-pub root]# xhost +
access control disabled, clients can connect from any host
[root@node2-pub root]# su - oracle
[oracle@node2-pub oracle]$ /mnt/cdrom/runInstaller
Click Next
Select "Custom" and Click Next.
Click Next
Select all the nodes and Click Next
Click Next
At this step, you should not receive any errors if you have completed the Pre-Installation steps correctly. I got one warning here complaining about having less memory than required: I had only 512 MB of RAM while the required memory is 1 GB, but I would not worry about this warning, so I checked the status box and continued.
Click Next
Leave the Default values (dba, dba) and Click Next
Select "Install database Software Only" and Click Next
Click Install
Execute the mentioned script on all the nodes.
Click Exit
At this time, you need to update the .bash_profile file with the ORACLE_HOME and PATH values as shown below. This file needs to be updated on all the nodes.
[oracle@node2-pub oracle]$ cat .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
unset USERNAME
export ORACLE_BASE=/u01/app/oracle
export ORACLE_SID=RacDB2 # For Node Node1-pub -- RacDB1
export ORACLE_HOME=$ORACLE_BASE/oracle/product/10.2.0/db_1
export PATH=$PATH:$ORACLE_HOME/bin
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
This document explains the step by step process of Creating Oracle 10g Real Application Cluster (RAC) database using ASM storage option. In the previous steps, I have installed the Oracle 10g software as well as stamped the physical devices /dev/sda2 and /dev/sda3 as ASM disks which I have used for storing database files.
Creating Oracle 10g (10.2.0.1) Real Application Cluster (RAC) Database:
Task List:
Creating and Configuring Clustered ASM instance and Diskgroups
Creating Clustered database
[oracle@node2-pub oracle]$ dbca
Xlib: connection to ":0.0" refused by server
Xlib: No protocol specified
Can't connect to X11 window server using ':0.0' as the value of the DISPLAY variable.
If you get the above error, please execute the below command as root and then start the dbca by connecting as oracle.
[root@node2-pub root]# xhost +
access control disabled, clients can connect from any host
[root@node2-pub root]# su - oracle
[oracle@node2-pub oracle]$ dbca
Now follow the steps below:
If you do not see this screen after running "dbca", make sure that the ocssd, crsd and evmd services are running on all the nodes; a quick check is shown below. Please go to Installing 10g R2 Clusterware Software for more information on configuring clusterware.
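A quick way to confirm this is to look for the daemons and query the CRS stack; a sketch, assuming the Clusterware home used earlier in this document (/u01/app/oracle/oracle/product/10.2.0/crs):
[oracle@node2-pub oracle]$ ps -ef | grep -E 'evmd|ocssd|crsd' | grep -v grep
[oracle@node2-pub oracle]$ /u01/app/oracle/oracle/product/10.2.0/crs/bin/crs_stat -t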
Select "Oracle Real Application Cluster database" option and Click Next.
Creating and Configuring Clustered ASM instances and diskgroups:
For detailed information on creating ASM Instance and Diskgroups, please go to Step By Step Instructions on Creating and Configuring Automatic Storage Management (ASM) Instance ands Disk Groups using Unix IO as well as ASMLib IO.
I selected "Configure Automatic Storage Management" option to create ASM instance and diskgroup where the Clustered database files will be stored in.
Select All the nodes in the cluster and Click Next.
At this step, enter the SYS password for the ASM instance. Also enter the location of the spfile or init file for the ASM instance. If you go with the spfile option, it has to be created on a shared device and must be accessible from all the nodes in the RAC. I used the same mount point where the OCR and Voting Disk files are stored; it is /u02/oradata/ocr and is formatted with ocfs.
At this point, you may get the below message stating that no RAC listener was found. You can go back and configure the RAC listener using "netca", or you can simply select "Yes" and OUI will create one for us. For experimental purposes, I selected "Yes". For my production database at a client site, I configured the listener with the required properties and security using netca before reaching this step.
This step will create an ASM instance for us.
Now the ASM instance has been created, but there are no diskgroups under it yet. The below step allows you to create one. You will not see any physical device in the pane below; do not worry, simply click the "Create New" button to create one.
You need to change the disk discovery path to ORCL:* as shown below.
I selected the ORCL:DSK1 disk with External Redundancy. As all the ASM disks (DSK1 --> /dev/sda2 and DSK2 --> /dev/sda3) are on the same physical disk (/dev/sda), I did not select normal redundancy. Also, as this is an experimental RAC, I would not worry about a single point of failure. For my production database, I had separate physical devices available for a Normal redundancy setup. Enter the Disk Group Name (e.g., RAC_GRP) and click Next.
You will see the below screen while the ASM diskgroup is being created.
Select the newly created disk group. The state must be "MOUNTED (n/n)" for an n-node cluster. For our 2-node cluster, it is MOUNTED (2/2).
Select No and exit the Installer.
Creating RAC Database on ASM:
Select the RAC option and Click Next.
Select "Create a Database" and click Next.
Select all the nodes among which the new database will be shared.
Choose any one of these options. I selected the General Purpose database option.
Enter the Global Database name and SID prefix.
Select Database control for database management and click next.
Enter the username and password for the SYS schema for the new database.
Select ASM as a Storage mechanism.
Select the Disk group where the database files will be stored in. e.g., RAC_GRP.
Specify the Flash recovery area.
Create the new service for the RacDB database by clicking the "Add" button. Enter the name you like; I entered RacDB_srvc as the service name. You will see two instance names in the right pane as shown below. Click Next.
Click OK.
Click Next.
Click Next.
Make sure that all the datafiles, controlfiles and redo log files have right locations assigned.
Click Finish
At this time, the database RacDB has already been opened on the node1-pub machine under the RacDB1 instance. Copy the initRacDB1.ora file from node1-pub to node2-pub into the $ORACLE_HOME/dbs directory (if it does not already exist). Change the contents of this file as shown below and start the database using the "startup" command from sqlplus. This command starts the RacDB2 instance, gets the location of the UNDOTBS2 datafile and the redo log thread 2 members (mounting the database), and then opens them (opening the database). The rest of the datafiles are shared by the two instances.
At this time, also check the status of the listener on all the nodes.
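A minimal check from each node, assuming the listener created by netca/dbca is in the default configuration:
[oracle@node1-pub oracle]$ lsnrctl status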
Make sure that all the nodes are up and running by looking at the below information from both the nodes.
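One way to gather this information is with srvctl and crs_stat; a sketch, assuming the database and service names used above (RacDB, RacDB_srvc) and the Clusterware home from earlier:
[oracle@node1-pub oracle]$ srvctl status database -d RacDB
[oracle@node1-pub oracle]$ srvctl status service -d RacDB -s RacDB_srvc
[oracle@node1-pub oracle]$ /u01/app/oracle/oracle/product/10.2.0/crs/bin/crs_stat -t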