Mike Conigliaro

Notes on setting up a GFS cluster on Redhat ES 3

These are my notes on installing and configuring GFS from source RPMs. I haven’t worked with GFS in over a year, so I don’t know if this information is still accurate, but I’m posting it anyway in the hope that someone out there will find it useful. When following these instructions, your best bet is to run each command on each node before moving on to the next step (unless otherwise specified, or unless you know what you’re doing).

1.) Get the GFS and perl-Net-Telnet SRPMs from Redhat.

ftp://ftp.redhat.com/pub/redhat/linux/enterprise/3/en/RHGFS/i386/SRPMS/
ftp://ftp.redhat.com/pub/redhat/linux/updates/enterprise/3ES/en/RHGFS/SRPMS/

2.) Install the perl-Digest-HMAC and perl-Digest-SHA1 RPMs.

3.) Build and install the perl-Net-Telnet SRPM.

rpmbuild --rebuild perl-Net-Telnet-3.03-2.src.rpm
rpm -Uvh /usr/src/redhat/RPMS/noarch/perl-Net-Telnet-3.03-2.noarch.rpm

4.) Each GFS node needs to be running clock synchronization software to prevent unnecessary inode timestamp updates (which, according to the manual, will severely impact performance), so you need to download and install the NTP RPM.

rpm -Uvh ntp-4.1.2-4.EL3.1.i386.rpm

5.) Sync up your clock for the first time:

ntpdate 10.25.1.36

6.) Add the following lines to /etc/ntp.conf.

restrict pool.ntp.org mask 255.255.255.255 nomodify notrap noquery
server pool.ntp.org

7.) Start ntpd.

/etc/init.d/ntpd start

8.) Verify that ntpd is syncing with your NTP server(s). When you do this, make sure your jitter values are in the low single digits. They should definitely not be 4000, which means that NTP is not working at all.

ntpq -p

9.) Make sure you have the kernel, kernel-smp, and kernel-source RPMs installed.

rpm -q kernel kernel-smp kernel-source

10.) Install the GFS SRPM.

rpm -Uvh GFS-6.0.2-25.src.rpm

11.) Check your current kernel version.

uname -a

12.) Open /usr/src/redhat/SPECS/gfs-build.spec and look for the line that starts with %define KERNEL_EXTRAVERSION. You may need to change this to match the “extraversion” of your kernel (the part of `uname -r` after the base version). You should also look for the line that says %define buildhugemem 1 and set it to 0 (unless you have a machine with >16GB of memory and the hugemem kernel installed).
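If you’d rather script those two %define changes than edit the spec file by hand, something like the following works. This is only a sketch: it operates on a two-line sample file here so it can be run safely, and “-37.EL” is an example extraversion; on a real node, point SPEC at /usr/src/redhat/SPECS/gfs-build.spec and use the value from your own `uname -r`.

```shell
# Sample spec file stand-in; on a real node use:
#   SPEC=/usr/src/redhat/SPECS/gfs-build.spec
SPEC=$(mktemp)
printf '%%define KERNEL_EXTRAVERSION -4.EL\n%%define buildhugemem 1\n' > "$SPEC"

# "-37.EL" is an example; substitute your kernel's extraversion.
sed -i 's/^%define KERNEL_EXTRAVERSION.*/%define KERNEL_EXTRAVERSION -37.EL/' "$SPEC"

# Disable the hugemem build unless you actually run the hugemem kernel.
sed -i 's/^%define buildhugemem 1/%define buildhugemem 0/' "$SPEC"

# Verify both changes took effect.
grep '^%define' "$SPEC"
```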

13.) Create a new SRPM with all the changes you made.

rpmbuild -bs /usr/src/redhat/SPECS/gfs-build.spec

14.) Build the GFS RPMs. Don’t forget to use the --target i686 option, or the SMP modules will not be installed.

rpmbuild --rebuild --target i686 /usr/src/redhat/SRPMS/GFS-6.0.2-25.src.rpm

15.) Install the GFS RPMs.

rpm -Uvh /usr/src/redhat/RPMS/i686/*6.0.2-25.i686.rpm

16.) Try manually loading the GFS modules into the kernel. If the modules are loaded successfully, you should see them (along with all the other loaded kernel modules) in the output of lsmod.

depmod -a
modprobe pool
modprobe lock_gulm
modprobe gfs
lsmod

17.) At this point, the clustering software is installed and simply needs to be configured. Now you need to create three config files (cluster.ccs, fence.ccs, and nodes.ccs) for the cluster configuration system (CCS). These files should be placed in a temporary directory by themselves on one node (I used /root/cluster). This is a fairly straightforward process, so I won’t repeat what chapter 6 of the Redhat GFS Administrator’s Guide already covers in detail.
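For orientation, here is roughly what a minimal cluster.ccs looks like. This is an illustration only: the node names are made up, and the cluster name matches the one used with gfs_mkfs later in these notes; see the Administrator’s Guide for the full syntax and for fence.ccs and nodes.ccs.

```
cluster {
    name = "Cluster1"
    lock_gulm {
        servers = ["node1", "node2", "node3"]
    }
}
```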

18.) Once the CCS files have been created on one of the nodes, you should probably run a syntax check on them.

ccs_tool test /root/cluster

19.) Next, you need to create a “cluster configuration archive” (CCA) from the CCS files and write it to a “cluster configuration archive device” (which is just a fancy name for a partition that all nodes have access to). A pool volume can be used for this, but I had the luxury of a 2.5TB iSCSI storage array, so I just created a 2MB partition on that. Use the ccs_tool command to create the archive on the storage device of your choice. Note that ccs_tool writes these files in its own raw format, so there’s no need to format the partition. Also note that I was unable to create new CCS archives without having valid DNS records for all nodes.

ccs_tool create /root/cluster /dev/iscsi/bus0/target0/lun0/part1

20.) Tell Redhat’s init scripts where to find the ccs archive.

echo "CCS_ARCHIVE=\"/dev/iscsi/bus0/target0/lun0/part1\"" >/etc/sysconfig/gfs

21.) Now start the ccs daemons.

service ccsd start

22.) Start the lock_gulm server daemons.

service lock_gulmd start

23.) Create the GFS filesystems from one node.

gfs_mkfs -p lock_gulm -t Cluster1:gfs1 -j 8 /dev/iscsi/bus0/target0/lun0/part2

24.) Add your GFS filesystems to /etc/fstab with a fstype of gfs.
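For example, assuming the partition formatted above and the /mnt/volume mount point used later in these notes, the fstab entry would look something like this (options are a guess at a sane default; adjust to taste):

```
/dev/iscsi/bus0/target0/lun0/part2  /mnt/volume  gfs  defaults  0 0
```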

25.) Mount your GFS filesystems. Note that there is a known bug in some versions of GFS (related to mounting shared volumes) where node hostnames must be unique in the first 8 characters.

service gfs start
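As a quick sanity check for the 8-character hostname caveat above, you can truncate each hostname to eight characters and look for collisions. The node names here are made up; any line of output is a collision (in this example it prints gfsnode0, because the first two names collide).

```shell
# Hypothetical node hostnames; substitute your own.
NODES="gfsnode01 gfsnode02 storage1"

# Truncate to 8 characters; uniq -d prints any duplicates.
echo "$NODES" | tr ' ' '\n' | cut -c1-8 | sort | uniq -d
```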

26.) Congratulations, you have a cluster! At this point, you should test moving files from individual nodes to the shared volume. All other nodes in the cluster should immediately be able to see these files. You might also try comparing the md5 sum of the file before it was moved to the md5 sum of the file after it was moved, just to make sure nothing weird is going on.

From node 1:

md5sum file.tgz
cp file.tgz /mnt/volume

From node 2:

md5sum /mnt/volume/file.tgz

That’s it! For more info, RTFM! ;-)

Caveats

1.) I encountered a problem due to network latency in which iSCSI sessions were not consistently being established before the GFS scripts tried to access the volumes. The ultimate solution was to first disable the init scripts…

chkconfig --level 0123456 iscsi off
chkconfig --level 0123456 ccsd off
chkconfig --level 0123456 lock_gulmd off
chkconfig --level 0123456 gfs off

…then add the following to /etc/rc.local.

sleep 45
service iscsi start
service ccsd start
service lock_gulmd start
service gfs start

2.) The fence_apc fencing method does not officially support the APC switch I was using, but I came up with a workaround (which can be found on the bug report I submitted to Redhat). This workaround was successful with the fence_apc script from version 6.0.0-1.2, but not with the one from 6.0.2-25. When upgrading, I needed to copy over the old fence_apc script (on the master lock server only):

cp /usr/src/redhat/SOURCES/gfs-build/bedrock/fence/agents/apc/fence_apc.pl /sbin/fence_apc