Install RH6 with a minimal configuration on two VMs (see the HOWTO on aligning VMware Linux VMDK files). Add a (VMDK) disk to every node for the application. My configuration looks as follows (on both nodes):
/dev/sda 128m -> First partition /dev/sda1 used for /boot
/dev/sdb 8g   -> Whole disk used as PV for rootvg
/dev/sdc 30g  -> Whole disk used as PV for the GFS2 shared file system
Build a single /etc/hosts containing all cluster-relevant IPs and names and put it on both nodes.
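For illustration only, a minimal /etc/hosts might look like the sketch below. The 192.168.100.x management addresses and the *-repl names are assumptions; the 10.10.10.x addresses match the dedicated replication LAN used by DRBD later on.

192.168.100.240   vorh6t01.domain.com   vorh6t01
192.168.100.241   vorh6t02.domain.com   vorh6t02
10.10.10.240      vorh6t01-repl
10.10.10.241      vorh6t02-repl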
Copy the host SSH keys from one node to the other:
vorh6t01 # scp vorh6t02:/etc/ssh/ssh_host_\* /etc/ssh/
...
vorh6t01 # service sshd restart
Generate root SSH keys and exchange them between the cluster nodes:
vorh6t01 # ssh-keygen -t rsa -b 1024 -C "root@vorh6t"
.....
vorh6t01 # cat .ssh/id_rsa.pub >> .ssh/authorized_keys
vorh6t01 # scp -pr .ssh vorh6t02:
DRBD (Distributed Replicated Block Device) will make the two /dev/sdc disks, one dedicated to each node, behave like shared storage. This makes our VMs storage independent and allows the RH cluster to work.
There is still no binary distribution for RH6; however, you can purchase it with support from the author, LINBIT. And you can still compile it from source (thanks to the GPL):
# yum install make gcc kernel-devel flex rpm-build libxslt
# cd /tmp && wget -q -O - http://oss.linbit.com/drbd/8.4/drbd-8.4.4.tar.gz | tar zxvf -
# cd drbd-8.4.4/
# ./configure --with-utils --with-km --with-udev --with-rgmanager --with-bashcompletion \
    --prefix=/usr --localstatedir=/var --sysconfdir=/etc
# make
# make install
Note: you have to recompile the kernel module every time you upgrade the kernel:
# make module
DRBD will not be part of the cluster in this configuration. It only supplies the infrastructure for the GFS2 cluster running on top of it: it works with the raw disks, is configured active-active, and provides a raw block device that simulates a shared storage disk.
You can put everything in /etc/drbd.conf; however, the practice recommended by LINBIT is to separate the common and resource configuration using include directives:
# cat /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
Copy global_common.conf from the distribution to /etc/drbd.d and edit it to fit your needs:
# cat /etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    handlers {
    }
    startup {
        wfc-timeout 300;
        degr-wfc-timeout 0;
        become-primary-on both;
    }
    options {
    }
    disk {
    }
    net {
        protocol C;
        cram-hmac-alg sha1;
        shared-secret "9szdFmSkQEoXU1s7UNVbpqYrhhIsGjhQ4MxzNeotPku3NkJEq3LovZcHB2pITRy";
        use-rle yes;
        allow-two-primaries yes;
    }
}
Some security is not a bad idea, hence the "shared-secret".
# cat /etc/drbd.d/export.res
resource export {
    device    /dev/drbd1;
    disk      /dev/sdc;
    meta-disk internal;
    disk {
        resync-rate 40M;
        fencing resource-and-stonith;
    }
    net {
        csums-alg sha1;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    handlers {
        fence-peer "/usr/lib/drbd/rhcs_fence";
    }
    on vorh6t01.domain.com {
        address 10.10.10.240:7789;
    }
    on vorh6t02.domain.com {
        address 10.10.10.241:7789;
    }
}
I've added a dedicated 10.10.10/24 LAN NIC to both VMs for replication use only.
The disk is referred to here as /dev/sdc; this name is not necessarily persistent across reboots, so you can use any of the more persistent references found in /dev/disk/by-{id,label,path,uuid}.
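For example, the disk line of the resource definition above could use a by-path reference instead of /dev/sdc. The path shown here is hypothetical; list yours with ls -l /dev/disk/by-path/:

    # hypothetical persistent name; find yours with: ls -l /dev/disk/by-path/
    disk /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:2:0;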
Replicate the configuration to the second node:
root@vorh6t01:~ # scp -pr /etc/drbd.* root@vorh6t02:/etc/
Initialize DRBD:
root@vorh6t01:~ # drbdadm create-md export
...
root@vorh6t02:~ # drbdadm create-md export
...
root@vorh6t01:~ # drbdadm up export
root@vorh6t02:~ # drbdadm up export
# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@vorh6t01.domain.com, 2014-03-18 12:05:58
 1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:31456284
As you can see, it is in the Connected state, with both sides marked Secondary and Inconsistent.
Let's help DRBD make a decision:
root@vorh6t01:~ # drbdadm primary --force export
root@vorh6t01:~ # cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@vorh6t01.domain.com, 2014-03-18 12:05:58
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:2169856 nr:0 dw:0 dr:2170520 al:0 bm:132 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:27475996
        [>...................] sync'ed:  7.4% (26832/28948)M
        finish: 0:11:03 speed: 41,416 (27,464) K/sec
OK, vorh6t01 has become Primary and UpToDate, and synchronization has begun.
Wait for the initial synchronization to finish, then make the second node primary too:
root@vorh6t02:~ # cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@vorh6t02.domain.com, 2014-09-23 07:12:46
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:31456284 al:0 bm:1920 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
root@vorh6t02:~ # drbdadm primary export
root@vorh6t02:~ # cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@vorh6t02.domain.com, 2014-09-23 07:12:46
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:31456948 al:0 bm:1920 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Fix the chkconfig line of the /etc/init.d/drbd script on both nodes. Also remove any hint lines between ### BEGIN INIT INFO and ### END INIT INFO. This fix adjusts the drbd start/stop order to the correct place (for RH6), between network and clvmd:
...
# chkconfig: 2345 23 77
...
### BEGIN INIT INFO
# Provides: drbd
### END INIT INFO
...
Make DRBD start at boot time on both nodes:
# chkconfig --add drbd
# chkconfig drbd on
Install these RPMs on both nodes (with all dependencies):
# yum install lvm2-cluster ccs cman rgmanager gfs2-utils
vorh6t01 and vorh6t02 are the two nodes of the cluster named vorh6t. Take care to make all names resolvable by DNS and add all names to /etc/hosts on both nodes.
Define the cluster:
# ccs_tool create -2 vorh6t
The command above creates the /etc/cluster/cluster.conf file. It can be edited by hand and has to be redistributed to every node in the cluster. The -2 option is required for a two-node cluster; the usual configuration assumes more than two nodes so that quorum is unambiguous.
Open the file and change the node names to the real names. The resulting file should look like this:
<?xml version="1.0"?> <cluster name="vorh6t" config_version="1"> <cman two_node="1" expected_votes="1" transport="udpu" /> <clusternodes> <clusternode name="vorh6t01.domain.com" votes="1" nodeid="1"> <fence> <method name="single"> </method> </fence> </clusternode> <clusternode name="vorh6t02.domain.com" votes="1" nodeid="2"> <fence> <method name="single"> </method> </fence> </clusternode> </clusternodes> <fencedevices> </fencedevices> <rm> <failoverdomains/> <resources/> </rm> </cluster>
I am using transport="udpu" here because my network does not support multicast, and broadcasts are not welcome either. Without this option my cluster behaves unpredictably. Check:
# ccs_tool lsnode
Cluster name: vorh6t, config_version: 1

Nodename                         Votes Nodeid Fencetype
vorh6t01.domain.com                  1      1
vorh6t02.domain.com                  1      2

# ccs_tool lsfence
Name             Agent
Copy /etc/cluster/cluster.conf to the second node:
vorh6t01 # scp /etc/cluster/cluster.conf vorh6t02:/etc/cluster/cluster.conf
You can start the cluster services now to see them working. Start them with /etc/init.d/cman start on both nodes. Check /var/log/messages. See the clustat output:
vorh6t01 # clustat
Cluster Status for vorh6t @ Thu Sep 27 15:04:58 2012
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 vorh6t01.domain.com                     1 Online, Local
 vorh6t02.domain.com                     2 Online

vorh6t02 # clustat
Cluster Status for vorh6t @ Thu Sep 27 15:05:07 2012
Member Status: Quorate

 Member Name                          ID   Status
 ------ ----                          ---- ------
 vorh6t01.domain.com                     1 Online
 vorh6t02.domain.com                     2 Online, Local
Add the cluster services to the init scripts so that the cluster and resource manager start on both nodes:
# chkconfig --add cman
# chkconfig cman on
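The lines above cover only cman; if you also want rgmanager (the resource manager mentioned above) to start at boot, the equivalent would presumably be:

# chkconfig --add rgmanager
# chkconfig rgmanager on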
This cluster will not manage any resources; it just provides the infrastructure for the shared clustered file system, so no additional configuration is required. Fencing, however, should probably be added.
Enable the cluster features on both nodes and start clvmd:
# lvmconf --enable-cluster
# /etc/init.d/clvmd start
# chkconfig --add clvmd
# chkconfig clvmd on
Fix the filter line in /etc/lvm/lvm.conf to explicitly include the DRBD device and exclude the others. If LVM were to lock the underlying device, DRBD would not start, and a split brain would occur. Here is an example of my "filter" line, allowing only the "rootvg" device and the DRBD device:
filter = [ "a|^/dev/drbd|", "a|^/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:0:0|", "r/.*/" ]
As recommended by the DRBD documentation, disable the LVM write cache on both nodes by fixing /etc/lvm/lvm.conf:
...
write_cache_state = 0
...

...and drop the stale cache:
# rm -f /etc/lvm/cache/.cache
Create the PV and the clustered LV on one node. Pay attention to use our /dev/drbd1 device, not the underlying /dev/sdc:
root@vorh6t01:~ # pvcreate --dataalignment 4k /dev/drbd1
  Physical volume "/dev/drbd1" successfully created
root@vorh6t01:~ # vgcreate exportvg /dev/drbd1
  Clustered volume group "exportvg" successfully created
root@vorh6t01:~ # lvcreate -n export -l100%FREE /dev/exportvg
  Logical volume "export" created
Check with the pvs, vgs and lvs commands that everything exists on the second node too.
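For example, run the same reporting commands on the second node (output omitted here; it should show exportvg and the export LV):

root@vorh6t02:~ # pvs
root@vorh6t02:~ # vgs
root@vorh6t02:~ # lvs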
Create the GFS2 file system on one node as follows:
vorh6t01:~ # mkfs.gfs2 -p lock_dlm -t vorh6t:export -j 2 /dev/exportvg/export
This will destroy any data on /dev/exportvg/export.
It appears to contain: symbolic link to `../dm-6'
Are you sure you want to proceed? [y/n] y
Device:                    /dev/exportvg/export
Blocksize:                 4096
Device Size                30.00 GB (7863296 blocks)
Filesystem Size:           30.00 GB (7863294 blocks)
Journals:                  2
Resource Groups:           120
Locking Protocol:          "lock_dlm"
Lock Table:                "vorh6t:export"
UUID:                      43b39c8b-cb8b-f7d7-c35d-91a909bc3ade
where vorh6t is the cluster name, export is the file system name, and -j 2 creates two journals because we have two nodes.
Then, mount it:
# mkdir /export
# mount -o noatime,nodiratime -t gfs2 /dev/exportvg/export /export
# echo "/dev/exportvg/export /export gfs2 noatime,nodiratime 0 0" >> /etc/fstab
# chkconfig --add gfs2 ; chkconfig gfs2 on
The /etc/init.d/gfs2 script, part of gfs2-utils, will mount/umount GFS2 file systems from /etc/fstab at the appropriate time: after the cluster has started and before it goes down.
Reboot the nodes and check that "/export" is mounted after reboot. If it is not, re-check that you have the correct line in /etc/fstab, the correct "filter" in /etc/lvm/lvm.conf, and that you fixed /etc/init.d/drbd to start/stop between network and clvmd (check the actual numbers in /etc/rc.d/rc{1,3}.d). All of these were described above.
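One quick way to verify the start/stop ordering, for instance, is to list the relevant rc symlinks and compare the numbers (the numbers on your system may differ):

# ls /etc/rc.d/rc3.d/ | egrep 'network|drbd|clvmd|cman|gfs2'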
Prerequisites:
# yum install openssl-devel
Install the VI Perl Toolkit on both nodes; VMware sometimes calls it the vSphere SDK, CLI or whatever. It should install /usr/lib/vmware-vcli/apps/ and other tools in /usr/bin. The package called "VMware-vSphere-Perl-SDK-5.5.0*" was OK for me.
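For reference, the installation is roughly as follows (the exact tarball and directory names depend on the build you downloaded):

# tar zxvf VMware-vSphere-Perl-SDK-5.5.0-*.tar.gz
# cd vmware-vsphere-cli-distrib
# ./vmware-install.pl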
Fix /etc/cluster/cluster.conf:
<?xml version="1.0"?> <cluster name="vorh6t" config_version="3"> <cman two_node="1" expected_votes="1" transport="udpu" /> <clusternodes> <clusternode name="vorh6t01.domain.com" votes="1" nodeid="1"> <fence> <method name="single"> <device name="vmware" port="vorh6t01" /> </method> </fence> </clusternode> <clusternode name="vorh6t02.domain.com" votes="1" nodeid="2"> <fence> <method name="single"> <device name="vmware" port="vorh6t02" /> </method> </fence> </clusternode> </clusternodes> <fencedevices> <fencedevice name="vmware" agent="fence_vmware" ipaddr="VCNAME" action="off" login="VCUSER" passwd="PASSWORD" /> </fencedevices> <rm> <failoverdomains/> <resources/> </rm> </cluster>
port is the name of the VM in vCenter (VC); ipaddr is the name or IP of the vCenter server.
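You can test the fence agent by hand before trusting it; a status query along these lines should work (VCNAME, VCUSER and PASSWORD are the same placeholders as in the config above):

# fence_vmware -a VCNAME -l VCUSER -p PASSWORD -n vorh6t01 -o status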
Copy to the neighbour and propagate the changes:
vorh6t01:~ # scp /etc/cluster/cluster.conf vorh6t02:/etc/cluster/cluster.conf
vorh6t01:~ # cman_tool version -r -S
Split brain may occur while playing with the cluster until it is configured perfectly.
Let's assume the data on node 02 is not important and will be dropped:
vorh6t02:~ # umount /export
vorh6t01:~ # umount /export
vorh6t02:~ # vgchange -a n exportvg
vorh6t02:~ # drbdadm secondary export
vorh6t02:~ # drbdadm connect --discard-my-data export
vorh6t01:~ # drbdadm connect export
vorh6t01:~ # cat /proc/drbd
vorh6t02:~ # drbdadm primary export
vorh6t02:~ # vgchange -ay exportvg
vorh6t02:~ # mount /export
vorh6t01:~ # mount /export