DRBD Resource Settings for SLE15 HA Cluster

This article is part of the HA NFS server on SLE15 solution, so please refer to that article for any missing information. It covers only the DRBD resource setup, but may be useful on its own.

Configuring DRBD before clustering

DRBD provides a replicated block device: it keeps a local disk on each node in sync with its peer over the network. This makes it well suited for a stretch cluster that cannot have a real shared disk. Please note that such a stretch cluster must use a witness device (QDevice/QNetd is recommended).

The DRBD software is already installed as part of the ha_sles template. No further action is required.
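
A quick sanity check that both the userland tools and the kernel module are available (output will differ between service packs):

# drbdadm --version
# modinfo drbd | grep -E '^(filename|version)'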

Create identical local disks on both nodes for DRBD to use:

# pvs
  PV         VG     Fmt  Attr PSize  PFree 
  /dev/vda1  rootvg lvm2 a--  60.00g 48.00g
# lvcreate -L10g -n drbd rootvg
  Logical volume "drbd" created.
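
The backing LV has to exist on the second node as well. Assuming nfs2 has an identical rootvg with enough free space, repeat the step there and verify that the sizes match:

# lvcreate -L10g -n drbd rootvg
# lvs rootvg/drbd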

Create a DRBD resource file /etc/drbd.d/nfsvol.res. Note that DRBD also uses the term "resource", just like the cluster, so don't confuse the two.

resource nfsvol {
   disk {
      on-io-error       pass_on;
      resync-rate       100M;
   }
   net {
      protocol  C;
      cram-hmac-alg   sha1;
      csums-alg sha1;
      shared-secret   "9szdFmSkQEoXU1s7UNVbpqYrhhIsGjhQ4MxzNeotPku3NkJEq3LovZcHB2pITRy";
      fencing resource-and-stonith;
   }
   handlers {
      fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
      after-resync-target "/usr/lib/drbd/crm-unfence-peer.9.sh";
   }
   startup {
      wfc-timeout 100;
      degr-wfc-timeout 120;
   }
   connection-mesh {
      hosts     nfs1 nfs2;
   }
   on nfs1 {
      address   192.168.120.11:7788;
      device    drbd0;
      disk      /dev/rootvg/drbd;
      meta-disk internal;
      node-id   0;
   }
   on nfs2 {
      address   192.168.120.12:7788;
      device    drbd0;
      disk      /dev/rootvg/drbd;
      meta-disk internal;
      node-id   1;
   }
}

The fencing policy and the fence-peer/after-resync-target handlers are the options SUSE recommends for cluster integration. The cram-hmac-alg and shared-secret options enable peer authentication, and csums-alg enables checksum-based resync; note that DRBD does not encrypt the replication traffic itself. Replace the shared-secret with a unique value of your own.
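
One way to generate a sufficiently random secret (an example, not a requirement; keep it within DRBD's 64-character limit for shared-secret, which a 48-byte base64 string fits exactly):

# openssl rand -base64 48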

Copy the file to the other node, for example using csync2:

# csync2 -xv
Marking file as dirty: /etc/drbd.d/nfsvol.res
Connecting to host nfs2 (SSL) ...
Connect to 192.168.120.12:30865 (nfs2).
Updating /etc/drbd.d/nfsvol.res on nfs2 ...
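
If csync2 is not set up, a plain scp to the peer works just as well. Either way, it is worth checking on both nodes that DRBD parses the file cleanly before going further:

# scp /etc/drbd.d/nfsvol.res nfs2:/etc/drbd.d/
# drbdadm dump nfsvol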

After creating the configuration file on both nodes, initialize the DRBD metadata and bring the device up.

root@nfs1:~ # drbdadm create-md nfsvol 
initializing activity log
initializing bitmap (320 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.
root@nfs1:~ # drbdadm up nfsvol 
root@nfs1:~ # drbdadm status nfsvol 
nfsvol role:Secondary
  disk:Inconsistent
  nfs2 connection:Connecting

Continue initializing the peer device on the second node:

root@nfs2:~ # drbdadm create-md nfsvol 
initializing activity log
initializing bitmap (320 KB) to all zero
Writing meta data...
New drbd meta data block successfully created.
root@nfs2:~ # drbdadm up nfsvol 
root@nfs2:~ # drbdadm status nfsvol 
nfsvol role:Secondary
  disk:Inconsistent
  nfs1 role:Secondary
    peer-disk:Inconsistent

According to the status, both devices are connected, but neither side has an up-to-date primary copy yet. Declare the first node as primary; this forces the initial synchronization from it to the peer.

root@nfs1:~ # drbdadm primary --force nfsvol 
root@nfs1:~ # drbdadm status nfsvol 
nfsvol role:Primary
  disk:UpToDate
  nfs2 role:Secondary
    replication:SyncSource peer-disk:Inconsistent done:2.96
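
The initial synchronization can take a while, depending on the device size and the configured resync-rate. To follow its progress until both sides report UpToDate, simply poll the status, for example:

root@nfs1:~ # watch -n5 drbdadm status nfsvol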

The DRBD device will be managed by the cluster software, so once the initial synchronization has completed, stop the device on both nodes.

root@nfs1:~ # drbdadm status nfsvol 
nfsvol role:Primary
  disk:UpToDate
  nfs2 role:Secondary
    peer-disk:UpToDate

root@nfs1:~ # drbdadm down nfsvol 
root@nfs1:~ # drbdadm status nfsvol 
# No currently configured DRBD found.
nfsvol: No such resource
Command 'drbdsetup status nfsvol' terminated with exit code 10
root@nfs1:~ # rm -f /var/lock/drbd-147-0

Also on the second node:

root@nfs2:~ # drbdadm status nfsvol 
nfsvol role:Secondary
  disk:UpToDate
  nfs1 connection:Connecting

root@nfs2:~ # drbdadm down nfsvol 
root@nfs2:~ # drbdadm status nfsvol 
# No currently configured DRBD found.
nfsvol: No such resource
Command 'drbdsetup status nfsvol' terminated with exit code 10
root@nfs2:~ # rm -f /var/lock/drbd-147-0
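
Because the cluster will bring DRBD up from now on, also make sure the standalone drbd service is not enabled at boot on either node (assuming the drbd.service unit shipped with drbd-utils is present; it should report disabled):

root@nfs1:~ # systemctl is-enabled drbd.service
root@nfs2:~ # systemctl is-enabled drbd.service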

Creating a DRBD resource in a cluster

Create the file drbd.txt with the following content:

# Define DRBD resource
primitive p-drbd ocf:linbit:drbd \
    params drbd_resource="nfsvol" drbdconf="/etc/drbd.conf" \
    op monitor interval="29s" role="Master" \
    op monitor interval="31s" role="Slave"
# Start it as master/slave copies on both nodes
ms ms-drbd p-drbd \
    meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

Apply the changes to the cluster:

# crm configure load update drbd.txt
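
After the update is loaded, the cluster should start the clone on both nodes and promote one instance to Master. This can be verified, for example, with:

# crm status
# drbdadm status nfsvol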

Tying DRBD to other services

If you are using this article as part of the HA NFS server on SLE15 solution, continue by configuring the VG, LV, and filesystems on the primary DRBD node. Once that is complete, add the following ordering and colocation constraints to the resources.txt file used in the parent article.

# Promote DRBD before starting NFS group
order o-drbd_before_g-nfs Mandatory: ms-drbd:promote g-nfs:start
# NFS group has to run on Master DRBD only
colocation c-g-nfs_on_drbd inf: g-nfs ms-drbd:Master
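
Once resources.txt has been loaded as described in the parent article, the resulting constraints can be reviewed with something like:

# crm configure show | grep -E '^(order|colocation)'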

Updated on Fri Oct 4 16:32:48 IDT 2024