NFS storage cluster with DRBD, Corosync and Pacemaker at Linode

At Linode, create two instances and call them ha-1 and ha-2

Deployment Disk Size: 4320MB

Settings tab

  • Linode Label: ha-1
  • Display Group: HA pair

Dashboard tab

Create a new Disk Image

  • Label: nfs
  • Type: unformatted / raw
  • Size: max available

Edit Configuration Profile

  • Assign Block Device for "nfs" on /dev/xvdc

Remote Access tab

  • In IP Failover select all of the addresses

Boot

Set the hostname

echo "ha-1" > /etc/hostname
hostname -F /etc/hostname

ssh root@ha-2 "echo \"ha-2\" > /etc/hostname"
ssh root@ha-2 "hostname -F /etc/hostname"

Set /etc/hosts

192.168.x.x ha-1
192.168.y.y ha-2
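
The same two entries need to exist on ha-2 as well; one way to append them from ha-1 without overwriting the rest of its /etc/hosts:

ssh root@ha-2 "cat >> /etc/hosts" <<'EOF'
192.168.x.x ha-1
192.168.y.y ha-2
EOF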

Generate SSH keys and distribute them between the two servers

ssh-keygen -t rsa
scp .ssh/id_rsa.pub root@ha-2:
ssh root@ha-2 "ssh-keygen -t rsa"
ssh root@ha-2 "echo \`cat ~/id_rsa.pub\` >> ~/.ssh/authorized_keys"  
ssh root@ha-2 "rm ~/id_rsa.pub"
scp root@ha-2:/root/.ssh/id_rsa.pub /root
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
rm ~/id_rsa.pub

TODO: Automate the above three steps with Fabric

Set up the network in /etc/network/interfaces on both nodes (each with its own addresses)

auto eth0 eth0:0

iface eth0 inet static
 address <public IP address>
 netmask 255.255.255.0
 gateway <gateway>

iface eth0:0 inet static
 address <priv IP address>
 netmask 255.255.128.0

iface eth0 inet6 static
 address <ipv6 address>
 netmask 64
 gateway fe80::1
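
Apply the new interface configuration on both nodes. The simplest option is a reboot; reconfiguring eth0 from an SSH session will drop the connection, so use the Lish console if you go that route:

reboot
# or, from the Lish console:
ifdown -a && ifup -a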

Install DRBD (on both nodes)

apt-get install drbd8-utils
# We will let Pacemaker manage this service
update-rc.d -f drbd remove

The kernel module version has to match the userland tools version exactly

# drbdadm -V
DRBD module version: 8.3.13
   userland version: 8.3.11

If it does not, you have to upgrade your drbd tools!

apt-get remove drbd8-utils
wget http://oss.linbit.com/drbd/8.3/drbd-8.3.13.tar.gz
tar zxf drbd-8.3.13.tar.gz
cd drbd-8.3.13
./configure
make
make install
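
After make install, re-run the version check from above; the kernel module and userland versions should now both report 8.3.13:

drbdadm -V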

Configure DRBD resources in /etc/drbd.d/r0.res

resource r0 {
    protocol C;
    syncer {
        rate 4M;
    }
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 60;
    }
    net {
        cram-hmac-alg sha1;
        shared-secret "DONTTELL!";
    }
    on ha-1 {
        device /dev/drbd0;
        disk /dev/xvdc;
        address 192.168.x.x:7788;
        meta-disk internal;
    }
    on ha-2 {
        device /dev/drbd0;
        disk /dev/xvdc;
        address 192.168.y.y:7788;
        meta-disk internal;
    }
}

Copy DRBD configuration to ha-2 and prepare DRBD devices

scp /etc/drbd.d/r0.res root@ha-2:/etc/drbd.d/

# on both nodes: wipe the device, create the DRBD metadata and bring the resource up
dd if=/dev/zero of=/dev/xvdc bs=1024k
drbdadm dump all
drbdadm -- --ignore-sanity-checks create-md r0
service drbd start
drbdadm up r0

# on ha-1 only: make it primary and kick off the initial sync
drbdadm -- --overwrite-data-of-peer primary r0
watch -n3 cat /proc/drbd
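
Once the initial sync has finished, /proc/drbd on ha-1 should show the resource connected and up to date, roughly along these lines (exact counters will differ):

cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate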

Create a filesystem on the DRBD device and mount it (on the primary, ha-1)

drbdadm primary r0
mkfs.ext4 /dev/drbd0
mkdir /mnt/nfs
mount /dev/drbd0 /mnt/nfs
# check the state with drbd-overview, service drbd status, or cat /proc/drbd

Install NFS (on both nodes)

apt-get install nfs-kernel-server

In /etc/exports (identical on both nodes)

/mnt/nfs/     10.0.0.0/8(rw,async,no_root_squash,no_subtree_check)

Export the share

exportfs -ra

Install Cluster Software on both nodes

apt-get install pacemaker corosync

Configure Corosync

Corosync supports unicast (udpu) transport since v1.3

TODO: Publish config files
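
A minimal sketch of what /etc/corosync/corosync.conf could look like for a two-node unicast setup; bindnetaddr and the member addresses are placeholders to adjust for your private network:

totem {
    version: 2
    # secauth uses /etc/corosync/authkey, generated below
    secauth: on
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.128.0
        mcastport: 5405
        member {
            memberaddr: 192.168.x.x
        }
        member {
            memberaddr: 192.168.y.y
        }
    }
}

service {
    # start the Pacemaker CRM as a Corosync service
    name: pacemaker
    ver: 0
}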

Create the /etc/corosync/authkey file for node authentication:

corosync-keygen
# run ls -R / in another session to generate entropy for the command above

Copy configuration to the other node

rsync -av /etc/corosync/* ha-2:/etc/corosync/

Start Corosync on both nodes

sed -i s/START=no/START=yes/ /etc/default/corosync
service corosync start

Configure Pacemaker

Turn off STONITH and Quorum for now

crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore

Floating IP address resource

crm(live)configure# primitive p_IP \
> ocf:heartbeat:IPaddr2 \
> params ip=192.168.x.x \
> cidr_netmask=24 \
> op monitor interval=30s
crm(live)configure# commit

DRBD Master / Slave resource

crm(live)configure# primitive p_DRBD_NFS \
> ocf:linbit:drbd \
> params drbd_resource=r0 \
> op monitor interval=15 role=Master \
> op monitor interval=30 role=Slave
crm(live)configure# commit

Make the DRBD primitive a master/slave set

crm(live)configure# ms ms_DRBD_NFS p_DRBD_NFS \
> meta master-max=1 master-node-max=1 clone-max=2 \
> clone-node-max=1 notify=true
crm(live)configure# commit

The NFS kernel server resource

crm(live)configure# primitive p_lsb_NFSserver \
> lsb:nfs-kernel-server \
> op monitor interval=30s
crm(live)configure# clone cl_lsb_NFSserver p_lsb_NFSserver
crm(live)configure# commit

Filesystem resource

crm(live)configure# primitive p_FS_NFS \
> ocf:heartbeat:Filesystem \
> params device=/dev/drbd0 \
> directory=/mnt/nfs \
> fstype=ext4 \
> options=noatime,nodiratime \
> op start interval=0 timeout=60 \
> op stop interval=0 timeout=120

Before committing, combine the Filesystem and IP address resources into a group

crm(live)configure# group g_NFS \
> p_FS_NFS p_IP

Also, make sure the g_NFS group (filesystem plus floating IP) is started on the same node where the DRBD Master/Slave resource is in the Master role, and only after DRBD has been promoted

crm(live)configure# order o_DRBD_before_NFS inf: \
> ms_DRBD_NFS:promote g_NFS:start
crm(live)configure# colocation c_NFS_on_DRBD inf: \
> g_NFS ms_DRBD_NFS:Master
crm(live)configure# commit

Put the node into standby and bring it back:

crm node standby
crm node online

Observe the status

crm_mon -1
crm configure show

Verify cluster configuration

crm_verify -L -V

Verify cluster communication

corosync-cfgtool -s

Also check Corosync member list

corosync-objctl | grep member

Testing (TODO)

Find out which version of the NFS server is running

nfsstat | grep "Server nfs"

From a third instance:

Show the server's exported filesystems:

showmount -e <floating_IP>

Then create the mount point and mount the share

mkdir -p /mnt/nfs
mount -t nfs <floating_IP>:/mnt/nfs /mnt/nfs
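
The testing section above is still a TODO; a simple manual failover check could look like this (the test file name is arbitrary):

# on the client: write a file through the floating IP mount
touch /mnt/nfs/failover-test

# on the currently active cluster node: force a failover
crm node standby

# on the client: the file should still be there once the IP has moved
ls -l /mnt/nfs/failover-test

# bring the node back afterwards
crm node online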