Wednesday, January 12, 2011

Highly Available Zabbix Monitoring Server Using Corosync + Pacemaker + DRBD

I recently built a highly available Zabbix monitoring server for a client. It uses the Linux HA tools Corosync and Pacemaker to cluster the services. Linbit's DRBD is used as the cluster storage.

This configuration uses Ubuntu Server 10.04 LTS (Lucid) as the Linux distribution for both cluster nodes. These instructions should work on Ubuntu 10.10 and Debian 6.0 (Squeeze) with minor changes.

Server Network Configuration
virt ip  192.168.0.20
zbx-01   192.168.0.21
zbx-02   192.168.0.22

I built this configuration on Linux KVM machines using VirtIO disks. These disks show up as /dev/vd* instead of the typical /dev/sd* convention. Make sure you make changes as necessary for your environment.

Each server has a second virtual disk that will be used by DRBD.

Set Up DRBD

Begin with DRBD. It provides the replicated block device that holds the file system where MySQL stores its data files. It is available in the official Ubuntu repositories.

sudo apt-get install linux-headers-server psmisc drbd8-utils

Create a DRBD resource configuration file at /etc/drbd.d/mysql_r0.res.

resource mysql_r0 {
    syncer {
        rate  110M;
    }
    on zbx-01 {
        device    /dev/drbd1;
        disk      /dev/vdb;
        address   192.168.0.21:7789;
        meta-disk internal;
    }
    on zbx-02 {
        device    /dev/drbd1;
        disk      /dev/vdb;
        address   192.168.0.22:7789;
        meta-disk internal;
    }
}

Some important things to know:

  • The DRBD tools only pick up resource files whose names end with ".res".
  • Make sure to change the device and IP address for your environment.
  • The syncer rate of 110M is for 1Gb network connections.
  • The host name of each machine (as reported by uname -n) must match the name used in its "on" section of the resource.
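The host name point is the one that tends to bite people. Here is a rough sketch of a check you could script, assuming the resource file layout shown above. The function name has_on_section is made up for illustration and is not part of DRBD.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): succeed if the resource file
# contains an "on <host> {" section for the given host name.
has_on_section() {
    res_file=$1
    host=$2
    grep -Eq "^[[:space:]]*on[[:space:]]+${host}[[:space:]]*\{" "$res_file"
}

# Usage on a cluster node (path as used in this article):
#   has_on_section /etc/drbd.d/mysql_r0.res "$(uname -n)" \
#       || echo "no 'on' section matches this host name"
```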

Create the DRBD meta data on the resource device.

sudo drbdadm create-md mysql_r0

Now repeat the previous steps on the second server, zbx-02.

Start the DRBD service on both servers.

sudo /etc/init.d/drbd start

Use zbx-01 as the primary server to start with. You'll use it to create the file system and force the DRBD peer on zbx-02 to sync from it.

On zbx-01:
sudo drbdadm -- --overwrite-data-of-peer primary mysql_r0
sudo drbdadm primary mysql_r0
sudo mkfs.ext4 /dev/drbd1

Depending on the size of your DRBD disk, it may take a minute or so to synchronize the two resources. I like to monitor the progress of this initial sync using the following command.

watch cat /proc/drbd
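If you would rather script the wait than watch it, something like the following sketch could work. It assumes the DRBD 8.3 status line format (cs: for connection state, ds: for disk states), and the function name drbd_in_sync is my own invention.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/drbd-style text on stdin and succeed only
# if a device line reports a connected peer with both disks UpToDate.
drbd_in_sync() {
    grep -Eq 'cs:Connected .*ds:UpToDate/UpToDate'
}

# Usage on a cluster node:
#   drbd_in_sync < /proc/drbd && echo "initial sync finished"
```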

Now mount the DRBD resource.

sudo mount /dev/drbd1 /mnt

Remove the DRBD LSB init links since the service start and stop will be controlled by Pacemaker.

sudo update-rc.d -f drbd remove

MySQL Server Installation and Configuration

Install the MySQL Server packages.

sudo apt-get install mysql-server

Stop the MySQL Server daemon.

sudo /etc/init.d/mysql stop

Copy the MySQL data directory to the DRBD-backed mount.

sudo cp -av /var/lib/mysql/ /mnt/

Edit the /etc/mysql/my.cnf file. Change the bind address to that of the virtual IP, and set the datadir property to point to the DRBD mount you specified earlier. Note that this example uses the /mnt folder for simplicity. You will most likely want to change this to something like /mnt/drbd1 for production use.

/etc/mysql/my.cnf
[mysqld]

user            = mysql
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /mnt/mysql
tmpdir          = /tmp
skip-external-locking


I like to add the following InnoDB properties to the MySQL my.cnf file. These settings are tuned for a machine with 4 CPUs and 4 GB of memory. MySQL and DRBD pros recommend the InnoDB engine because it has much better recovery characteristics than the older MyISAM. I set my server to default to the InnoDB engine for this reason.

/etc/mysql/my.cnf
...

#
# * Make InnoDB the default engine
#
default-storage-engine    = innodb

#
# * Innodb Performance Settings
#
innodb_buffer_pool_size         = 1600M
innodb_log_file_size            = 256M
innodb_log_buffer_size          = 4M
innodb_flush_log_at_trx_commit  = 2
innodb_thread_concurrency       = 8
innodb_flush_method             = O_DIRECT
innodb_file_per_table

...

Repeat the previous MySQL /etc/mysql/my.cnf changes on zbx-02.

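You may need to delete the InnoDB data and log files if you changed the default settings to the performance ones I used, because InnoDB will not start when its existing log files do not match the configured innodb_log_file_size. DO NOT DO THIS ON A SYSTEM IN PRODUCTION!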

cd /mnt/mysql
sudo rm ib*

On zbx-01 try starting the MySQL Server.

sudo /etc/init.d/mysql start

Watch /var/log/mysql/mysql.err for any problems. Logging in with a mysql client is also a good idea.

Stop MySQL once you've confirmed it's running properly on the DRBD resource.

Remove the MySQL LSB daemon start links so they do not conflict with Pacemaker.

sudo update-rc.d -f mysql remove

There is also an Upstart job included with the Ubuntu MySQL Server package. You'll need to edit it so that it doesn't try to start the service on boot.

Comment out the start, stop and respawn stanzas in /etc/init/mysql.conf. It should look like this example snippet.

# MySQL Service

description     "MySQL Server"
author          "Mario Limonciello "

#start on (net-device-up
#          and local-filesystems
#         and runlevel [2345])
#stop on runlevel [016]

#respawn

env HOME=/etc/mysql
umask 007

...

Repeat this step on zbx-02.
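To double check that nothing was missed on either node, a small test along these lines might help. It is only a sketch: the function name upstart_job_disabled is hypothetical, and it only looks for uncommented "start on", "stop on" and "respawn" lines.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given Upstart job file has no
# active (uncommented) "start on", "stop on" or "respawn" stanza left.
upstart_job_disabled() {
    ! grep -Eq '^[[:space:]]*(start on|stop on|respawn)' "$1"
}

# Usage on each node:
#   upstart_job_disabled /etc/init/mysql.conf || echo "mysql would still autostart"
```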

Install and Configure Corosync and Pacemaker

Pacemaker with Corosync is included in the Ubuntu 10.04 LTS repositories.

sudo apt-get install pacemaker

Edit the /etc/default/corosync file using your favorite text editor and enable corosync (START=yes).

Corosync can authenticate and encrypt the traffic between the cluster nodes, so you need to generate a corosync authkey file.

sudo corosync-keygen

*Note!* This can take a while if there's not enough entropy.

Copy the /etc/corosync/authkey to all servers that will form this cluster. Make sure it is owned by root:root and has 400 permissions.
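A quick way to confirm the ownership and mode after copying the key is sketched below. The helper name file_perms is hypothetical, and it relies on GNU stat as shipped with Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: print "<octal mode> <owner>:<group>" for a file.
# Relies on GNU stat (the -c option), as found on Ubuntu and Debian.
file_perms() {
    stat -c '%a %U:%G' "$1"
}

# Usage on each node after copying the key:
#   sudo chown root:root /etc/corosync/authkey
#   sudo chmod 400 /etc/corosync/authkey
#   [ "$(file_perms /etc/corosync/authkey)" = "400 root:root" ] \
#       || echo "authkey ownership or mode is wrong"
```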

In /etc/corosync/corosync.conf, replace bindnetaddr (by default it's 127.0.0.1) with the network address of your server. For a /24 network that means replacing the last octet with 0. For example, if your IP is 192.168.0.21, you would put 192.168.0.0.
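If your netmask is not 255.255.255.0, the network address is the bitwise AND of the IP address and the netmask. Here is a minimal sketch of that arithmetic in plain shell. The function name network_address is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the network address for an IPv4 address and a
# dotted-quad netmask by ANDing the two octet by octet.
network_address() {
    na_ip=$1
    na_mask=$2
    na_old_ifs=$IFS
    IFS=.
    # Split both values on dots: $1..$4 hold the IP, $5..$8 the netmask.
    set -- $na_ip $na_mask
    IFS=$na_old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Usage with the addresses from this article:
#   network_address 192.168.0.21 255.255.255.0
```

For the example network in this article the result is 192.168.0.0, which is the value to use for bindnetaddr.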

Start the Corosync daemon.

sudo /etc/init.d/corosync start

Now your cluster is configured and ready to monitor, stop and start your services on all your cluster servers.

You can check the status with the crm status command.

crm status
============
Last updated: Wed Sep 15 11:33:09 2010
Stack: openais
Current DC: zbx-01 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ zbx-01 zbx-02 ]

Now update the Pacemaker CRM configuration to include DRBD and MySQL.

sudo crm configure edit

Here's a working example, but be sure to edit it for your environment.

node zbx-01 \
        attributes standby="off"
node zbx-02 \
        attributes standby="off"
primitive drbd_mysql ocf:linbit:drbd \
        params drbd_resource="mysql_r0" \
        op monitor interval="15s"
primitive fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/mysql_r0" directory="/mnt/" fstype="ext4" options="acl"
primitive ip_mysql ocf:heartbeat:IPaddr2 \
        params ip="192.168.0.20" nic="eth0"
primitive mysqld lsb:mysql \
        op start interval="0" timeout="120s" \
        op stop interval="0" timeout="120s" \
        op monitor interval="30s"
group zabbix_group fs_mysql ip_mysql mysqld \
        meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
colocation mysql_on_drbd inf: _rsc_set_ zabbix_group ms_drbd_mysql:Master
order mysql_after_drbd inf: _rsc_set_ ms_drbd_mysql:promote zabbix_group:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        last-lrm-refresh="1294782404"

Some notes about this configuration:

  • It monitors the DRBD resource every 15s
  • The takeover IP address is 192.168.0.20
  • MySQL Server is allowed 2 minutes to start up in case it needs to perform recovery operations on the Zabbix database
  • The STONITH property is disabled since we are only setting up a two-node cluster.

You can check the status of the cluster with the crm_mon utility.

sudo crm_mon

Here's an example of what you want to see:

============
Last updated: Wed Mar 11 23:04:49 2011
Stack: openais
Current DC: zbx-01 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ zbx-01 zbx-02 ]

 Resource Group: zabbix_group
     fs_mysql (ocf::heartbeat:Filesystem): Started zbx-01
     ip_mysql (ocf::heartbeat:IPaddr2): Started zbx-01
     mysqld (lsb:mysql): Started zbx-01
 Master/Slave Set: ms_drbd_mysql
     Masters: [ zbx-01 ]
     Slaves: [ zbx-02 ]
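The thing to look for in that output is that every resource in the group is started on the node that holds the DRBD master role. As a rough sketch, that check could be scripted like this. It assumes a single master and the output layout shown above, and the function name all_on_master is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read crm_mon output on stdin and succeed only if
# every "Started" resource runs on the node listed as the DRBD master.
# Assumes one master and the output layout shown in this article.
all_on_master() {
    awk '
        /Masters: \[/ { master = $3 }
        / Started /   { if ($NF != node) { node = $NF; changes++ } }
        END { exit !(master != "" && changes == 1 && node == master) }
    '
}

# Usage on a cluster node (crm_mon -1 prints the status once and exits):
#   sudo crm_mon -1 | all_on_master || echo "resources and DRBD master disagree"
```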

Install Zabbix Server

How you install Zabbix is up to you. I like to recompile the latest upstream Debian packages, but using the older Ubuntu Lucid repository version or the official tarball will also work. If you use the apt package, remember not to use the dbconfig-common option on zbx-02. You can copy over the config files from zbx-01.

sudo apt-get install zabbix-server-mysql 

Edit the /etc/zabbix/zabbix_server.conf file. Set SourceIP=192.168.0.20 so that Zabbix will use the virtual "takeover" IP address. This will make setting up client configurations and firewall rules much easier.
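Since the same setting has to end up on both nodes, a small lookup like the following sketch can confirm it. The helper name conf_value is hypothetical, and it assumes simple KEY=value lines as used in zabbix_server.conf.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented KEY=value
# line for the given key in a Zabbix-style config file.
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Usage on each node:
#   [ "$(conf_value /etc/zabbix/zabbix_server.conf SourceIP)" = "192.168.0.20" ] \
#       || echo "SourceIP is not set to the virtual IP"
```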

Check your newly installed Zabbix server for a clean start.

sudo tail /var/log/zabbix-server/zabbix-server.log

Remove the LSB init script links.

sudo update-rc.d -f zabbix-server remove

Install Apache and Zabbix PHP frontend.

sudo apt-get install apache2 php5 php5-mysql php5-ldap php5-gd zabbix-frontend-php

Remove Apache's auto start links.

sudo update-rc.d -f apache2 remove

Repeat on zbx-02.

Copy the configuration files from zbx-01's /etc/zabbix directory to zbx-02's /etc/zabbix directory.

Update Corosync Configuration With Zabbix and Apache

sudo crm configure edit

Working example:
node zbx-01 \
        attributes standby="off"
node zbx-02 \
        attributes standby="off"
primitive apache lsb:apache2
primitive drbd_mysql ocf:linbit:drbd \
        params drbd_resource="mysql_r0" \
        op monitor interval="15s"
primitive fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/mysql_r0" directory="/mnt/" fstype="ext4" options="acl"
primitive ip_mysql ocf:heartbeat:IPaddr2 \
        params ip="192.168.0.20" nic="eth0"
primitive mysqld lsb:mysql \
        op start interval="0" timeout="120s" \
        op stop interval="0" timeout="120s" \
        op monitor interval="30s"
primitive zabbix lsb:zabbix-server \
        op start interval="0" timeout="60" delay="5s" \
        op monitor interval="30s"
group zabbix_group fs_mysql ip_mysql mysqld zabbix apache \
        meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
colocation mysql_on_drbd inf: _rsc_set_ zabbix_group ms_drbd_mysql:Master
order mysql_after_drbd inf: _rsc_set_ ms_drbd_mysql:promote zabbix_group:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        last-lrm-refresh="1294782404"

10 comments:

  1. Thanks - very useful.
    However I ran into some issues with Apparmor and the changed mountpoint for MySQL to /mnt/mysql.
    Had to create an entry in tunables/alias for this to get the database to start.

  2. Great post thank you! I have only one question about fail-over, if zbx-01 goes down will zbx-02 take all services?

    BTW you can use this location too:
    location master-prefer-zbx-01 zabbix_group 50: zbx-01

  3. This comment has been removed by the author.

  4. In between these two steps:

    sudo drbdadm -- --overwrite-data-of-peer primary mysql_r0
    [...insert 'primary' command here...]
    sudo mkfs.ext4 /dev/drbd1

    ...Pacemaker docs say you need to run this on one of the nodes in the cluster:
    sudo drbdadm primary mysql_r0

    If you don't, it has no way to know which node it should treat as "the One True Brain" after both nodes have been down/offline for any reason. You'll end up in split-brain (which is what happened to me before I found out about this command).

  5. Hi,

    Any chance i could get an explanation on why Stonith is not needed as it's a 2 node cluster? The other pacemaker docs i read seem to imply Stonith is needed but I'm having a lot of struggles with getting Stonith working so would be more than happy to drop it if i truly don't need it.

    Cheers!

  6. I don't normally utilize STONITH when there are just two nodes participating in a DRBD mirror since only one block device is writable. Zabbix is setup to notify me when Corosync fails over to the second node. Running a MySQL database on a shared filesystem is a different story.

  7. Jim,

    Thanks for catching that undocumented step. I've updated the instructions.

  8. Hi,

    Congratulations for this post.
    But I found a little mistake...

    You said:
    Remove Apache's auto start links.

    sudo update-rc.d -f zabbix-server remove

    And it should be:
    sudo update-rc.d -f apache2 remove

  9. Is setting O_DIRECT not a problem as this disables the buffer cache which DRBD hook relies upon for sync?

  10. This is the most useful article I found about drbd - pacemaker - mysql.
    Thank you!
