I recently built a highly available Zabbix monitoring server for a client. It uses the Linux HA tools Corosync and Pacemaker to cluster the services. Linbit's DRBD is used as the cluster storage.
This configuration uses Ubuntu Server 10.04 LTS (Lucid) as the Linux distribution on both cluster nodes. These instructions should work on Ubuntu 10.10 and Debian 6.0 (Squeeze) with minor changes.
Server Network Configuration
virt ip 192.168.0.20
zbx-01 192.168.0.21
zbx-02 192.168.0.22
I built this configuration on Linux KVM machines using VirtIO disks. These disks show up as /dev/vd* instead of the typical /dev/sd* convention. Adjust the device names as necessary for your environment.
Each server has a second virtual disk that will be used by DRBD.
Setup DRBD
Begin with DRBD. It provides the replicated block device that holds the file system where MySQL stores its data files. It is available in the official Ubuntu repositories.
sudo apt-get install linux-headers-server psmisc drbd8-utils
Create a DRBD resource configuration file at /etc/drbd.d/mysql_r0.res.
resource mysql_r0 {
syncer {
rate 110M;
}
on zbx-01 {
device /dev/drbd1;
disk /dev/vdb;
address 192.168.0.21:7789;
meta-disk internal;
}
on zbx-02 {
device /dev/drbd1;
disk /dev/vdb;
address 192.168.0.22:7789;
meta-disk internal;
}
}
Some important things to know:
- DRBD expects the file name to end with ".res".
- Make sure to change the device and IP address for your environment.
- A syncer rate of 110M is for 1Gb network connections.
- The host name of each machine must match the name used in its "on" section of the DRBD resource.
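The host name requirement above is easy to get wrong, so here is a minimal sketch of a check for it. The function name drbd_has_host and its arguments are hypothetical, not part of DRBD; it only looks for an "on <host> {" line in the resource file.

```shell
#!/bin/sh
# Sketch: succeed if a DRBD resource file contains an "on <host> {" section.
# drbd_has_host is a hypothetical helper, not a DRBD command.
drbd_has_host() (
    res_file=$1
    host=$2
    # Match lines such as "on zbx-01 {", allowing leading whitespace.
    grep -Eq "^[[:space:]]*on[[:space:]]+${host}[[:space:]]*\{" "$res_file"
)

# Example: compare the local node name with the resource file from this article.
# drbd_has_host /etc/drbd.d/mysql_r0.res "$(uname -n)" && echo "host section found"
```

Run it on each node. If it fails, compare the output of uname -n with the names in the "on" sections.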
Create the DRBD meta data on the resource device.
sudo drbdadm create-md mysql_r0
Now repeat the previous steps on the second server, zbx-02.
Start the DRBD service on both servers.
sudo /etc/init.d/drbd start
Use zbx-01 as the primary server to start. You'll use it to create the file system and force the other DRBD server on zbx-02 to sync from it.
On zbx-01:
sudo drbdadm -- --overwrite-data-of-peer primary mysql_r0
sudo drbdadm primary mysql_r0
sudo mkfs.ext4 /dev/drbd1
Depending on the size of your DRBD disk, it may take a minute or so to synchronize the two resources. I like to monitor the progress of this initial sync using the following command.
watch cat /proc/drbd
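If you would rather script the wait than watch it, the disk state can be pulled out of the same text. This is a minimal sketch that assumes the DRBD 8.3 style of /proc/drbd, where each device line starts with the minor number and carries a "ds:" field. The helper name drbd_disk_state is hypothetical.

```shell
#!/bin/sh
# Sketch: print the "ds:" (disk state) field for one DRBD minor number.
# Reads /proc/drbd-style text on stdin. drbd_disk_state is a hypothetical helper.
drbd_disk_state() (
    minor=$1
    awk -v m="$minor:" '$1 == m {
        # Scan the fields of the matching device line for the one starting with "ds:".
        for (i = 2; i <= NF; i++)
            if ($i ~ /^ds:/) { print substr($i, 4); exit }
    }'
)

# Example: minor 1 corresponds to /dev/drbd1 in this article.
# drbd_disk_state 1 < /proc/drbd
```

The initial sync is finished once both sides report UpToDate.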
Now mount the DRBD resource.
sudo mount /dev/drbd1 /mnt
Remove the DRBD LSB init links since the service start and stop will be controlled by Pacemaker.
sudo update-rc.d -f drbd remove
MySQL Server Installation and Configuration
Install the MySQL Server packages.
sudo apt-get install mysql-server
Stop the MySQL Server daemon.
sudo /etc/init.d/mysql stop
Copy the MySQL data directory to the DRBD-backed mount.
sudo cp -av /var/lib/mysql/ /mnt/
Edit the /etc/mysql/my.cnf file. Change the bind address to that of the virtual IP. Set the datadir property to point to the DRBD mount you specified earlier. Note that this example uses the /mnt folder for simplicity. You will most likely want to change this to something like /mnt/drbd1 for production use.
/etc/mysql/my.cnf
[mysqld]
user = mysql
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /mnt/mysql
tmpdir = /tmp
skip-external-locking
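Since the same my.cnf changes have to be made on both nodes, it helps to confirm what each file actually contains. The following is a minimal sketch of a reader for "key = value" lines in the [mysqld] section. The function name mycnf_get is hypothetical, and it does not follow include directives.

```shell
#!/bin/sh
# Sketch: print the value of a key from the [mysqld] section of a my.cnf file.
# mycnf_get is a hypothetical helper; it ignores !include directives.
mycnf_get() (
    key=$1
    file=$2
    awk -v key="$key" '
        # Track whether the current section header is [mysqld].
        /^[ \t]*\[/ { in_mysqld = ($0 ~ /^[ \t]*\[mysqld\]/); next }
        in_mysqld {
            line = $0
            sub(/^[ \t]+/, "", line)
            n = index(line, "=")
            if (n == 0) next              # skip flags such as skip-external-locking
            k = substr(line, 1, n - 1)
            v = substr(line, n + 1)
            gsub(/[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) val = v         # the last assignment wins
        }
        END { if (val != "") print val }
    ' "$file"
)

# Example:
# mycnf_get datadir /etc/mysql/my.cnf
```

Running it on zbx-01 and zbx-02 should print the same datadir on both.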
I like to add the following InnoDB properties to the MySQL my.cnf file. These settings are tuned for a machine with 4 CPUs and 4 GB of memory. MySQL and DRBD pros recommend using the InnoDB engine because it has much better recovery characteristics than the older MyISAM. I set my server to default to the InnoDB engine for this reason.
/etc/mysql/my.cnf
...
#
# * Make InnoDB the default engine
#
default-storage-engine = innodb
#
# * Innodb Performance Settings
#
innodb_buffer_pool_size = 1600M
innodb_log_file_size = 256M
innodb_log_buffer_size = 4M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_file_per_table
...
Repeat the previous MySQL /etc/mysql/my.cnf changes on zbx-02.
You may need to delete the InnoDB data files if you have changed the default settings to the performance ones I used. DO NOT DO THIS ON A SYSTEM IN PRODUCTION!
cd /mnt/mysql
sudo rm ib*
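A more cautious variant is to move the files aside instead of deleting them, so they can be put back if MySQL refuses to start. This is a minimal sketch, not what I ran on the original build. The function name stash_innodb_files and the backup directory are assumptions.

```shell
#!/bin/sh
# Sketch: move the InnoDB files (ib*) out of a data directory into a backup
# directory. stash_innodb_files is a hypothetical helper. Stop MySQL first.
stash_innodb_files() (
    datadir=$1
    backup=$2
    mkdir -p "$backup" || exit 1
    for f in "$datadir"/ib*; do
        [ -e "$f" ] || continue        # nothing matched: the glob stays literal
        mv -- "$f" "$backup"/ || exit 1
    done
)

# Example (run as root, with MySQL stopped); the backup path is an assumption:
# stash_innodb_files /mnt/mysql /mnt/innodb-backup
```

The same warning applies: do not do this on a system in production.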
On zbx-01 try starting the MySQL Server.
sudo /etc/init.d/mysql start
Watch the /var/log/mysql/mysql.err for any problems. Logging in with a mysql client is also a good idea.
Stop MySQL once you've confirmed it's running properly on the DRBD resource.
Remove the MySQL LSB daemon start links so they do not conflict with Pacemaker.
sudo update-rc.d -f mysql remove
There is also an Upstart script included with the Ubuntu MySQL Server package. You'll need to edit it so that it doesn't try to start the service on boot.
Comment out the start, stop and respawn stanzas in /etc/init/mysql.conf. It should look like this example snippet.
# MySQL Service
description "MySQL Server"
author "Mario Limonciello "
#start on (net-device-up
# and local-filesystems
# and runlevel [2345])
#stop on runlevel [016]
#respawn
env HOME=/etc/mysql
umask 007
...
Repeat this step on zbx-02.
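To confirm the edit on both nodes, you can look for stanzas that are still active. This is a minimal sketch. The function name upstart_autostart_lines is hypothetical, and it only matches the first line of each stanza, not the continuation lines of a multi-line "start on".

```shell
#!/bin/sh
# Sketch: print any uncommented "start on", "stop on" or "respawn" lines in an
# Upstart job file. Exits non-zero when none are found.
# upstart_autostart_lines is a hypothetical helper.
upstart_autostart_lines() (
    grep -E '^[[:space:]]*(start on|stop on|respawn)([[:space:]]|$)' "$1"
)

# Example: no output means Upstart will not start or respawn MySQL by itself.
# upstart_autostart_lines /etc/init/mysql.conf
```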
Install and Configure Corosync and Pacemaker
Pacemaker with Corosync is included in the Ubuntu 10.04 LTS repositories.
sudo apt-get install pacemaker
Edit the /etc/default/corosync file using your favorite text editor and enable corosync (START=yes).
Pacemaker uses encrypted connections between the cluster nodes so you need to generate a corosync authkey file.
sudo corosync-keygen
*Note!* This can take a while if there's not enough entropy.
Copy the /etc/corosync/authkey to all servers that will form this cluster. Make sure it is owned by root:root and has 400 permissions.
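After copying the key, the ownership and mode can be verified on each node. This is a minimal sketch that assumes GNU stat (as shipped with Ubuntu and Debian). The function name check_key_perms is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only if a file has mode 400 and the expected owner:group.
# Assumes GNU stat. check_key_perms is a hypothetical helper.
check_key_perms() (
    file=$1
    want_owner=${2:-root:root}
    # %a is the octal mode, %U and %G are the owner and group names.
    [ "$(stat -c '%a %U:%G' "$file")" = "400 $want_owner" ]
)

# Example:
# sudo chown root:root /etc/corosync/authkey
# sudo chmod 400 /etc/corosync/authkey
# check_key_perms /etc/corosync/authkey && echo "authkey permissions look right"
```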
In /etc/corosync/corosync.conf, replace bindnetaddr (by default it's 127.0.0.1) with the network address of your server's subnet. On a /24 network that means replacing the last octet with 0. For example, if your IP is 192.168.0.21, then you would put 192.168.0.0.
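If your netmask is not 255.255.255.0, the network address is the bitwise AND of the IP address and the netmask. This is a minimal sketch of that calculation. The function name network_addr is hypothetical.

```shell
#!/bin/sh
# Sketch: compute the IPv4 network address from a dotted-quad address and
# netmask. network_addr is a hypothetical helper.
network_addr() (
    ip=$1
    mask=$2
    # Split both dotted quads into positional parameters $1..$8.
    IFS=.
    set -- $ip $mask
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# Example with the addresses from this article:
# network_addr 192.168.0.21 255.255.255.0
```

The result is the value to use for bindnetaddr.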
Start the Corosync daemon.
sudo /etc/init.d/corosync start
Now your cluster is configured and ready to monitor, stop and start your services on all your cluster servers.
You can check the status with the crm status command.
crm status
============
Last updated: Wed Sep 15 11:33:09 2010
Stack: openais
Current DC: zbx-01 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ zbx-01 zbx-02 ]
Now update the Pacemaker CRM configuration to include DRBD and MySQL.
sudo crm configure edit
Here's a working example, but be sure to edit it for your environment.
node zbx-01 \
attributes standby="off"
node zbx-02 \
attributes standby="off"
primitive drbd_mysql ocf:linbit:drbd \
params drbd_resource="mysql_r0" \
op monitor interval="15s"
primitive fs_mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/mysql_r0" directory="/mnt/" fstype="ext4" options="acl"
primitive ip_mysql ocf:heartbeat:IPaddr2 \
params ip="192.168.0.20" nic="eth0"
primitive mysqld lsb:mysql \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s" \
op monitor interval="30s"
group zabbix_group fs_mysql ip_mysql mysqld \
meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
colocation mysql_on_drbd inf: _rsc_set_ zabbix_group ms_drbd_mysql:Master
order mysql_after_drbd inf: _rsc_set_ ms_drbd_mysql:promote zabbix_group:start
property $id="cib-bootstrap-options" \
dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
no-quorum-policy="ignore" \
stonith-enabled="false" \
last-lrm-refresh="1294782404"
Some notes about this configuration:
- It monitors the DRBD resource every 15s
- The takeover IP address is 192.168.0.20
- MySQL Server is allowed 2 minutes to start up in case it needs to perform recovery operations on the Zabbix database
- The STONITH property is disabled since we are only setting up a two node cluster.
You can check the status of the cluster with the crm_mon utility.
sudo crm_mon
Here's an example of what you want to see:
============
Last updated: Wed Mar 11 23:04:49 2011
Stack: openais
Current DC: zbx-01 - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ zbx-01 zbx-02 ]
Resource Group: zabbix_group
fs_mysql (ocf::heartbeat:Filesystem): Started zbx-01
ip_mysql (ocf::heartbeat:IPaddr2): Started zbx-01
mysqld (lsb:mysql): Started zbx-01
Master/Slave Set: ms_drbd_mysql
Masters: [ zbx-01 ]
Slaves: [ zbx-02 ]
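For scripts or quick checks, the node currently holding the DRBD master role can be read from the same output. This is a minimal sketch that assumes the "Masters: [ ... ]" line format shown above. The function name drbd_master_node is hypothetical.

```shell
#!/bin/sh
# Sketch: print the node name(s) listed on the "Masters: [ ... ]" line of
# crm_mon-style text read on stdin. drbd_master_node is a hypothetical helper.
drbd_master_node() (
    awk '$1 == "Masters:" {
        # Fields are: Masters: [ node ... ]  so the names sit in 3 .. NF-1.
        for (i = 3; i < NF; i++)
            printf "%s%s", $i, (i < NF - 1 ? " " : "\n")
    }'
)

# Example: crm_mon -1 prints the status once and exits.
# sudo crm_mon -1 | drbd_master_node
```

A simple failover test is to put the active node in standby with crm node standby, check that the master moves, and then bring the node back with crm node online.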
Install Zabbix Server
How you install Zabbix is up to you. I like to recompile the latest upstream Debian packages, but using the older Ubuntu Lucid repository version or the official tarball will also work. If you use the apt package, remember not to use the dbconfig-common option on zbx-02. You can copy over the config files from zbx-01.
sudo apt-get install zabbix-server-mysql
Edit the /etc/zabbix/zabbix_server.conf file. Set SourceIP=192.168.0.20 so that Zabbix will use the virtual "takeover" IP address. This will make setting up client configurations and firewall rules much easier.
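Because the configuration is copied between nodes, it is worth confirming the value that is actually active. This is a minimal sketch that reads an uncommented "key=value" line from the file. The function name zabbix_conf_value is hypothetical.

```shell
#!/bin/sh
# Sketch: print the value of the last uncommented "key=value" line in a
# Zabbix-style config file. zabbix_conf_value is a hypothetical helper.
zabbix_conf_value() (
    key=$1
    file=$2
    # Lines starting with "#" do not match, so commented defaults are skipped.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
)

# Example:
# zabbix_conf_value SourceIP /etc/zabbix/zabbix_server.conf
```

On both nodes it should print the virtual IP, 192.168.0.20 in this setup.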
Check your newly installed Zabbix server for a clean start.
sudo tail /var/log/zabbix-server/zabbix-server.log
Remove the LSB init script links.
sudo update-rc.d -f zabbix-server remove
Install Apache and Zabbix PHP frontend.
sudo apt-get install apache2 php5 php5-mysql php5-ldap php5-gd zabbix-frontend-php
Remove Apache's auto start links.
sudo update-rc.d -f apache2 remove
Repeat on zbx-02.
Copy the configuration file from zbx-01's /etc/zabbix directory to zbx-02's /etc/zabbix folder.
Update Corosync Configuration With Zabbix and Apache
sudo crm configure edit
Working example:
node zbx-01 \
attributes standby="off"
node zbx-02 \
attributes standby="off"
primitive apache lsb:apache2
primitive drbd_mysql ocf:linbit:drbd \
params drbd_resource="mysql_r0" \
op monitor interval="15s"
primitive fs_mysql ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/mysql_r0" directory="/mnt/" fstype="ext4" options="acl"
primitive ip_mysql ocf:heartbeat:IPaddr2 \
params ip="192.168.0.20" nic="eth0"
primitive mysqld lsb:mysql \
op start interval="0" timeout="120s" \
op stop interval="0" timeout="120s" \
op monitor interval="30s"
primitive zabbix lsb:zabbix-server \
op start interval="0" timeout="60" delay="5s" \
op monitor interval="30s"
group zabbix_group fs_mysql ip_mysql mysqld zabbix apache \
meta target-role="Started"
ms ms_drbd_mysql drbd_mysql \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Master"
colocation mysql_on_drbd inf: _rsc_set_ zabbix_group ms_drbd_mysql:Master
order mysql_after_drbd inf: _rsc_set_ ms_drbd_mysql:promote zabbix_group:start
property $id="cib-bootstrap-options" \
dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
no-quorum-policy="ignore" \
stonith-enabled="false" \
last-lrm-refresh="1294782404"