Friday, February 18, 2011

Get those FSYNC numbers up on your ZFS pool

For the last week, I've been trying to figure out why our 10-drive ZFS zpool has been delivering such lousy NFS performance to our Proxmox KVM cluster.

Here's what pveperf was returning:
pveperf /mnt/pve/kvm-images/
CPU BOGOMIPS:      76608.87
REGEX/SECOND:      896132
HD SIZE:           7977.14 GB (xxx.xxx.xxx.xxx:/volumes/vol0/kvm-images)
FSYNCS/SECOND:     23.15
DNS EXT:           58.84 ms
DNS INT:           1.50 ms (my.company.com)

The zpool looked like this:
zpool status vol0
  pool: vol0
 state: ONLINE
 scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        vol0                       ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t5000C50010377B5Bd0  ONLINE       0     0     0
            c0t5000C5001037C317d0  ONLINE       0     0     0
            c0t5000C5001037EED7d0  ONLINE       0     0     0
            c0t5000C50010381737d0  ONLINE       0     0     0
            c0t5000C50010381BBBd0  ONLINE       0     0     0
            c0t5000C50010382777d0  ONLINE       0     0     0
            c0t5000C5001038291Fd0  ONLINE       0     0     0
            c0t5000C500103870A3d0  ONLINE       0     0     0
            c0t5000C500103871C3d0  ONLINE       0     0     0
            c0t5000C500103924E3d0  ONLINE       0     0     0
            c0t5000C500103941F7d0  ONLINE       0     0     0
        cache
          c0t50015179591D9AEFd0    ONLINE       0     0     0
          c0t50015179591DACA1d0    ONLINE       0     0     0
          c1t2d0                   ONLINE       0     0     0
        spares
          c0t5000C50010395057d0    AVAIL   

errors: No known data errors

Raw write speed wasn't a problem. Copying DVD ISO files over the 10G network backbone was super fast. But the performance of creating new files and folders really hurt. This became very apparent when I started running bonnie++ on the NFS shares from the Proxmox nodes. Bonnie++ zipped along until it started its "Create files..." tests, at which point the Linux client would practically lock up.
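
For reference, the benchmark runs were nothing fancy; roughly something like this against the NFS mount (the flags and file count here are assumptions, not the original invocation):

# run the benchmark against the NFS-mounted share as root
# -d = target directory, -u = user to run as,
# -n = number of files (in multiples of 1024) for the create tests
bonnie++ -d /mnt/pve/kvm-images -u root -n 128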

So after a little Google ZFS keyword searching, I came across Joe Little's blog post, ZFS Log Devices: A Review of the DDRdrive X1. That got me thinking about my zpool setup. Looking at the configuration again, I realized that I'd made a mistake and added the second Intel X25M SSD as a cache (L2ARC) device instead of a log (ZIL) device. :)

Thanks to ZFS awesomeness, it was really easy to pull the SSD out of the cache and designate it as a log device: no downtime for the production system, and no weird weekend hours wasted staring at a glowing terminal console.
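
For anyone doing the same shuffle, the commands look roughly like this (device name taken from the zpool status output above; treat this as a sketch and double-check the device name against your own pool first):

# remove the SSD from the cache (L2ARC) devices
zpool remove vol0 c1t2d0
# add it back as a dedicated log (ZIL) device
zpool add vol0 log c1t2d0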

Oh man, did that make a difference in performance.

Here's what the reconfigured vol0 zpool looks like:
zpool status vol0
  pool: vol0
 state: ONLINE
 scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        vol0                       ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t5000C50010377B5Bd0  ONLINE       0     0     0
            c0t5000C5001037C317d0  ONLINE       0     0     0
            c0t5000C5001037EED7d0  ONLINE       0     0     0
            c0t5000C50010381737d0  ONLINE       0     0     0
            c0t5000C50010381BBBd0  ONLINE       0     0     0
            c0t5000C50010382777d0  ONLINE       0     0     0
            c0t5000C5001038291Fd0  ONLINE       0     0     0
            c0t5000C500103870A3d0  ONLINE       0     0     0
            c0t5000C500103871C3d0  ONLINE       0     0     0
            c0t5000C500103924E3d0  ONLINE       0     0     0
            c0t5000C500103941F7d0  ONLINE       0     0     0
        logs
          c1t2d0                   ONLINE       0     0     0
        cache
          c0t50015179591D9AEFd0    ONLINE       0     0     0
          c0t50015179591DACA1d0    ONLINE       0     0     0
        spares
          c0t5000C50010395057d0    AVAIL   

errors: No known data errors

Now ZFS can properly service all of the Linux FSYNC disk requests. Check out the Proxmox performance test improvements.

pveperf /mnt/pve/kvm-images/
CPU BOGOMIPS:      76608.87
REGEX/SECOND:      896132
HD SIZE:           7977.14 GB (xxx.xxx.xxx.xxx:/volumes/vol0/kvm-images)
FSYNCS/SECOND:     1403.21
DNS EXT:           58.84 ms
DNS INT:           1.50 ms (my.company.com)
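
If you want to confirm that synchronous writes are actually landing on the log device, watching per-vdev activity while pveperf or bonnie++ is running should show it (the 5-second interval is just an arbitrary choice):

# per-vdev I/O statistics, refreshed every 5 seconds;
# the logs section should light up during fsync-heavy workloads
zpool iostat -v vol0 5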
