ZFS FUSE Disk Striping and RAID-Z on Ubuntu 7.04 Feisty Fawn

Here’s a quick update on my previous results for striping and RAID-Z when testing ZFS on an old system with multiple SCSI drives (see the previous post for details of the system config).

First up, I dist-upgraded from Edgy to Feisty and then ran a quick Bonnie++ on a single disk with an XFS partition as a control, to see what effect going to Feisty had on its own. Under Edgy this run took almost 9m44s.
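For reference, each timed run in this post was kicked off roughly like this (a sketch: the mount point is a placeholder, and the -s/-u values just mirror what the Bonnie++ output reports):

time bonnie++ -d /mnt/test -s 496 -u nobody
# -d: directory on the filesystem under test (placeholder path)
# -s: file size in MB for the sequential tests
# -u: unprivileged user to run as when invoked as root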

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada      496M           10568  42  4768  16           10206  15 181.0   3
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   267  17 +++++ +++   253  12   263  17 +++++ +++   135   7
netstrada,496M,,,10568,42,4768,16,,,10206,15,181.0,3,16,267,17,+++++,+++,253,12,263,17,+++++,+++,135,7

real    9m25.753s
user    0m1.950s
sys     1m26.350s

A little better, not a massive change.

OK – so let’s re-run the test with the same binaries and disk setup (the 4-disk stripe) that existed under Edgy (previously took 16m32s).
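For anyone following along at home, that 4-drive stripe from the previous post would have been built along these lines (a sketch; the device names are the ones shown by zpool status further down):

# a plain dynamic stripe across the four SCSI drives - no redundancy at all
zpool create test sdb sdc sdd sde
zpool status test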

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada      496M            3159   2  1772   2            5963   4  38.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   321   5   778   8   271   3   322   5  1078  11   327   4
netstrada,496M,,,3159,2,1772,2,,,5963,4,38.0,0,16,321,5,778,8,271,3,322,5,1078,11,327,4

real    16m42.484s
user    0m2.730s
sys     0m26.990s

So no real change there either; if anything it’s slightly slower!

So next let’s try the same disk layout, but with the current snapshot of binaries (up to and including changeset 228) built under Feisty.
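For the record, updating and rebuilding zfs-fuse went roughly like this (a sketch from memory: zfs-fuse was kept in Mercurial and built with SCons at the time, but the exact install step and target names here are assumptions):

hg pull -u          # bring the working copy up to the latest changeset (228 here)
cd src && scons     # build the zfs-fuse daemon and the zpool/zfs utilities
sudo scons install  # assumed install target for copying the binaries into place
sudo zfs-fuse       # the userspace daemon has to be running before zpool/zfs will work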

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada      496M            4562  11  2065   7            4363   6  76.7   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   520   8   895  10   506   6   514   8  1069  10   484   6
netstrada,496M,,,4562,11,2065,7,,,4363,6,76.7,2,16,520,8,895,10,506,6,514,8,1069,10,484,6

real    12m29.554s
user    0m2.340s
sys     0m54.530s

Well that’s a significant change! Over 4 minutes quicker by just upgrading to the latest code.

But because the latest version is based on a newer ZFS release from Sun, we get an interesting message from zpool status.

  pool: test
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          sdb       ONLINE       0     0     0
          sdc       ONLINE       0     0     0
          sdd       ONLINE       0     0     0
          sde       ONLINE       0     0     0

errors: No known data errors

OK, so let’s take it at its word and upgrade the on-disk format. The command we want is zpool upgrade test.
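As an aside, zpool upgrade -v lists the on-disk versions a given set of binaries supports, so you can see what you’d be moving to before committing; the upgrade itself is a one-liner:

zpool upgrade -v     # list the supported on-disk versions and what each adds
zpool upgrade test   # upgrade the pool named 'test' to the newest supported version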

This system is currently running ZFS version 6.

Successfully upgraded 'test' from version 3 to version 6

That took less than 2 seconds, but then we don’t have any actual data on this array. So what happens with the new format?

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada      496M            4412  11  2063   7            4048   6  75.9   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   486   8   901   9   526   7   502   8  1142  12   554   7
netstrada,496M,,,4412,11,2063,7,,,4048,6,75.9,2,16,486,8,901,9,526,7,502,8,1142,12,554,7

real    12m40.367s
user    0m2.460s
sys     0m54.880s

Oh, slightly slower, but probably not a significant difference.

Now I’m curious about whether we are seeing any I/O scaling effect as drives are added, or whether the ZFS-FUSE code is just topping out at around 4MB/s at the moment, so here’s a 2-disk stripe.
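Shrinking the stripe is just a destroy and re-create (a sketch; note that zpool destroy throws away everything on the pool):

zpool destroy test          # drop the existing 4-drive stripe
zpool create test sdb sdc   # re-create the pool as a 2-drive stripe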

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada      496M            4467  12  2025   7            4468   7  62.1   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   516   9   879   9   440   5   553   9  1130  12   564   7
netstrada,496M,,,4467,12,2025,7,,,4468,7,62.1,1,16,516,9,879,9,440,5,553,9,1130,12,564,7

real    12m56.792s
user    0m2.230s
sys     0m55.680s

So virtually the same as the 4-disk stripe. It’s not the RAID controller either, as the XFS result shows a single drive can manage at least twice the throughput we are getting over the striped array.

OK – so last test, what effect (if any) has this had on RAID-Z performance? Previously this test took a whacking great 21m9s.
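For completeness, the RAID-Z pool for this run would have been put together along these lines (a sketch, reusing the same four drives):

zpool destroy test                        # drop the stripe from the last test
zpool create test raidz sdb sdc sdd sde   # one RAID-Z vdev across all four drives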

ZFS - 4 drive RAID-Z - Feisty binaries - New format

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada      496M            4295  11  1549   5            3595   5  41.4   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   501   8   871   9   462   5   544   9  1133  10   553   7
netstrada,496M,,,4295,11,1549,5,,,3595,5,41.4,1,16,501,8,871,9,462,5,544,9,1133,10,553,7

real    15m52.292s
user    0m2.240s
sys     0m53.930s

Wow, about 5 minutes faster than the previous version of ZFS on Edgy, and with far better resilience than simple striping (RAID-Z can survive the loss of a drive, a plain stripe cannot)!
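On the resilience front it’s also worth remembering that RAID-Z gives you self-healing via checksums: a scrub will walk the pool and repair anything that fails verification, which is what the “scrub: none requested” line in the zpool status output above refers to. A quick sketch:

zpool scrub test    # verify (and, where there is redundancy, repair) every block in the pool
zpool status test   # the scrub: line reports progress and any errors found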

5 thoughts on “ZFS FUSE Disk Striping and RAID-Z on Ubuntu 7.04 Feisty Fawn”

  1. Intriguingly, ZFS-FUSE is much faster running on top of Linux software RAID than it is using RAID-Z over raw disc devices. This suggests that a lot of its slowness is not just due to FUSE overhead. Another performance quirk that I’ve noticed was that the latency is absolutely horrible when multitasking. It wouldn’t surprise me if this one might be a FUSE limitation (e.g. if FUSE required requests to be dealt with in the order they were received). As I wrote on the ZFS-FUSE mailing list – “According to bonnie, I see 125 MB/s reads on ext3 RAID5, 65 MB/s on ZFS RAID5 (using Linux’s software RAID) and 20 MB/s on ZFS raidz (using the same raw drives). Writes are also proportionally slower.”

  2. Hello Cameron,

    It does look like ZFS-FUSE scales to around half the available disk bandwidth of a single underlying drive for some reason. So on the above server that’s a drop from 10MB/s to 4.xMB/s, for you that’s 128MB/s down to 65MB/s, and on my desktop box it’s 39MB/s down to 18MB/s (on MD RAID-1).

    What I’d love to know is: what do you get for ext3 on a single drive? My prediction, based on that theory, would be around 35-45MB/s.

    There is evidence that FUSE filesystems can match in-kernel filesystems, so I think it’s more likely that, as Riccardo says, he’s not even started on performance optimisations yet and so there is probably much to look forward to! 🙂

  3. Pingback: Comparing NTFS-3G to ZFS-FUSE for FUSE Performance at The Musings of Chris Samuel

  4. Great post!

    We’re running zfs v4 and are considering going straight up to 10. I was unaware of the “zpool upgrade” issue (probably because our version is a version or so too “old”) 😉

    Thanks for the info!

    - Mike
