Here’s a quick update on my previous results for striping and RAID-Z when testing ZFS on an old system with multiple SCSI drives (see the previous post for details of the system config).
First up, I dist-upgraded from Edgy to Feisty and then ran a quick Bonnie++ on a single disk with an XFS partition as a control, to see what effect the move to Feisty had on its own. This run previously took almost 9m44s.
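The exact command line wasn't recorded in the post, but all of the Bonnie++ runs below look like they used fast mode (the per-character columns are blank), so the invocation would have been something along these lines; the mount point is my assumption:

# a sketch only - these are real bonnie++ flags, but the target directory is a guess
#   -f skips the per-character tests (which matches the blank per-char columns below)
#   -s 496 sets the dataset size in MB, -n 16 means 16*1024 small files for the create tests
time bonnie++ -f -s 496 -n 16 -d /test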
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada     496M            10568  42  4768  16           10206  15 181.0   3
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   267  17 +++++ +++   253  12   263  17 +++++ +++   135   7
netstrada,496M,,,10568,42,4768,16,,,10206,15,181.0,3,16,267,17,+++++,+++,253,12,263,17,+++++,+++,135,7

real    9m25.753s
user    0m1.950s
sys     1m26.350s
A little better, not a massive change.
OK, so let's run the system using the same binaries and disk setup that existed under Edgy (this previously took 16m32s).
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada     496M             3159   2  1772   2            5963   4  38.0   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   321   5   778   8   271   3   322   5  1078  11   327   4
netstrada,496M,,,3159,2,1772,2,,,5963,4,38.0,0,16,321,5,778,8,271,3,322,5,1078,11,327,4

real    16m42.484s
user    0m2.730s
sys     0m26.990s
So no real change at all there either, if anything slightly slower!
So next let's try the same disk layout, but with the current snapshot of the ZFS-FUSE code (up to and including changeset 228) built under Feisty.
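For anyone wanting to repeat this, grabbing and building a snapshot went roughly as follows; the repository URL is omitted and the SCons steps are a sketch of how the project built at the time rather than a verbatim transcript:

# fetch the ZFS-FUSE development tree and jump to the snapshot under test
hg clone <zfs-fuse repository URL> zfs-fuse
cd zfs-fuse
hg update -r 228

# the project builds with SCons (directory layout may vary between snapshots)
cd src
scons
sudo scons install

# the zfs-fuse daemon must be running before any zpool/zfs commands will work
sudo zfs-fuse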
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada     496M             4562  11  2065   7            4363   6  76.7   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   520   8   895  10   506   6   514   8  1069  10   484   6
netstrada,496M,,,4562,11,2065,7,,,4363,6,76.7,2,16,520,8,895,10,506,6,514,8,1069,10,484,6

real    12m29.554s
user    0m2.340s
sys     0m54.530s
Well, that's a significant change! Over 4 minutes quicker just by upgrading to the latest code.
But because the latest version is based on a newer ZFS release from Sun, we now get an interesting message from zpool status:
  pool: test
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          sdb       ONLINE       0     0     0
          sdc       ONLINE       0     0     0
          sdd       ONLINE       0     0     0
          sde       ONLINE       0     0     0

errors: No known data errors
OK, so let's take it at its word and upgrade the on-disk format. The command we want is zpool upgrade test.
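Before taking the plunge, it's worth knowing that zpool upgrade can also tell you what you'd be getting; a quick sketch (these are all standard zpool sub-commands, nothing specific to this setup):

# list the on-disk format versions this build supports and what each one adds
zpool upgrade -v

# show which pools are still on an older format
zpool upgrade

# upgrade a single pool - once done, older software can no longer import it
zpool upgrade test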
This system is currently running ZFS version 6.

Successfully upgraded 'test' from version 3 to version 6
That took less than 2 seconds, but then we don't have any actual data on this array. So what happens with the new on-disk format?
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada     496M             4412  11  2063   7            4048   6  75.9   2
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   486   8   901   9   526   7   502   8  1142  12   554   7
netstrada,496M,,,4412,11,2063,7,,,4048,6,75.9,2,16,486,8,901,9,526,7,502,8,1142,12,554,7

real    12m40.367s
user    0m2.460s
sys     0m54.880s
Oh, slightly slower, but probably not a significant difference.
Now I'm curious whether we are seeing any I/O scaling effects across multiple drives, or whether the ZFS-FUSE code is simply topping out at around 4MB/s at the moment, so here's a 2 disk stripe.
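For reference, a plain stripe is just a pool created from bare devices (no raidz or mirror keyword), so the two-disk pool would have been built along these lines, assuming the same device names as before:

# tear down the old pool and create a two-disk stripe
zpool destroy test
zpool create test sdb sdc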
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada     496M             4467  12  2025   7            4468   7  62.1   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   516   9   879   9   440   5   553   9  1130  12   564   7
netstrada,496M,,,4467,12,2025,7,,,4468,7,62.1,1,16,516,9,879,9,440,5,553,9,1130,12,564,7

real    12m56.792s
user    0m2.230s
sys     0m55.680s
So virtually the same as the 4 disk stripe. It's not the RAID controller either, as the XFS result shows a single drive can do at least twice the throughput we are getting over the striped array.
OK, so for the last test: what effect (if any) has this had on RAID-Z performance? Previously this test took a whacking great 21m9s.
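The only change for this run is the pool layout; a sketch, assuming the same four devices as above:

# rebuild the pool as a single RAID-Z vdev across all four drives
zpool destroy test
zpool create test raidz sdb sdc sdd sde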
ZFS - 4 drive RAIDZ - Feisty binaries - New format

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
netstrada     496M             4295  11  1549   5            3595   5  41.4   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   501   8   871   9   462   5   544   9  1133  10   553   7
netstrada,496M,,,4295,11,1549,5,,,3595,5,41.4,1,16,501,8,871,9,462,5,544,9,1133,10,553,7

real    15m52.292s
user    0m2.240s
sys     0m53.930s
Wow, about 5 minutes faster than the previous version of ZFS on Edgy and giving infinitely better resilience than simple striping!
Intriguingly, ZFS-FUSE is much faster running on top of Linux software RAID than it is using RAID-Z over raw disk devices, which suggests that a lot of its slowness is not just FUSE overhead. Another performance quirk I've noticed is that latency is absolutely horrible when multitasking. It wouldn't surprise me if that one is a FUSE limitation (e.g. if FUSE required requests to be dealt with in the order they were received). As I wrote on the ZFS-FUSE mailing list: "According to bonnie, I see 125 MB/s reads on ext3 RAID5, 65 MB/s on ZFS RAID5 (using Linux's software RAID) and 20 MB/s on ZFS raidz (using the same raw drives). Writes are also proportionally slower."
Hello Cameron,
It does look like ZFS-FUSE tops out at around half the available disk bandwidth of a single underlying drive for some reason. On the above server that's 10MB/s down to 4.xMB/s, for you that's 128MB/s down to 65MB/s, and on my desktop box it's 39MB/s down to 18MB/s (on MD RAID-1).
What I'd love to know is: what do you get for ext3 on a single drive? My prediction, based on that theory, would be around 35-45MB/s.
There is evidence that FUSE filesystems can match in-kernel filesystems, so I think it’s more likely that, as Riccardo says, he’s not even started on performance optimisations yet and so there is probably much to look forward to! 🙂
Hi Cameron,
I've just reported a test of the NTFS-3G FUSE filesystem on my desktop with software RAID. It was comparable to XFS for write, rewrite and read speeds!
Pingback: Comparing NTFS-3G to ZFS-FUSE for FUSE Performance at The Musings of Chris Samuel
Great post!
We're running ZFS v4 and are considering going straight up to v10. I was unaware of the "zpool upgrade" issue (probably because our version is a version or so too "old" 😉).
Thanks for the info!
- Mike