Comparing NTFS-3G to ZFS-FUSE for FUSE Performance

I was wondering whether FUSE was the bottleneck in my various ZFS-FUSE tests, or whether the current performance issues are simply down to ZFS being very young code on Linux and the fact that Riccardo hasn’t yet started on optimisation.

As a quick reminder, here’s what JFS can do on a software RAID 1 array on this desktop:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           39388  11 24979   5           53968   6 255.4   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  9032  24 +++++ +++  8642  33  2621  18 +++++ +++   993   5
inside,2G,,,39388,11,24979,5,,,53968,6,255.4,1,16,9032,24,+++++,+++,8642,33,2621,18,+++++,+++,993,5

real    4m0.982s
user    0m0.292s
sys     0m17.201s
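
For reference, each of these result blocks is bonnie++ 1.03 run under time. The exact invocation isn’t shown in the post, but a minimal sketch that matches the 2 GB data size and 16 (×1024) file count above, with /mnt/test as a placeholder scratch directory, would be:

time bonnie++ -d /mnt/test -s 2048 -n 16   # add -u <user> if running as root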

…and here is how ZFS-FUSE compares…

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           18148   4  9957   3           28767   3 164.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3093   6  6201   8  3090   4  2906   5  8592   9  4165   6
inside,2G,,,18148,4,9957,3,,,28767,3,164.4,0,16,3093,6,6201,8,3090,4,2906,5,8592,9,4165,6

real    7m59.385s
user    0m1.140s
sys     0m16.201s

That’s of the order of half the speed. So, how does NTFS-3G compare? Here are the results:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           31222   6 14076   4           30118   2 137.5   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  1780   4 15379  10  4521   8  3276   7 16683  14  4429   5
inside,2G,,,31222,6,14076,4,,,30118,2,137.5,0,16,1780,4,15379,10,4521,8,3276,7,16683,14,4429,5

real    6m14.292s
user    0m1.032s
sys     0m14.173s

So at first blush it sits somewhere between the two, with a 6+ minute run time, but the disk write and rewrite speeds are substantially better than ZFS-FUSE’s.

But where it gets really interesting is comparing NTFS-3G with my XFS results, below:

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           31444   9 15949   4           30409   4 261.3   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3291  21 +++++ +++  2720  14  3286  22 +++++ +++   874   7
inside,2G,,,31444,9,15949,4,,,30409,4,261.3,1,16,3291,21,+++++,+++,2720,14,3286,22,+++++,+++,874,7

real    5m38.876s
user    0m0.380s
sys     0m19.645s

Here’s a table of results comparing the sequential write, rewrite and read speeds (in MB/s) of each.

                Write  Rewrite  Read
JFS (kernel)       39       25    54
XFS (kernel)       31       16    30
NTFS-3G (FUSE)     31       14    30
ZFS (FUSE)         18       10    28

So that’s pretty conclusive: we have a FUSE filesystem (which also claims not to be optimised) which can pretty much match an in-kernel filesystem.

25 thoughts on “Comparing NTFS-3G to ZFS-FUSE for FUSE Performance”

  1. Maybe XFS would look better on a drive array more powerful than a RAID-1?

    The people who use XFS for serious things seem to generally have arrays of 5 or more disks, and having many dozens of disks isn’t uncommon.

    With RAID-1 every synchronous write will block all IO, while RAID-5 with a suitably advanced controller (i.e. not Linux software RAID-5) or RAID-10 will allow multiple writes to occur at the same time.

    Of course most of the Bonnie tests won’t show this unless you use the -p/-y options (a rough sketch of such a run is at the end of this comment).

    One thing that would be really interesting to see would be test results from the same filesystem in both FUSE and kernel versions. If someone made Ext3 run via FUSE it would be very useful to measure the FUSE overhead.
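
    For example, a minimal sketch of such a parallel run (the mount points are made up, and this assumes bonnie++ 1.03’s semaphore options):

    bonnie++ -p 3                          # create a semaphore for three workers
    bonnie++ -y -d /mnt/a -s 2048 -n 16 &  # each copy waits on the semaphore...
    bonnie++ -y -d /mnt/b -s 2048 -n 16 &
    bonnie++ -y -d /mnt/c -s 2048 -n 16 &  # ...so all three start their tests together
    wait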

  2. Well, first of all it’s all I’ve got to play with and I need the resilience. All the filesystems in question are handicapped in the same way. 🙂

    One thing I’ve found recently is that with 4+P and 8+P arrays you can avoid the read-before-write RAID-5 problem, so it can make sense to do RAID-0 striping across multiples of those rather than have a single large RAID-5 array that could end up having to read some drives to work out the new parity before writing the stripe.
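
    With Linux software RAID that sort of layout might look something like this rough sketch (device names are placeholders): two 4+P RAID-5 sets, striped together with RAID-0:

    mdadm --create /dev/md1 --level=5 --raid-devices=5 /dev/sd[b-f]1      # 4 data + 1 parity
    mdadm --create /dev/md2 --level=5 --raid-devices=5 /dev/sd[g-k]1      # 4 data + 1 parity
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/md1 /dev/md2  # stripe across the two sets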

  3. Very interesting. This is one of the first useful benchmarks on ZFS I’ve seen, although it’s the FUSE port. Given that the difference between NTFS-3G and XFS is slight while the gap to JFS is far bigger, I’d love to see this all compared to Ext3 on your hardware. The reason is that I’m used to Ext3 performance. Does that make sense to you? 🙂

    Btw, the NTFS-3G website does have a benchmark of NTFS-3G, and there it was already shown that it is basically… very fast for a userspace FS.

  4. The benchmark posted on Feb 21, 2007 on the ntfs-3g website uses version 1.0 of the driver; as can be seen, versions 1.3xx and more recent implement a new algorithm for cluster allocation. Said algorithm has the advantage of fragmenting big files much less, of copying those same files much faster and of reducing CPU and memory use.
    My personal tests seem to indicate that ntfs-3g has higher CPU usage than the Microsoft ntfs driver, and is also more demanding than, say, ext3. However, it is much better at not fragmenting files than the Microsoft native file system (compared with the latest Vista version).
    Areas of optimization could be found in better scaling (right now, it works very well on 40 GB partitions, but when I copy stuff onto a 120 GB partition, it really drags on my CPU) and more automated file management (automatic detection of the FS code page). Further optimization could be found in, say, automatic block allocation of frequently modified files at the top of the drive, and dynamic priority changes depending on system load and FIFO fill rate.
    While it is compatible with Windows logical RAID partitions and is able to mount them under Linux, it has seen no optimization that I know of in this area; the block allocator could be (if it isn’t already) threaded so as to accelerate reads and writes on logical RAID setups.
    Don’t get me wrong, I think the latest version (1.516 at the time of writing) is fantastic – but when the authors mention that the driver could be optimized, please consider that a file system is almost like a cat; there are multiple ways to skin one.

  5. Just for the record, what’s the time for extracting a recent kernel tarball? Filesystems do kinda funny things when it comes to these …

  6. Hi Jan, I’d tried that when I first started playing with ZFS but hadn’t revisited it since; plus I was extracting from a compressed archive, which added some extra overhead.

    So here are some new numbers for 2.6.19.1 for ZFS (with and without compression), XFS and JFS.

    Extracting with cat /tmp/linux-2.6.19.1.tar | time tar xf -

    XFS: 40s
    JFS: 27s
    ZFS without compression: 86s
    ZFS with compression: 77s

    Removing the resulting source code tree with time rm -rf linux-2.6.19.1

    XFS: 19s
    JFS: 25s
    ZFS without compression: 20s
    ZFS with compression: 15s

    There you go!
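
    (Scripted up, those runs amount to roughly the following; the mount points are placeholders for wherever each filesystem happens to live:)

    for fs in /mnt/xfs /mnt/jfs /mnt/zfs /mnt/zfs-compressed; do
      cd "$fs" || continue
      cat /tmp/linux-2.6.19.1.tar | time tar xf -
      time rm -rf linux-2.6.19.1
      cd /
    done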

  7. I’d guess JFS does not have barrier support (hence appearing faster). You could compare mount -t xfs -o nobarrier. How zfs-fuse does (or doesn’t do) barriers, no idea 🙂

  8. Hmm, all my XFS file systems report things like:

    Filesystem "dm-9": Disabling barriers, not supported by the underlying device

    Would using the nobarrier mount option make a difference in that case?

  9. Hi Jan,

    I guess what I’m asking is: would there be any performance difference between me explicitly disabling write barriers at mount time and XFS disabling write barriers itself at mount time because the underlying devices don’t support them?

  10. If you can prove that explicitly disabling write barriers gives different performance to having write barriers disabled because the device doesn’t support them then that would be a bug. It seems quite unlikely that there would be such a bug, but if you find such a bug then please report it.

    Apparently one of the worst things that can happen is when a device claims to support write barriers but doesn’t actually do so. Then the filesystem driver jumps through hoops to manage write barriers (with some performance cost) but the reliability is not provided.

  11. I’m just saying, if you run a benchmark, either make sure that every fs has write barriers on (and working), or off, to get comparable results, along the lines of the check sketched below.
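
    To make that explicit before re-running the numbers, something along these lines (device and mount point are placeholders) would show whether barriers are actually in use and then disable them explicitly:

    mount -t xfs /dev/md0 /mnt/test
    dmesg | grep -i barrier        # e.g. "Disabling barriers, not supported by the underlying device"
    umount /mnt/test
    mount -t xfs -o nobarrier /dev/md0 /mnt/test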

  12. Pingback: hotsolaris

  13. Nice to see these benchmarks! 🙂

    NTFS-3G is indeed completely unoptimized. I briefly mentioned the current major bottlenecks on http://lwn.net/Articles/238812/ but of course there are many more.

    The similarity between the performance of XFS and NTFS-3G could be because I use XFS, and currently that’s the “baseline” performance during NTFS-3G development. Anything much worse than that is considered to be a usability bug, not a performance problem, so it gets more attention.

    The comparison is valid when I/O is the real bottleneck, not the CPU. If the processor is too slow compared to the disk (e.g. embedded devices or high I/O throughput servers) then user space file systems will suffer a lot, since the performance support infrastructure in the kernel and FUSE hasn’t been developed and optimized yet.

    The new, unfinished, unoptimized ntfs-3g block allocator since version 1.328 helps if the volume is fragmented. I also noticed that Microsoft’s NTFS block allocator is fairly inefficient.

    About the “scalability” (40 -> 120 GB disk experience). I think the reason for the high CPU usage is what I mentioned above: the bigger the disk, the faster it is, so the CPU can be used more, which results in higher CPU usage. Thanks Mitch74 for the ideas.

    Automatic detection of the FS code page: highly OS and environment specific. If the distribution or OS vendor sets it up properly before mounting an NTFS volume then the driver will work fine without the ‘locale=’ workaround mount option (an example is sketched at the end of this comment).

    As for zfs-fuse: data should stay in the kernel, hence some of the zfs code should also go there (e.g. end-to-end checksumming).
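
    For reference, the ‘locale=’ workaround is just a mount option; a made-up example (device and mount point are placeholders) would be:

    ntfs-3g /dev/sda1 /mnt/windows -o locale=en_US.UTF-8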

  14. Hi Chris,

    I don’t think the amount of code which would need to be rewritten under a different licence for the optimizations to be included in the kernel would be significant compared to a full rewrite, which is practically impossible and would take about “forever”. Let’s say full rewrite vs optimization: 100,000 vs 1,000 lines of code. In fact, some of the code which needs to be in the kernel is already under GPL2 (e.g. checksum, compression).

  15. That’s very interesting. Sounds like we need to see about properly optimising the FUSE port then! 🙂

    Of course, having a proper in-kernel implementation would still be the best solution, as then you’d be able to boot off of ZFS and therefore use it exclusively. Still, it’d be nice to be able to store at least my data files on a ZFS partition.

  16. You can boot from an NTFS partition using ntfs-3g, because the kernel can read NTFS by default (thus load its image, then mount the boot partition read-only, then load ntfs-3g, then remount it as read/write). If there is a read-only implementation of ZFS in-kernel, then it can already boot.

  17. No need for in-kernel NTFS support to boot Linux from NTFS. Grub supports this natively (ZFS too) and NTFS-3G implemented bmap, which is needed by LILO. Several distributions use NTFS-3G for the root file system, for example WUBI (Windows Ubuntu Installer). When the kernel is booted with an initrd or initramfs it can mount a file system and pivot_root to it so that it becomes the root file system.

    There are some minor issues, e.g. the order in which subsystems and processes are terminated during shutdown, but they are solvable and being worked on.
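
    A very rough sketch of what such an initramfs /init might look like (the device, paths and the use of switch_root rather than pivot_root are all illustrative assumptions):

    #!/bin/sh
    # assumes the initramfs already contains ntfs-3g, fusermount and a populated /dev (incl. /dev/fuse)
    mount -t proc proc /proc
    mount -t sysfs sysfs /sys
    ntfs-3g /dev/sda2 /newroot            # mount the NTFS root via the FUSE driver
    exec switch_root /newroot /sbin/init  # hand control to the real init on the NTFS root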

  18. Er, booting from ZFS/FUSE under Linux already works folks!

    Riccardo commented on that feat saying:

    As a side note, it’s interesting to know that Linux is the first operating system that can boot from RAID-10, RAID-Z or RAID-Z2 ZFS pools (Solaris can only boot from single-disk or RAID-1 pools)

    🙂

  19. Pingback: Between the Lines mobile edition

  20. Pingback: How to implement ZFS on FUSE | techinterplay

  21. Pingback: Tried ZFS on Linux? – LINUX For You Magazine

  22. Pingback: Super fast Ext4 filesystem | c1p1
