XFS, JFS and ZFS/FUSE Benchmarks on Ubuntu Feisty Fawn

Having upgraded to the Feisty beta I thought it would be fun to see what (if any) effect it had on filesystem performance (especially given my previous aide-mémoire).

For these tests I stuck to my three favourites: JFS (from IBM), XFS (from SGI) and ZFS (from Sun, ported to Linux using FUSE by Ricardo Correia because of Sun’s GPL-incompatible license). This is a follow-on from the slew of earlier ZFS & XFS benchmarking that I reported on previously (here, here, here and here).

Summary: under Bonnie++, JFS is fastest, XFS next and ZFS slowest, and Feisty made both XFS and ZFS go faster (sadly I didn’t record my previous JFS results for comparison).

The fact that ZFS is slowest of the three is not surprising, as the Linux FUSE port hasn’t yet been optimised (Ricardo is concentrating on just getting it running) and it is also hampered by running in user space. That said, it still manages a respectable speed on this hardware and offers functionality that makes it worth running for me.

So, having said all that, here we go with the benchmarks!

First off was XFS with atime set. With atime, every time a file is accessed the time of that access is recorded against the file’s inode, which means even a plain read incurs an occasional write to keep that timestamp correct.
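
To make that concrete, here is a minimal Python sketch of the behaviour (the path is hypothetical; point it at whichever mount you are testing): on an atime mount the timestamp moves after a read, on a noatime mount it stays put.

import os, time

# Hypothetical test file on the filesystem being benchmarked.
path = "/mnt/test/some-file"

before = os.stat(path).st_atime
with open(path, "rb") as f:
    f.read()                     # a plain read, no explicit write
after = os.stat(path).st_atime

# With atime the kernel has updated the inode's access timestamp
# (an extra write); with noatime it is left alone.
print("atime before:", time.ctime(before))
print("atime after: ", time.ctime(after))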

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           31444   9 15949   4           30409   4 261.3   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3291  21 +++++ +++  2720  14  3286  22 +++++ +++   874   7
inside,2G,,,31444,9,15949,4,,,30409,4,261.3,1,16,3291,21,+++++,+++,2720,14,3286,22,+++++,+++,874,7

real    5m38.876s
user    0m0.380s
sys     0m19.645s

Pretty respectable. Interestingly, although the whole test took less time than my previous runs, the output and input speeds are lower; the difference is that the file create and delete speeds have improved dramatically.
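
Since Bonnie++ appends that machine-readable CSV line to each run (the “inside,2G,…” lines quoted above), comparing runs is easy to script. A minimal sketch, with the field positions taken from the human-readable table (“+++++” just means the test finished too quickly to be measured):

import csv

# Positions of a few headline fields in the Bonnie++ 1.03 CSV trailer:
# sequential block write, sequential block read and random seeks.
FIELDS = {"block_write_k": 4, "block_read_k": 10, "seeks_per_sec": 12}

def summarise(csv_line):
    row = next(csv.reader([csv_line]))
    return {name: row[idx] for name, idx in FIELDS.items()}

xfs_atime = "inside,2G,,,31444,9,15949,4,,,30409,4,261.3,1,16,3291,21,+++++,+++,2720,14,3286,22,+++++,+++,874,7"
print(summarise(xfs_atime))
# {'block_write_k': '31444', 'block_read_k': '30409', 'seeks_per_sec': '261.3'}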

Now we look at XFS with noatime set, avoiding the write I mentioned previously.
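
As an aside, if you want to confirm which atime behaviour a mount actually has, the active options are listed in /proc/mounts; a quick Python sketch (the mount point is hypothetical):

mount_point = "/mnt/test"   # hypothetical benchmark mount

with open("/proc/mounts") as mounts:
    for line in mounts:
        device, mpoint, fstype, options = line.split()[:4]
        if mpoint == mount_point:
            # prints e.g. "xfs mounted with: rw,noatime"
            print(fstype, "mounted with:", options)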

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           31176   8 16382   4           31142   4 249.9   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3202  21 +++++ +++  2713  16  3125  22 +++++ +++   852   7
inside,2G,,,31176,8,16382,4,,,31142,4,249.9,1,16,3202,21,+++++,+++,2713,16,3125,22,+++++,+++,852,7

real    5m36.508s
user    0m0.324s
sys     0m19.489s

Much the same, with a small increase in the block read speed.

So how does JFS compare? First we run the test with atime set.

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           39388  11 24979   5           53968   6 255.4   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  9032  24 +++++ +++  8642  33  2621  18 +++++ +++   993   5
inside,2G,,,39388,11,24979,5,,,53968,6,255.4,1,16,9032,24,+++++,+++,8642,33,2621,18,+++++,+++,993,5

real    4m0.982s
user    0m0.292s
sys     0m17.201s

A cool 90 seconds faster than either XFS run, with much better performance all round except for the random create test!

Now what about JFS with noatime?

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           37431  11 25047   5           55065   7 307.3   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 10794  30 +++++ +++  8107  29  2006  15 +++++ +++   924   4
inside,2G,,,37431,11,25047,5,,,55065,7,307.3,1,16,10794,30,+++++,+++,8107,29,2006,15,+++++,+++,924,4

real    3m59.514s
user    0m0.380s
sys     0m17.417s

Again a marginal difference, if any, over the run with atime, so basically I’m going to stick with atime from now on!

So those are the two in-kernel filesystems that I use regularly; what about ZFS?

First of all I ran the test using the binaries I built previously under Edgy when the 0.4.0 beta 1 came out, so the only differences would be related to FUSE and the kernel.

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           18317   4  9642   3           28605   4 165.8   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3047   6  7038   8  3227   5  2603   4  6810   8  2607   4
inside,2G,,,18317,4,9642,3,,,28605,4,165.8,0,16,3047,6,7038,8,3227,5,2603,4,6810,8,2607,4

real    8m7.870s
user    0m1.056s
sys     0m16.681s

Now that’s over 2 minutes faster to run the same test (the last one took 10m 12s)! The most obvious differences are large increases in the rewrite and read speeds.

Next I rebuilt the ZFS/FUSE binaries with the Feisty GCC 4.1.2 packages to see if that would have an effect.

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           18148   4  9957   3           28767   3 164.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  3093   6  6201   8  3090   4  2906   5  8592   9  4165   6
inside,2G,,,18148,4,9957,3,,,28767,3,164.4,0,16,3093,6,6201,8,3090,4,2906,5,8592,9,4165,6

real    7m59.385s
user    0m1.140s
sys     0m16.201s

Marginal differences; given that this is running in user space, it could just be other system activity getting in the way.

Next I tried the Gentoo approach, optimising the binaries to hell and back with the GCC options “-march=pentium4 -msse2 -mfpmath=sse,387” to see if that would help at all (there is some maths in there, as ZFS checksums your data on disk to catch problems).
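
For a sense of what that maths is: ZFS verifies every block against a checksum, by default a Fletcher-style sum (with SHA-256 as an option). Below is a rough Python sketch of a fletcher4-style sum, purely to illustrate the sort of arithmetic those SSE flags were aimed at; it is not the actual ZFS code path.

import struct

def fletcher4(data):
    # Four running 64-bit accumulators over 32-bit little-endian words,
    # in the style of ZFS's fletcher4 checksum (illustrative only).
    a = b = c = d = 0
    data = data + b"\0" * (-len(data) % 4)   # pad to whole words for the sketch
    for (word,) in struct.iter_unpack("<I", data):
        a = (a + word) & 0xFFFFFFFFFFFFFFFF
        b = (b + a) & 0xFFFFFFFFFFFFFFFF
        c = (c + b) & 0xFFFFFFFFFFFFFFFF
        d = (d + c) & 0xFFFFFFFFFFFFFFFF
    return (a, b, c, d)

print(fletcher4(b"some block of data"))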

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
inside           2G           18283   4  9881   3           28210   4 166.4   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  2996   5  6638   6  2531   4  2888   5  6957   9  3077   4
inside,2G,,,18283,4,9881,3,,,28210,4,166.4,0,16,2996,5,6638,6,2531,4,2888,5,6957,9,3077,4

real    8m3.743s
user    0m0.880s
sys     0m16.609s

Went backwards a bit then, but not enough to be significant. Certainly didn’t go any faster!