SMP implementation of bzip2

Here’s something of a find, courtesy of Jordan Mendler on the ZFS/FUSE mailing list: pbzip2, an SMP implementation of bzip2 by Jeff Gilchrist. From its description:

PBZIP2 is a parallel implementation of the bzip2 block-sorting file compressor that uses pthreads and achieves near-linear speedup on SMP machines. The output of this version is fully compatible with bzip2 v1.0.2 or newer (ie: anything compressed with pbzip2 can be decompressed with bzip2). PBZIP2 should work on any system that has a pthreads compatible C++ compiler (such as gcc). It has been tested on: Linux, Windows (cygwin & MinGW), Solaris, Tru64/OSF1, HP-UX, and Irix.
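To make the compatibility claim a bit more concrete, here is a minimal Python sketch of the general idea: split the input into chunks, compress each chunk independently on a pool of threads, and concatenate the resulting bzip2 streams. It is an illustration under my own assumptions, not pbzip2's actual code (which is C++ with pthreads). The function name `parallel_bz2_compress`, the 900,000-byte chunk size and the test data are all hypothetical choices.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Assumption: chunk size chosen to resemble a 900k bzip2 block.
CHUNK_SIZE = 900 * 1000


def parallel_bz2_compress(data: bytes, workers: int = 4) -> bytes:
    """Compress fixed-size chunks independently and concatenate the streams."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the streams stay in sequence.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


# Hypothetical test data: about 2 MB, so it spans three chunks.
sample = bytes(range(256)) * 8000
packed = parallel_bz2_compress(sample)

# An ordinary decompressor reads the concatenated streams back as one file
# (Python's bz2 module accepts multi-stream input since version 3.3).
restored = bz2.decompress(packed)
print(restored == sample)
```

Because every chunk becomes a complete bzip2 stream of its own, no chunk depends on its neighbours, which is what lets the work spread across cores.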

pbzip2 is packaged in Ubuntu (in Universe). Testing it against the standard bzip2 on this quad core Intel box (2.4GHz with 4GB RAM), using a 712MB tar file, showed pretty impressive performance!

Standard bzip2 compression:

chris@quad:/tmp$ time bzip2 -v backup-20020122.tar
backup-20020122.tar: 1.531:1, 5.227 bits/byte, 34.66% saved, 746250240 in, 487572628 out.

real 2m32.331s
user 2m29.593s
sys 0m0.976s

Standard bzip2 decompression:

chris@quad:/tmp$ time bunzip2 -v backup-20020122.tar.bz2
backup-20020122.tar.bz2: done

real 0m56.215s
user 0m54.519s
sys 0m1.136s

Parallel bzip2 compression:

chris@quad:/tmp$ time pbzip2 -v backup-20020122.tar
Parallel BZIP2 v1.0.1 - by: Jeff Gilchrist []
[Mar. 20, 2007] (uses libbzip2 by Julian Seward)

# CPUs: 4
BWT Block Size: 900k
File Block Size: 900k
File #: 1 of 1
Input Name: backup-20020122.tar
Output Name: backup-20020122.tar.bz2

Input Size: 746250240 bytes
Compressing data...
Output Size: 487531723 bytes

Wall Clock: 41.335455 seconds

real 0m41.338s
user 2m40.962s
sys 0m2.248s

Parallel bzip2 decompression:

chris@quad:/tmp$ time pbzip2 -v -d backup-20020122.tar.bz2
Parallel BZIP2 v1.0.1 - by: Jeff Gilchrist []
[Mar. 20, 2007] (uses libbzip2 by Julian Seward)

# CPUs: 4
File #: 1 of 1
Input Name: backup-20020122.tar.bz2
Output Name: backup-20020122.tar

BWT Block Size: 900k
Input Size: 487531723 bytes
Decompressing data...
Output Size: 746250240 bytes

Wall Clock: 18.078961 seconds

real 0m18.081s
user 1m3.516s
sys 0m1.776s

So that’s almost a 3.7x speedup over the single CPU version for compression (2m32s down to 41s) and about 3.1x for decompression (56s down to 18s), not bad!
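For anyone who wants to check the arithmetic, the short Python sketch below recomputes the speedup from the `real` times in the runs above, along with the parallel efficiency (speedup divided by the number of CPUs). The helper names `to_seconds` and `speedup` are my own; only the timings and the CPU count come from the output.

```python
CPUS = 4  # from the "# CPUs: 4" line in the pbzip2 output


def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a 'XmY.YYYs' time as printed by `time` into seconds."""
    return minutes * 60 + seconds


def speedup(serial: float, parallel: float) -> float:
    """Ratio of single-threaded wall-clock time to parallel wall-clock time."""
    return serial / parallel


# (bzip2 real time, pbzip2 real time), copied from the runs above
timings = {
    "compression": (to_seconds(2, 32.331), to_seconds(0, 41.338)),
    "decompression": (to_seconds(0, 56.215), to_seconds(0, 18.081)),
}

results = {name: speedup(serial, parallel)
           for name, (serial, parallel) in timings.items()}

for name, factor in results.items():
    efficiency = factor / CPUS
    print(f"{name}: {factor:.1f}x speedup, {efficiency:.0%} efficiency on {CPUS} CPUs")
```

Decompression scales a little less well than compression in this test, which is why it is worth working out the two figures separately.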

Oh, and yes, there is an MPI version available too, called mpibzip2. 🙂