pbzip2 isn't faster

Ryan Schmidt ryandesign at macports.org
Fri Apr 4 02:49:04 PDT 2014


On Apr 4, 2014, at 03:33, René J.V. Bertin wrote:

> On Apr 04, 2014, at 10:19, Ryan Schmidt wrote:
> 
>> While waiting minutes for clang-3.5 to install (compress) and then activate (decompress) a 600MB archive, I wondered why I was sitting here waiting for a single-threaded process to complete when I have a multi-core Mac.
> 
> Is that 600MB raw, or 600MB when compressed?

610 MB compressed. clang is enormous.


>> Has anybody successfully achieved the promised parallel operation of pbzip2 on OS X? If so, I wonder if it depends on the OS X version or the compiler used. I’m on OS X 10.9.2 with Xcode 5.1’s Apple LLVM version 5.1 (clang-503.0.38) (based on LLVM 3.4svn).
> 
> The right question to ask is in which cases pbzip2 is faster, if any. A compressed file is essentially a 1D string that isn't segmented the way multimedia data is (how common is it to use multiple threads to [de]compress audio?). I may be wrong, but for now I'm not at all surprised that parallelising the decompression of such data entails a lot of overhead, especially if it also means making the disk seek many more times.

The homepage says "PBZIP2 is a parallel implementation of the bzip2 block-sorting file compressor that uses pthreads and achieves near-linear speedup on SMP machines."

If this didn’t actually work, there would be no reason for the program to exist, yet it has existed for over a decade.
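
The block-level parallelism that the homepage quote describes can be sketched as follows. This is a minimal Python illustration, not pbzip2's actual code: it assumes the input is split into fixed-size chunks, that each chunk is compressed as an independent bzip2 stream, and that the streams are simply concatenated. The name parallel_bz2_compress, the chunk size, and the worker count are hypothetical choices made for the example.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

# Hypothetical chunk size, picked to match bzip2's largest block size (900 kB).
CHUNK_SIZE = 900 * 1000


def parallel_bz2_compress(data: bytes, workers: int = 4,
                          chunk_size: int = CHUNK_SIZE) -> bytes:
    """Compress data as a concatenation of independent bzip2 streams."""
    # Cut the input into pieces that can be compressed without
    # knowing anything about each other.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    # Assumption: CPython's bz2 module releases the GIL while compressing,
    # so the worker threads can occupy several cores at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the streams stay in sequence.
        streams = list(pool.map(bz2.compress, chunks))

    return b"".join(streams)


if __name__ == "__main__":
    original = b"clang is enormous. " * 500_000
    packed = parallel_bz2_compress(original)
    # bz2.decompress accepts concatenated streams (Python 3.3 and later).
    assert bz2.decompress(packed) == original
    print(len(original), "->", len(packed), "bytes")
```

Because the output is a series of independent streams, a decompressor can in principle hand each stream to a different thread. A file written as one continuous stream offers no such obvious split points.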

> (Have you tried comparing [de]compression from one disk to another, or using an SSD?)

I am using an SSD. I have also tried decompressing to /dev/null, with no change in speed.
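
A test of the same kind, decompressing while throwing the output away so that disk writes cannot be the bottleneck, can be sketched in Python. This is only an illustration under stated assumptions: it uses Python's single-threaded bz2 module rather than pbzip2, and the function name time_decompress_discarding and the 1 MiB read size are hypothetical.

```python
import bz2
import time


def time_decompress_discarding(path, read_size=1 << 20):
    """Decompress the file at path, discard the output, and
    return (decompressed_bytes, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    # bz2.open reads the compressed file and yields decompressed bytes;
    # nothing is written anywhere, much like redirecting to /dev/null.
    with bz2.open(path, "rb") as stream:
        while True:
            block = stream.read(read_size)
            if not block:
                break
            total += len(block)
    return total, time.perf_counter() - start
```

Dividing the byte count by the elapsed time gives a throughput figure that reflects only reading and decompressing, which makes it easier to tell a CPU-bound run from a disk-bound one.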

> Also, decompression tends to be so much cheaper than compressing that the parallel overhead will count even more.


