Using xz by default for compression

René J.V. Bertin rjvbertin at gmail.com
Mon Jan 26 02:34:11 PST 2015


On Monday January 26 2015 04:06:34 Ryan Schmidt wrote:

> 
> I forgot about that as well. But what compression algorithm does hfscompression use, and why are you so sure it will save "way more disk space than" xz? Have we tested that? Compressed disk images, for example, originally used zlib compression, and as of Tiger support bzip2 compression, but xz would be smaller than both of those. Granted hfscompression was introduced after that, in Snow Leopard, so it may well use a more advanced algorithm, I don't know.
> 

Hard to say. xz isn't meant to be used transparently, so it can afford a less favourable speed/compression trade-off; the type of compression used by `bsdtar -x --hfsCompression` has to work transparently and clearly doesn't make that trade-off. NB: ZFS and btrfs both support several on-the-fly compression algorithms; HFS+ could well do so, too.

I had the same question as you, for a second. The argument here is of course that you'd want both the highest possible compression for the software archives and, if possible, the smallest possible footprint for the active port. So xz would apply to the archive tarball, and HFS compression would reduce the space used by the active version, thereby reducing the footprint of the entire MacPorts tree (except for those parts containing tarballs).

As an example: the Qt 5.4.0 source tree takes up 1.7 GB normally, and an impressive 400 MB when extracted with --hfsCompression, at the cost of a 2x slower extraction.
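
For a sense of scale, the sizes quoted above can be turned into a ratio. This is only a back-of-the-envelope sketch in Python: it takes the rounded figures at face value and assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
# Rounded sizes quoted above; decimal units (1 GB = 1000 MB) are an assumption.
plain_mb = 1700.0       # Qt 5.4.0 source tree, extracted normally
compressed_mb = 400.0   # same tree extracted with --hfsCompression

ratio = plain_mb / compressed_mb            # how many times smaller
saving = 1.0 - compressed_mb / plain_mb     # fraction of disk space saved

print(f"compression ratio: {ratio:.2f}x")
print(f"space saved: {saving:.0%}")
```

With these inputs the tree ends up a bit over four times smaller, i.e. roughly three quarters of the space is saved.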

Usage performance examples, timed with tcsh's time command:

Without compression, 1st time
#> time fgrep QGenericUnixServices -R qt-everywhere-opensource-src-5.4.0/
snip
> 3.202 user_cpu 21.238 kernel_cpu 1:49.77 total_time 22.2%CPU {0W 0X 0D 0K 6875136M 0F 63303R 6572I 7752O 0r 0s 0k 171188w 36188c}
Repeated immediately
> 1.366 user_cpu 2.381 kernel_cpu 0:04.89 total_time 76.4%CPU {0W 0X 0D 0K 6895616M 0F 67970R 1I 372O 0r 0s 0k 5272w 482c}

With compression, 1st time
#> time fgrep QGenericUnixServices -R qt-everywhere-opensource-src-5.4.0/
> 1.875 user_cpu 36.627 kernel_cpu 1:54.56 total_time 33.5%CPU {0W 0X 0D 0K 6834176M 49F 72987R 26468I 13928O 0r 0s 0k 74067w 14795c}
Repeated immediately
> 1.618 user_cpu 15.632 kernel_cpu 0:46.32 total_time 37.2%CPU {0W 0X 0D 0K 6846464M 0F 71258R 25940I 5O 0r 0s 0k 48798w 7706c}

So there is little to no performance hit when reading from disk as long as the system isn't CPU bound, but for some reason the file/disk cache is less effective.
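
The comparison can be made explicit by converting the total_time fields of the four runs above into seconds. The following is a minimal Python sketch; the helper name `to_seconds` and the dictionary layout are illustrative choices, and only the total_time values are taken from the transcripts.

```python
def to_seconds(stamp: str) -> float:
    """Convert a tcsh total_time such as '1:49.77' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

# total_time values copied from the runs above: (first run, immediate repeat)
runs = {
    "plain": ("1:49.77", "0:04.89"),
    "hfsCompression": ("1:54.56", "0:46.32"),
}

totals = {name: tuple(to_seconds(t) for t in pair) for name, pair in runs.items()}

# Slowdown factor of the compressed tree relative to the plain one
cold_ratio = totals["hfsCompression"][0] / totals["plain"][0]
warm_ratio = totals["hfsCompression"][1] / totals["plain"][1]

print(f"first run: {cold_ratio:.2f}x")
print(f"repeated:  {warm_ratio:.2f}x")
```

On these numbers the first run is only about 4% slower with compression, while the repeated run is roughly 9.5 times slower, which is what the remark about the file/disk cache refers to. These are single measurements on one machine, so the ratios are indicative at best.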

R.

