Thoughts on switching our archive compression method

Ryan Schmidt ryandesign at macports.org
Wed Nov 11 17:25:36 UTC 2020


On Sep 22, 2020, at 05:58, Ryan Schmidt wrote:

> It would be nice if we could easily switch our precompiled archives from bzip2-compressed tarballs (tbz2) to better compression methods as they become available. For example, xz-compressed tarballs (txz) would be better today. OS X 10.9 Mavericks and later has built-in support for xz compression (https://trac.macports.org/ticket/56237) so we could use that to avoid needing to use the xz port or needing to bundle a copy of xz (https://trac.macports.org/ticket/52000).
> 
> Currently, users have to specify what archive type they want for each server listed in archive_sites.conf. The default archive_sites.conf fortunately has everything commented out, so users get the MacPorts built-in defaults, so we could change the behavior in the future and users would automatically get it.
> 
> We could remove the compression type setting in archive_sites.conf. (Or there could be a new value of the compression type setting that means to do the following.) And instead of having an rmd160 file for each archive, we could have an information file, encoded in any convenient format such as JSON or plist or whatever we're using for the PortIndex. MacPorts would fetch the information file first, and it would contain a field stating the compression format (filename extension) for this archive and a field for its rmd160 signature. From this MacPorts can deduce what archive filename to download. It would also give us a place to store any additional information about archives that we might want to store in the future.
> 
> If that were in place, along with support in base for decompressing xz without needing the xz port, then we could begin compressing new archives for 10.9 and later with xz without needing to recompress the entire collection of existing archives. Over time we can do that as well, of course.

I wasn't so much asking whether we should switch archive types; I had taken it as read that we had decided years ago that yes, we did want to switch to something with better compression, but that making the switch would be difficult. My intention in this thread was to propose a method for making that switch less difficult, by enabling us to switch on an archive-by-archive basis, and I was hoping to get feedback on whether anyone sees any problems with the proposal.

Even if we don't want to switch our own archives, users might set up their own collections of archives and want to change their format; see https://trac.macports.org/ticket/60269 for a case of that.

To maintain compatibility with both old and new MacPorts versions, for a time we would have to generate both the rmd160 file and the new info file, but that's easy to do and the files would be small. We would also have to generate new info files from the current rmd160 files for all existing archives.
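To make the info file idea a little more concrete, here is a rough sketch in Python (not what base would actually use; it is only for illustration). It assumes a JSON encoding, and the field names "archive_type" and "rmd160" and the helper names make_info and archive_filename are all hypothetical, since nothing about the format has been decided. It also assumes the contents of the existing rmd160 file can be treated as opaque bytes and stored base64-encoded.

```python
import base64
import json


def make_info(archive_type: str, signature: bytes) -> str:
    """Build the text of a (hypothetical) info file for one archive.

    archive_type is the archive's filename extension, e.g. "tbz2" or "txz".
    signature is the content of the existing rmd160 file, kept as opaque bytes.
    """
    info = {
        "archive_type": archive_type,
        "rmd160": base64.b64encode(signature).decode("ascii"),
    }
    return json.dumps(info, sort_keys=True)


def archive_filename(base_name: str, info_text: str) -> str:
    """Deduce which archive file to download from the info file's contents."""
    info = json.loads(info_text)
    return f"{base_name}.{info['archive_type']}"


# Converting an existing archive: its rmd160 file becomes an info file
# that records the current format, tbz2.
info_text = make_info("tbz2", b"...existing signature bytes...")
print(archive_filename("zlib-1.2.11_0.darwin_19.x86_64", info_text))
```

With something like this, a new archive compressed with xz would only differ by having "txz" in its info file, and the client would not need a per-server compression type setting to find it.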

One problem is that the archive stays on the user's computer and MacPorts needs to know how to decompress it when the port is activated. Ideally we would solve that by implementing a long-standing feature request: giving MacPorts the ability to decompress any file automatically, without needing to be told the compression type or what commands to use. This would have other benefits, like letting MacPorts automatically decompress all distfiles of a port even if they have different compression formats.
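One common way to do that kind of automatic detection is to look at the first few bytes of the file, since bzip2, xz and gzip streams each start with a fixed signature. The following is only a sketch of that technique in Python, not a proposal for how base should implement it; the function names detect_compression and decompress_any are made up for this example, and it handles the data in memory for brevity.

```python
import bz2
import gzip
import lzma
from typing import Optional

# Leading "magic" bytes that identify each compressed stream format.
MAGIC = (
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x1f\x8b", "gzip"),
)

DECOMPRESSORS = {
    "bzip2": bz2.decompress,
    "xz": lzma.decompress,
    "gzip": gzip.decompress,
}


def detect_compression(header: bytes) -> Optional[str]:
    """Return the format name matching the leading bytes, or None if unknown."""
    for magic, name in MAGIC:
        if header.startswith(magic):
            return name
    return None


def decompress_any(data: bytes) -> bytes:
    """Decompress data whose format is detected from its content, not its name."""
    name = detect_compression(data)
    if name is None:
        # No known signature: assume the data is not compressed.
        return data
    return DECOMPRESSORS[name](data)


# The same call works no matter which format was used to compress.
payload = b"archive contents"
for compress in (bz2.compress, lzma.compress, gzip.compress):
    assert decompress_any(compress(payload)) == payload
```

Because the decision is made from the file's content, the archive on the user's disk could be in any supported format at activation time, and the filename extension would not matter.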



