[MacPorts] #67336: BSD tar can create corrupted archives on Catalina, Big Sur, Monterey, Ventura

MacPorts noreply at macports.org
Fri May 5 02:13:23 UTC 2023


#67336: BSD tar can create corrupted archives on Catalina, Big Sur, Monterey,
Ventura
---------------------+-------------------------------------------------
  Reporter:  catap   |      Owner:  (none)
      Type:  defect  |     Status:  new
  Priority:  Normal  |  Milestone:
 Component:  base    |    Version:  2.8.1
Resolution:          |   Keywords:  catalina, bigsur, monterey, ventura
      Port:          |
---------------------+-------------------------------------------------

Comment (by ryandesign):

 I do not think we should be attempting to switch the archive type of
 individual ports to avoid this problem. As [comment:11 I said], we never
 foresaw the need for such a thing so there is no sanctioned way to do that
 in MacPorts base. `portarchivetype` is not in the documented list of
 options you can set in Portfiles; it's documented as something you can set
 in macports.conf. As [comment:13 I said], I didn't know what the
 consequences of switching `portarchivetype` in ports might be, and
 [comment:19 now we see] that a consequence is that the user cannot receive
 a binary from our servers.

 Further research tells me the issue is very intermittent, possibly relates
 to the creation of APFS
 [https://eclecticlight.co/2021/03/29/sparse-files-are-common-in-apfs/ sparse files],
 and that LLVM tools create such sparse files. I'm not sure what was meant
 by "LLVM tools", but if it means
 anything that uses LLVM, that would include clang, and we've certainly
 seen the problem with compiled software created by clang like the `mongod`
 binary and the pari library. The issue has also been seen on systems as
 far back as Catalina. I did not find a report of the problem on Mojave or
 High Sierra, but they can use APFS too, so the problem might affect them
 as well. Therefore, the issue has the potential to affect thousands of
 our ports, at random, on several years' worth of OS versions. And yet
 you've just brought the problem to our attention now. That means the
 problem occurs infrequently enough that we didn't notice it before, and
 that just rebuilding the port, without any further changes, would probably
 have succeeded. Of course, even if it rebuilt successfully on our
 buildbot servers, it might still build incorrectly for some users, though
 hopefully most users receive our binary instead of building from source.

 In any case, we need a general solution in MacPorts base so that we don't
 sprinkle workarounds into thousands of ports, especially not a workaround
 that removes the ability to receive a binary. Running `/usr/sbin/purge` to
 purge the disk cache before creating the archive is one solution that
 people say works. It requires root, but that's not a problem for most
 MacPorts installations. Another workaround that doesn't need root is
 running `/bin/sleep 10`; this seems to give the disk cache enough time to
 sort itself out without a purge. MacPorts base could use one or the other
 of those, depending on whether it's a root installation of MacPorts or
 not. We could probably limit it only to APFS filesystems, and maybe only
 those ports where `supported_archs` is not `noarch`. However, it sounds
 like sparse files are increasingly common on macOS, so even `noarch`
 ports might be creating them. There is an API on macOS 11 and later for
 determining if a file is sparse, so at least on macOS 11 and later we
 could traverse the destroot and only use `purge` or `sleep` if there were
 any sparse files.
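
 A minimal sketch of that idea, in Python purely for illustration (MacPorts
 base itself is written in Tcl and C). It does not use the macOS 11 API
 mentioned above; instead it assumes a common heuristic, treating a regular
 file as sparse when its allocated blocks (`st_blocks`, in 512-byte units)
 cover less than its logical size. Filesystem compression can trigger the
 same condition, so this may over-report. All function names here are
 hypothetical and not part of MacPorts.

```python
import os
import stat

# st_blocks is reported in 512-byte units on macOS and Linux.
BLOCK_UNIT = 512


def looks_sparse(size, blocks):
    """Heuristic (an assumption, not the macOS 11 API): a file looks sparse
    when fewer bytes are allocated on disk than its logical size."""
    return blocks * BLOCK_UNIT < size


def destroot_has_sparse_files(destroot):
    """Walk the destroot and report whether any regular file looks sparse."""
    for dirpath, _dirnames, filenames in os.walk(destroot):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode) and looks_sparse(st.st_size, st.st_blocks):
                return True
    return False


def choose_workaround(is_root, has_sparse):
    """Pick the command to run before creating the archive, or None.

    purge needs root; sleeping for 10 seconds is the non-root fallback.
    """
    if not has_sparse:
        return None
    if is_root:
        return ["/usr/sbin/purge"]
    return ["/bin/sleep", "10"]
```

 A caller would run the returned command, if any, immediately before
 invoking tar, for example with
 `choose_workaround(os.geteuid() == 0, destroot_has_sparse_files(path))`.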

 If we are finding that certain ports are for some reason much more
 susceptible to the problem, we could add the workaround into those ports
 until a new release of MacPorts takes place.

 As you mentioned in your original report, only BSD tar seems to be
 affected by this; GNU tar is not. I haven't tried to read the code of
 either of them, but if there is some code that GNU tar is running that
 avoids the issue, such as perhaps some preprocessing on all the files that
 it will add to the archive that coincidentally causes the disk cache to
 figure itself out, maybe we can use similar code in MacPorts base to avoid
 the need for either `purge` or `sleep`.

-- 
Ticket URL: <https://trac.macports.org/ticket/67336#comment:21>
MacPorts <https://www.macports.org/>
Ports system for macOS

