[mpbb] branch master updated: Add smarter distfile mirroring script

Ryan Schmidt ryandesign at macports.org
Sun Mar 11 04:09:08 UTC 2018


On Mar 10, 2018, at 07:48, Joshua Root wrote:

> On 2018-3-10 16:12 , Ryan Schmidt wrote:
>> 
>> Well I did cancel it... Looking at the log, the list of ports isn't being deduplicated. Mirroring ncurses was attempted 506 times, gperf was attempted 687 times, etc.
>> 
>> https://build.macports.org/builders/jobs-mirror/builds/1212/steps/mirror/logs/stdio
>> 
>> The next run looked better, with 21 more ports being marked completed in the mirrorcache within the first 3 minutes. But then nothing further for 20 minutes. I cancelled that too so I could look at the log:
>> 
>> https://build.macports.org/builders/jobs-mirror/builds/1213/steps/mirror/logs/stdio
>> 
>> It's still mirroring the same ports multiple times, even ports like libiconv that are marked in the mirrorcache.
>> 
>> I'm going to cancel the remaining mirroring jobs for this commit. We'll be able to debug this better looking at logs from commits of fewer ports.
> 
> That should be fixed. It was again only a problem in the case of
> failures, which are apparently going to be common. Stuff like this is a
> real problem for figuring out if a port should be considered
> successfully mirrored or not:
> 
> --->  Fetching distfiles for clang-3.4
> Error: clang-3.4 is not supported on macOS Sierra or newer.
> Error: Failed to fetch clang-3.4: unsupported platform
> 
> --->  Fetching distfiles for xattr
> Error: xattr cannot be installed for the configured universal_archs
> 'x86_64 i386' because it only supports the arch(s) 'i386 ppc'.
> 
> - Josh

There's still a performance problem. Here's a completed run, triggered by Takeshi's update to wgrib2, that took 2 hours.

https://build.macports.org/builders/jobs-mirror/builds/1229/steps/mirror/logs/stdio

The next run, for an update to esmf, took just as long; it didn't remember that it had already tried many of the same ports last time:

https://build.macports.org/builders/jobs-mirror/builds/1230/steps/mirror/logs/stdio
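
One way a run could remember earlier attempts, failures included, is to key a small on-disk cache on something that changes whenever the port changes. The following is only a rough sketch of that idea, not what the mpbb script does: the JSON file layout, the function names and the choice of a Portfile hash as the key are all assumptions made for illustration.

```python
import hashlib
import json
import os


def portfile_digest(portfile_text):
    """Hash the Portfile contents so an updated port is retried (assumed key)."""
    return hashlib.sha256(portfile_text.encode("utf-8")).hexdigest()


def load_cache(path):
    """Read the attempt cache, or start empty if it does not exist yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def needs_attempt(cache, port, digest):
    """True if this exact version of the port has not been tried before."""
    entry = cache.get(port)
    return entry is None or entry["digest"] != digest


def record_attempt(cache, port, digest, ok):
    """Remember the outcome, whether it succeeded or failed."""
    cache[port] = {"digest": digest, "ok": ok}


def save_cache(path, cache):
    """Write to a temporary file, then rename, so a cancelled run
    cannot leave a half-written cache behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(cache, f, sort_keys=True)
    os.replace(tmp, path)
```

With something like this, a port that failed for a permanent reason (unsupported platform, unsupported archs) would be skipped until its Portfile changes. Transient failures such as network errors would probably need an expiry time or a separate status so that they are retried.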

-----

Another problem: it tries to fetch ports that don't use fetch.type standard. That causes the fetch dependencies to be installed, which can fail:

netpbm with platform 'darwin 8 powerpc'
--->  Cleaning netpbm
--->  Computing dependencies for netpbm
--->  Dependencies to be installed: subversion apr-util db46 libtool expat sqlite3 libedit curl-ca-bundle perl5 perl5.26 gdbm readline cyrus-sasl2 kerberos5 libcomerr pkgconfig openssl zlib python27 bzip2 db48 libffi python2_select python_select libmagic serf1 scons
--->  Extracting libtool
--->  Applying patches to libtool
--->  Configuring libtool
--->  Building libtool
--->  Staging libtool into destroot

Error: Failed to destroot libtool: command execution failed
Error: See /opt/local/var/buildmaster/build/prefix/var/macports/logs/_opt_local_var_buildworker_jobs_jobs-mirror_build_ports_devel_libtool/libtool/main.log for details.

-----

The only reason you're running fetch is that you need to run checksum (which runs fetch), because mirror does not indicate when checksum verification has failed. That sounds like a bug to me; I filed https://trac.macports.org/ticket/56008

-----

There's a nasty problem where files whose checksums have not been verified still end up on the mirror server. A patchfile's name was specified with the wrong case in netcdf. (I've since fixed that.) Because the server is on a case-sensitive filesystem, the script found the patchfile missing when mirroring and tried to download it from master_sites. Because the port uses the github portgroup with its default download location, the download of "patch-liblib-CMakeLIsts.txt.diff" "succeeded" (it actually received a file with the same contents as netcdf-c-4.4.1.1.tar.gz).

--->  Verifying checksums for netcdf
netcdf +hdf4
--->  Cleaning netcdf
--->  Fetching distfiles for netcdf

Error: No checksum set for patch-liblib-CMakeLIsts.txt.diff
Error: No checksum set for netcdf-c-4.4.1.1.tar.gz

--->  Fetching distfiles for netcdf

Error: Internal error: 'hdf5' is not an installed port.

--->  Attempting to fetch patch-liblib-CMakeLIsts.txt.diff from https://distfiles.macports.org/netcdf
--->  Attempting to fetch patch-liblib-CMakeLIsts.txt.diff from https://distfiles.macports.org/netcdf
--->  Attempting to fetch patch-liblib-CMakeLIsts.txt.diff from https://github.com/Unidata/netcdf-c/tarball/v4.4.1.1
--->  Attempting to fetch netcdf-c-4.4.1.1.tar.gz from https://distfiles.macports.org/netcdf
--->  Attempting to fetch netcdf-c-4.4.1.1.tar.gz from https://distfiles.macports.org/netcdf
--->  Attempting to fetch netcdf-c-4.4.1.1.tar.gz from https://github.com/Unidata/netcdf-c/tarball/v4.4.1.1
--->  Verifying checksums for netcdf

Error: No checksum set for patch-liblib-CMakeLIsts.txt.diff
Error: No checksum set for netcdf-c-4.4.1.1.tar.gz
Error: No checksum set for patch-liblib-CMakeLIsts.txt.diff
Error: No checksum set for netcdf-c-4.4.1.1.tar.gz
Error: Failed to checksum netcdf: Unable to verify file checksums

At this point, the files netcdf-c-4.4.1.1.tar.gz and patch-liblib-CMakeLIsts.txt.diff were on the distfiles server, even though their checksums had obviously not been verified, since it didn't even think any were specified. (I have since removed the file patch-liblib-CMakeLIsts.txt.diff from the server.)

I don't know why it says checksums are not set for netcdf-c-4.4.1.1.tar.gz. The port does set checksums. Because the port has only one distfile, it does not give the name of the distfile in the checksums line. Maybe that's what this error indicates.
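
To keep unverified files off the server, a download could land in a staging directory first and be moved into the mirror tree only after its checksum matches. Below is a minimal sketch of that idea, assuming SHA-256 checksums and a staging directory on the same filesystem as the mirror; the function names and the ChecksumError class are hypothetical and this is not how "port mirror" works internally.

```python
import hashlib
import os


class ChecksumError(Exception):
    """Raised when a staged file must not be published (hypothetical)."""


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large distfiles need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def publish_if_verified(staged_path, mirror_dir, name, expected_sha256):
    """Move a staged download into the mirror only if its checksum matches.

    A missing checksum is treated the same as a mismatch: the staged
    file is deleted and never reaches the mirror directory.
    """
    if not expected_sha256:
        os.remove(staged_path)
        raise ChecksumError("no checksum set for " + name)
    if sha256_of(staged_path) != expected_sha256:
        os.remove(staged_path)
        raise ChecksumError("checksum mismatch for " + name)
    os.makedirs(mirror_dir, exist_ok=True)
    final_path = os.path.join(mirror_dir, name)
    # os.replace is atomic only when both paths are on one filesystem.
    os.replace(staged_path, final_path)
    return final_path
```

Under this scheme, a file fetched under the wrong name with no checksum entry, like the patchfile above, would be discarded from staging instead of being served from the mirror.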

-----

If a port fails to fetch, the fetch is attempted again many times (once for each variant / os.major+os.arch being processed).

-----

I think the pair of mirroring scripts I wrote 2 years ago (which I haven't yet published or put into service) don't have these problems. They take a different approach: the first script asks MacPorts for each distfile's disk location, download URLs and checksums (the output of "port distfiles"). Then, for each distfile that is not on disk, it pipes that information to a second script, which downloads and checksums the file using new code that does not use "port fetch", "port mirror" or "port checksum". It tries only once to get any particular file.
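
The shape of that two-stage approach can be illustrated roughly as follows. This is not the scripts themselves, only a sketch under assumptions: the Distfile record, the plan_downloads and mirror_all names and the fetch_and_verify callback are all hypothetical, and parsing the "port distfiles" output into records is left out entirely.

```python
import os
from collections import namedtuple

# One record per distfile, as the first stage might hand it to the second.
# The field names are assumptions, not the format "port distfiles" prints.
Distfile = namedtuple("Distfile", ["path", "urls", "sha256"])


def plan_downloads(records, exists=os.path.exists):
    """First stage: keep each distfile once and drop those already on disk.

    The same distfile usually shows up for many variants and platforms,
    so records are deduplicated by their on-disk path.
    """
    seen = set()
    plan = []
    for rec in records:
        if rec.path in seen:
            continue
        seen.add(rec.path)
        if not exists(rec.path):
            plan.append(rec)
    return plan


def mirror_all(plan, fetch_and_verify):
    """Second stage: try every planned file exactly once.

    fetch_and_verify is a caller-supplied function (hypothetical here)
    that downloads one record, checks its checksum and returns True or False.
    """
    results = {}
    for rec in plan:
        results[rec.path] = fetch_and_verify(rec)
    return results
```

Because the plan is built per distfile rather than per port, variant or platform, a file that fails to download is attempted once per run no matter how many ports or variants refer to it.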

I'll commit my scripts to a fork of mpbb so you can see what they do. Maybe this will give us ideas to incorporate into your script, or maybe using my script directly would work.



