how to deal with large data files

Mojca Miklavec mojca at macports.org
Thu Mar 28 06:05:00 UTC 2019


On Thu, 28 Mar 2019 at 05:12, Joshua Root wrote:
> On 2019-3-28 10:17 , Renee Otten wrote:
> > I am looking for some advice on how to deal with a port that will only
> > download a large data set (~5GB),
> > see https://github.com/macports/macports-ports/pull/3904. I assume we do
> > not necessarily want to have that stored on the MacPorts distfile
> > mirrors… correct? One way to accomplish this, I think, is to make it
> > non-distributable, even though the license would allow so. Is there a
> > preferred way of doing this or are there no concerns with a port like this?
>
> Does this single data file need to be managed by MacPorts at all? Adding
> a note telling users where to download it from and where to put it might
> be fine.

I have a port [gate] (with a Qt GUI) which requires a port [geant4],
which requires a port [geant4-data], which requires fetching 10+ data
files with a total size of somewhere around 0.6 to 1 GB (I would need to
double-check the exact size). And that package doesn't even build
anything, it just fetches and copies the data (and then ends up as a
huge "package" in ${prefix}/var/macports/software).

Manually fetching those 10+ files to satisfy a deep dependency sounds
super tedious to me ...

> As a port, even if it's not mirrored, it's still going to be taking up
> gigabytes per OS version on the builders and in the private archives.

... but I would really really like to avoid:
- mirroring the sources
- mirroring the resulting package
- keeping the package installed on the buildbot slave once gate and/or
  geant4 have been compiled

No idea how to remove it from the private archive etc., but I would
fully support doing something about it to avoid clogging our build
system.
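
To put rough numbers on the storage concern, here is a minimal Python
sketch. Only the 5 GB and 1 GB sizes come from this thread. The function
name, the count of ten OS versions, and the assumption of two stored
copies per OS version (one installed on the builder, one in the private
archive) are illustrative guesses, not actual figures from the MacPorts
infrastructure.

```python
def archive_footprint_gb(package_gb: float, os_versions: int,
                         copies_per_version: int = 2) -> float:
    """Rough disk usage when every OS version keeps its own copies.

    copies_per_version is a hypothetical knob: 2 stands for one installed
    image on the builder plus one copy in the private archive.
    """
    if package_gb < 0 or os_versions < 0 or copies_per_version < 0:
        raise ValueError("inputs must be non-negative")
    return package_gb * os_versions * copies_per_version


# ASSUMPTION: 10 OS versions is an illustrative number, not the real count.
OS_VERSIONS = 10

for name, size_gb in [("5 GB data set", 5.0),
                      ("geant4-data (upper estimate)", 1.0)]:
    total = archive_footprint_gb(size_gb, OS_VERSIONS)
    print(f"{name}: about {total:.0f} GB across {OS_VERSIONS} OS versions")
```

With these made-up inputs the 5 GB data set comes to roughly 100 GB and
geant4-data to roughly 20 GB, which is why keeping only one of the copies
(or none) per OS version would already make a noticeable difference.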

Mojca


More information about the macports-dev mailing list