Python ports: should we use wheel files?

Mojca Miklavec mojca at macports.org
Sun Apr 1 08:43:53 UTC 2018


On 31 March 2018 at 12:11, Enrico Maria Crisostomo wrote:
> Hi all,
>
> While updating the py-tensorflow port to 1.6.0 I started to think about this question: we briefly discussed it on the corresponding PR (https://github.com/macports/macports-ports/pull/1499) and decided to bring it up on the mailing list.
>
> Safe harbour statement: I'm a Python noob.
>
> The definition of a wheel file is the following:
>
>> Wheel (in this context) is a project that adds the bdist_wheel command to distutils/setuptools. This produces a cross platform binary packaging format (called “wheels” or “wheel files” and defined in PEP 427) that allows Python libraries, even those including binary extensions, to be installed on a system without needing to be built locally.
>
> In the case in point: one TensorFlow dependency has a native component that needs to be built locally.
>
> Now, the problem.  macOS is a platform where TensorFlow is _built_ and _tested_ by upstream.

Citing upstream:

> Although these instructions might also work on other macOS variants, we have only tested
> (and we only support) these instructions on machines meeting the following requirements:
>
> macOS X 10.11 (El Capitan) or higher

And https://pypi.python.org/pypi/tensorflow contains files like

    tensorflow-1.7.0-cp27-cp27m-macosx_10_11_x86_64.whl

Just guessing what 10_11 means ...
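For what it's worth, the file name follows the PEP 427 convention of name, version, optional build tag, Python tag, ABI tag and platform tag, so the guess can be checked mechanically. Below is a minimal sketch; the helper names parse_wheel_filename and macos_minimum are made up for illustration and are not part of pip or MacPorts, and it assumes the project name itself contains no "-" (wheel file names escape those as "_").

```python
import re


def parse_wheel_filename(filename):
    """Split a wheel file name into its PEP 427 components (hypothetical helper)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: {!r}".format(filename))
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: {!r}".format(filename))
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def macos_minimum(platform_tag):
    """Return ((major, minor), arch) for a macosx_* platform tag, else None."""
    match = re.match(r"macosx_(\d+)_(\d+)_(.+)$", platform_tag)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2))), match.group(3)


info = parse_wheel_filename("tensorflow-1.7.0-cp27-cp27m-macosx_10_11_x86_64.whl")
print(info["python_tag"], info["abi_tag"], info["platform_tag"])
print(macos_minimum(info["platform_tag"]))
```

So the 10_11 part is the minimum macOS version the wheel declares itself built for, and x86_64 is the architecture.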

At the extreme end of the spectrum, some of our maintainers still try
to support (and test on) 10.4, but generally nearly all software still
works pretty well on 10.6. If we switch to binaries, we lock out all
users of OS versions older than whatever system upstream deems worth
supporting (or, usually, has access to).
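To make the lock-out concrete, here is a rough sketch that compares an OS version against the minimum encoded in a wheel's platform tag. The function names are hypothetical, and this is a simplification: pip's real compatibility check also looks at the architecture and at the Python and ABI tags.

```python
import re


def minimum_macos(platform_tag):
    """Extract (major, minor) from a platform tag such as macosx_10_11_x86_64."""
    match = re.match(r"macosx_(\d+)_(\d+)_", platform_tag)
    if match is None:
        raise ValueError("not a macOS platform tag: {!r}".format(platform_tag))
    return (int(match.group(1)), int(match.group(2)))


def can_use_wheel(platform_tag, os_version):
    """True if os_version (a (major, minor) tuple) meets the wheel's minimum."""
    return tuple(os_version) >= minimum_macos(platform_tag)


# Platform tag taken from the TensorFlow wheel file name on PyPI.
TAG = "macosx_10_11_x86_64"

for version in [(10, 4), (10, 6), (10, 11), (10, 13)]:
    status = "usable" if can_use_wheel(TAG, version) else "locked out"
    print("{}.{}: {}".format(version[0], version[1], status))
```

With that tag, everything below 10.11 is locked out, including the 10.4 and 10.6 systems mentioned above.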

An additional question is: can you fully trust the binary code of all
the hundreds of Python packages? (It's also true that you should not
always trust the sources until you have checked them, but binaries are
easier to infect without anyone noticing.)

> But I'm wondering, for the sake of argument, whether a parallel can be drawn between
> (Python, wheel file, PyPI) and (Java, *AR files, Maven Central).

No.

> After all we already ship Java byte-code without recompiling it, and sometimes Java projects
> ship precompiled native libraries.

The difference is that a Java binary works *everywhere*. You can put
it on Windows, Linux, Mac, Solaris (I guess), ... and the same code
will just work thanks to an additional layer that executes those
instructions. An additional problem is that Java on Mac is a bit of a
can of worms, and compiling it properly is currently far less
trivial than compiling a C[++] project.

The same is not true for the C code in Python packages. It compiles
to different machine code depending on compiler / OS / ... and if you
get a wheel file, there's no guarantee that you'll be able to execute
it.

I also find it pretty strange that those binaries are not
provided/compiled by, say, PyPI, but need to be manually uploaded by
developers. I would have more confidence in the binary files if they
were all built on a central server from source.

> But in the last few months I couldn't help but observe that the general guideline given by most Python projects I'm using is
> (a) installing Python packages (using pip and/or wheel files) or
> (b) even building a virtual env (again, using a package-driven process).

Outside of the package manager this is of course still a very suitable tool.

Mojca

