The static library discussion

Marin Saric marin.saric at gmail.com
Thu Dec 22 07:38:12 PST 2011


Hi Jeremy,

Just a general remark before answering some of the interesting
subpoints you make. A case for supporting static library deployment in
a Unix ports project does not have to collide with supporting dynamic
libraries. Your arguments read as if you are imagining everyone
linking against only static libraries. I can definitely imagine the
mess if that were to happen, which is what dynamic libraries were
designed to avoid in the first place.

But I believe there are cases where static linking is very useful.
What remains open is whether MacPorts wants to support static
deployments or not; hopefully some more discussion in this thread will
resolve that.

If MacPorts decides it doesn't support static libraries, they could be
disabled for a large number of ports, which would cut down the compile
times for all the ports that currently build both.

I would also like to better understand the negative impact of
supporting static libraries.

So far I have this:
 - If MacPorts allows a port A to link against a static library B and
library B later gets updated, the ports dependent on B won't be
updated, and it's complicated to determine whether they're outdated or
not. Any outdated ports will all need to be recompiled, which at some
point becomes unmanageable.
 - Static libraries don't advertise versions. If a user reports a
crash in B, we will not know which version of B caused the crash. This
is something MacPorts could easily fix. I believe that having
MacPorts create a small log file of the port versions used to produce
a target would be useful for static libraries, dynamic libraries and,
of course, binaries. That way, MacPorts-specific dependency versioning
information, such as revision numbers, could be preserved for dynamic
libraries as well.

On Thu, Dec 22, 2011 at 7:00 AM, Jeremy Huddleston
<jeremyhu at macports.org> wrote:
> On Dec 21, 2011, at 16:33, Marin Saric wrote:
>> Its source-level API is consistent, but depending on
>> which version of x264 you download (new "snapshots" appear on
>> something like a daily basis), your program is going to link against different
>> symbols. In the newest version of x264 the ABI broke again.
>
> Ok, so x264's development model is broken upstream.  Go yell at them to do better.  This is not a "MacPorts" issue.  It's an x264 issue, and that impacts all distributions, not just us.

I won't go and yell at anyone about this, not my style ;)
I agree that the above behavior definitely shouldn't be encouraged, I
just didn't know MacPorts was in the business of enforcing an
engineering style/philosophy on other open-source projects, but this
thread will hopefully clarify this.

>> Assume x264 is dynamic. Now assume a set of dynamic libraries H export
>> symbols against an older ABI version of x264. Binaries depending on H
>> and binaries dynamically linking against libraries depending on H all
>> break once x264 is replaced with a new version.
>
> AIUI, they do at least bump the dylib version, right?  So libx264.10.dylib becomes libx264.11.dylib when there is ABI incompatibility.  That, while sub-ideal, is still tolerable.  We dealt with this on a larger scale with libpng going from 1.2 to 1.4 and then again to 1.5.  All dependent

OK, good to know this has shown up before. This is ticket 28029, right?
What I see is that it was dealt with the same way as is being suggested
for x264 right now: by bumping the version number and recompiling.

Did anyone go and yell at the libpng developers for this? :-)
Or was it OK because the version bump was from 1.2 to 1.4?
Does anyone know any explicit standards for ABI compatibility that are
based on version numbers?
I thought that generally it's up to the library developer to come up
with their own policy of when to break the ABI and when not to.
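
For what it's worth, on OS X a dylib does advertise its ABI
expectations through its compatibility/current version fields, which
can be inspected with otool. A small sketch (the library path is
hypothetical; otool only exists where the developer tools are
installed, so the snippet guards for that):

```shell
# Inspect the version numbers a dylib advertises (hypothetical library
# path; otool ships with the developer tools, so guard for machines
# without it).
LIB=/opt/local/lib/libpng.dylib
if command -v otool >/dev/null 2>&1 && [ -f "$LIB" ]; then
    otool -L "$LIB"   # compatibility and current version per linked dylib
    otool -D "$LIB"   # the install name recorded into dependents
else
    echo "otool or $LIB unavailable, skipping"
fi
```

The linker refuses to load a dependent whose required compatibility
version is newer than what the installed dylib provides, which is the
mechanism behind the "bump the library version" convention.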

>> The old symbols are
>> not found or, even worse, they are found but their semantics or
>> interfaces have changed in a subtle way causing malfunction.
>
> That SHOULDN'T ever happen. If they change their ABI in this way, they need to bump the library version.  We should not do something like ship static libraries to work around their bad development practices.  If they ever do this, we should not ship their code until they fix it.

So there are two aspects to this:
- Am I suggesting MacPorts should ship a static library as a workaround?
No, I think MacPorts should ship both and let other open-source
developers choose which one to use. Because of this issue, I'd always
link statically against x264 in my project, but others can choose at
will.

 The x264 developers offer a static library along with the dynamic
library. If I, as an open-source developer who knows the style of
these guys, want to stop worrying about the version changes, I just
link my library or binary against the static version of their library,
which they also provide. Of course I can't do that if MacPorts does
not want to support static deployments.
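
Concretely, the choice is just a matter of what you hand the linker. A
hedged sketch, assuming a MacPorts install under the default prefix
(the source file name is made up, and the compile lines are commented
out since they need the actual files present):

```shell
PREFIX=/opt/local    # default MacPorts prefix
SRC=app.c            # hypothetical source file

# Dynamic: -lx264 picks up libx264.dylib; symbols resolve at load time,
# so a later ABI break in a newer libx264 can break this binary.
# cc "$SRC" -I"$PREFIX/include" -L"$PREFIX/lib" -lx264 -o app-dynamic

# Static: name the archive explicitly; the x264 objects are copied into
# the binary, so later updates to the port cannot change its behavior.
# cc "$SRC" -I"$PREFIX/include" "$PREFIX/lib/libx264.a" -o app-static
```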

>> Hence the
>> "bump_dev_revs.sh" script, forcing a recompile of everything that
>> depends on x264. If MacPorts were a binary-only distribution, we'd be
>> in trouble.
> No, if we were a binary-only distribution, we'd be set because we could have multiple instances of libx264 installed at the same time (one for each needed ABI version) ... just like every other binary distribution out there ... even source distros like Gentoo use SLOTs to deal with this.

That was my point: it takes more engineering in MacPorts to support
multiple coexisting library versions. Clearly this model would have a
problem in a binary-only distribution. We appear to agree on this.

> This issue does not impact just us.  Any distribution is impacted by this and if they really ship two releases where the library versions match but the ABI don't, that's a huge bug on their end.  If they don't see it as such, they need to be educated.

It's the opposite: the x264 library version never really matches. Each
snapshot is essentially marked by its date, and there are a ton of
them on their website. I think internally they have some sort of
revision number for the ABI that they keep bumping, but I am not
familiar enough with their strategy.

This is exactly what I wanted more feedback on from this thread:
whether MacPorts is in the business of educating others on how to do
things "the-right-way(TM)", or provides multiple ways to just get a
port done quickly, provide new versions to users and call it a day.

>> One "bad apple" can wreak havoc on the whole system. Note that this is
>> different than the symbol-collision problem fixed by two-level
>> namespaces.
>
> Yeah, so fix the bad apple.  Don't wreck the system to lessen the impact of the bad apple.

I want to understand more about why giving a developer the chance to
link statically against your library wrecks the system. I am not
arguing for switching everything to static; that would not work.

But will MacPorts allow other projects to link statically?
It's really convenient to rely on MacPorts as a build system.

>> Example 2: "Feature freeze"
>> If you want your binary to just depend on a specific version of a
>> library that you have validated against instead of the newest latest
>> and greatest, you will link against a static version of it.
>
> Or if you're trying to support a distribution and someone reports a bug occurring in some media player which is crashing in its x264 code, and that was provided by a static library, there's no way to know what version of the static library is present.

That's definitely a valid point.
However, MacPorts could keep a log at compile time of the versions of
all of a port's dependencies and store it with the port. I don't think
this would be very difficult to script at all. Even dynamic libraries
won't tell you the MacPorts revision number, and some dynamic
libraries won't report any version at all, for example tbb and eigen.

I think this would be a useful debugging/development feature in general.
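
A minimal sketch of the kind of build-time log I have in mind (the
port name and log location are made up, the log format is up for
discussion, and `port` obviously needs a MacPorts machine, so the
snippet guards for it):

```shell
PORT=ffmpeg                     # port being built (example)
LOG="./${PORT}.depversions"     # hypothetical manifest location

# Record the exact installed version of every dependency at build time,
# so a later crash report can be matched against the libraries used.
if command -v port >/dev/null 2>&1; then
    for dep in $(port -q echo depof:"$PORT"); do
        port -q installed "$dep" >> "$LOG"
    done
fi
```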

> Another way of looking at that is that if x264 is bumped multiple times, the binary package of VLC or mplayer or ffmpeg won't be consistent across those releases.  We'll need to bump the revision of those ports to rebuild against the newer x264 to pick up the bug fix, whereas they would "just work" (modulo ABI changes) if you just used dynamic libraries.

I think this is a legitimate issue and it can quickly become a
maintenance headache.

>>  On MacOS X
>> you can put stuff in your own framework or have it live in an
>> application bundle. Outside the world of MacOS X, linking against a
>> static library is a way to do it.
>
> Uhm... what?

Does the "uhm... what" mean that I wrote something unclear, or that
you are expressing disagreement with something I wrote?

> "Outside the world of Mac OS X" this is exactly the same issue.  A Framework is just a dylib and a bunch of headers.  Strip away the packaging, and it's exactly the same thing.

I was referring to the fairly common practice on MacOS X of deploying
your own versions of dependent libraries inside an application bundle,
as opposed to expecting to pick them up from a system directory.
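
For context, that practice usually amounts to copying the dylib into
the bundle and rewriting the load path. A hedged sketch with made-up
bundle and library names, guarded so it only does anything on a Mac
with the developer tools:

```shell
APP=MyApp.app                 # hypothetical application bundle
DYLIB=libfreetype.6.dylib     # hypothetical bundled dependency
FWDIR="$APP/Contents/Frameworks"

# Only meaningful on OS X with the developer tools and bundle present.
if command -v install_name_tool >/dev/null 2>&1 && [ -d "$APP" ]; then
    mkdir -p "$FWDIR"
    cp "/opt/local/lib/$DYLIB" "$FWDIR/"
    # Rewrite the load path so the executable uses the bundled copy
    # instead of whatever happens to live under /opt/local.
    install_name_tool -change "/opt/local/lib/$DYLIB" \
        "@executable_path/../Frameworks/$DYLIB" \
        "$APP/Contents/MacOS/MyApp"
fi
```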

>> In a perfect world, you'd trust the library developer not to break the
>> desired behavior by adhering to a consistent versioning. You might
>> also expect the newer versions of the library never to introduce a
>> regression.
>
> If there's a regression, report it and fix it.  This is no different than anywhere else.

This has more to do with the impact of a regression on
already-deployed libraries and programs that worked so far. It's
always nice to report bugs and/or fix them.

>> Or you might just say, OK, I am happy with the
>> functionality provided by the current version of the library and I am
>> OK with statically linking against it.
>
> If you're happy with that version, then stick with that version by providing a local overlay of that port which is stuck at that version.  Using a static library just makes it exceedingly difficult to manage, leads to wasted memory due to duplication of this library across processes in a way that can't be shared, wasted disk space due to additional duplication, and is generally BAD software engineering.

What exactly do you mean by a local overlay? Providing the dependent
library inside the source package of the application/library that
depends on it? What if that library depends on other libraries? Do I
have to provide source for them too?
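
In case "local overlay" means a local Portfile repository pinned at
one version, my understanding of the usual setup is roughly this
(paths are the MacPorts defaults, the Portfile contents left out):

```shell
LOCAL_TREE="$HOME/ports"    # hypothetical local Portfile repository

# A local tree holding a Portfile pinned at the validated version.
mkdir -p "$LOCAL_TREE/multimedia/x264"
# ... copy the Portfile for the version you validated into that directory ...
if command -v portindex >/dev/null 2>&1; then
    ( cd "$LOCAL_TREE" && portindex )   # build the index for the local tree
fi
# Then list "file://$HOME/ports" before the rsync URL in
# /opt/local/etc/macports/sources.conf so the local tree takes precedence.
```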

I am not convinced that deploying static binaries is bad software
engineering, though it might be bad for a source-level distribution.
There are worse sins than spending a few hundred kilobytes or even
megabytes of memory, when gigabytes are available, to speed up
application load time and reduce the number of dependencies on other
people's code.

>> Example 3: Building plugins - avoiding versioning conflict.
>> This is a test case I am dealing with right now.  I have binaries that
>> open my plugin through the dlopen API. These binaries are outside of
>> MacPorts and they depend on their own private versions of freetype and
>> some other libraries. Unfortunately they work in a flat namespace
>> model and thus things break.
>
> Why?  Fix their misuse and reliance on the flat namespace.

They're not open-source.

Even if they were, sometimes making these fixes is much more tedious
than providing a static build.

>> Static library is a simple fix.
>
> I think you mean "messy workaround" rather than "simple fix"

I don't see it as a workaround but as a feature freeze. Often, in a
deployment, I do not want to theorize about every single one of the
billion libraries my big application depends on; I just say: OK, this
is what I have, this is what works, and this is how it will stay until
I release a new binary. This way I don't make my users volunteer to
test all possible interactions of all possible dependent libraries.

However, this might not work for a source-level distribution such as MacPorts.

>> I am not
>> dependent on the application developer to "fix" their program or to
>> resolve the conflict in any way.
>
> Report the bug to them and get it fixed anyway.

The question was not whether to report a bug or not, but whether the
actions of the developer of the dependent library need to impact
everyone else as much as they do. I think a large majority of
engineers will readily agree that decoupling is a good thing in large
systems.

>> My plugin is a dylib (or on Linux a
>> .so), but it links statically against all the dependencies, so loads
>> and runs without a problem in the application that uses different
>> versions of dependent libraries.
>
> If you didn't flatten your namespace, this wouldn't be an issue.  THAT is the real bug.

I usually target MacOS and Linux, sometimes also Windows.
I thought only MacOS provides two-level namespaces.
Am I wrong about this? I believe you agreed that this should be a
non-MacOS-specific issue.
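
To make the scenario concrete, the plugin build I mean is roughly the
following (file names are made up; only the linking strategy matters,
so the compile line is commented out):

```shell
PREFIX=/opt/local    # MacPorts prefix; file names below are made up

# Build the plugin as a dylib while folding freetype in statically, so
# the host application's own (different) freetype cannot collide with it.
# Commented out because it needs a Mac with these files present:
# cc -dynamiclib plugin.c \
#     -I"$PREFIX/include/freetype2" \
#     "$PREFIX/lib/libfreetype.a" \
#     -o myplugin.dylib
```

On OS X the linker's default two-level namespace already keeps the
host's freetype symbols separate from the plugin's; ELF on Linux is
flat by default, which is why static linking (or explicit symbol
visibility) is the common escape there.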

>> Example 4: Building libraries and plugins - deployment
>> You can use the MacPorts provided libraries, yet provide a dynamic
>> library or a plugin that does not have extra dependencies. You will
>> find examples already of software inside MacPorts building "private"
>> versions of library for the very same reasons.
>
> What do you mean by this?

I can provide a single dylib (or .so in Linux) as a plugin file. No
other extra libraries/files to deploy on the system.

>> Example 4.2: "Convenience" libraries
>> This is a well-known subcase of Example 4. Static "tool" libraries
>> that are linked into dynamic libraries. This is pretty common across
>> open-source projects.
>> http://www.freesoftwaremagazine.com/articles/building_shared_libraries_once_using_autotools
>
> That article seems to completely argue against your point.

The article is not arguing about whether to use static or dynamic
libraries; that's a false dichotomy anyway. It just gives some
examples of when convenience libraries are useful.

>> Example 5: Binary deployment.
>> Statically built binaries with MacPorts become trivially deployable
>> outside of MacPorts. No extra libraries to bundle with or ask the user
>> to compile, etc., no need for an installer, etc.
>
> I don't consider that something that we care about.

OK, I just want to make sure this is the general sentiment.

>> Example 6: Load time
>> Statically built binaries load very quickly. This comes at a price of
>> an increased cost in RAM. This used to be an issue when machines came
>> with an order of magnitude less RAM than they do today. The incurred
>> cost is small in absolute terms nowadays (end of 2011)
>
> 1) The common use case will have load times reduced by using dylibs instead because the dylib can be shared across different processes.  If you have ffmpeg and mencoder both running, they can share libx264.dylib, but they can't share libx264.a.  Thus, only one copy of libx264 needed to be loaded.  This saves time and memory pressure.

Maybe I am not up to date, but how does that result in a faster load
time? Doesn't dyld have a ton more work to do resolving all the
symbols than with a statically linked target, which exports only a
few? With devices reading at hundreds of megabytes per second (SSDs)
or tens of megabytes per second (platter hard drives), the time to
actually read from disk is negligible in the whole process; it's the
symbol lookup that takes time.
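
This is measurable rather than theoretical: dyld itself can report
where launch time goes. The environment variable is a real dyld
feature; the binary name is made up, and the sketch is guarded so it
does nothing off OS X:

```shell
# Ask dyld to print a launch-time breakdown (rebasing, binding, running
# initializers) when the process exits. OS X only; "./myapp" is made up.
if [ "$(uname)" = "Darwin" ] && [ -x ./myapp ]; then
    DYLD_PRINT_STATISTICS=1 ./myapp
fi
```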

> 2) Not every device is a powerhouse workstation connected to the grid.  Even if you have excessive amounts of memory available, that extra disk/memory I/O will cost you power.  In the case where you don't have excessive amounts of memory, well... all arguments points to reducing your memory footprint, and dylibs are the clear winner.

The biggest energy waster is the CPU, which has to work a lot harder
to do all the symbol resolution for a dynamically linked target. I
don't think the extra memory I/O will meaningfully impact power
consumption, and the extra disk I/O is negligible, especially in terms
of power.

On which platforms is this a concern? How much wasted memory and disk
are we talking about? Is MacPorts targeting embedded platforms?

Are there plans at Apple to run MacPorts on an iPhone or iPad? :-)

>> --
> Yes, I agree that they should not need to special case Mac OS X deployment.  That has nothing to do with using static vs dynamic libraries.  That is a consideration that exists on pretty much every platform and is an argument that has nothing to do with Mac OS X.

OK I am really glad we're on the same page here.


More information about the macports-dev mailing list