thread_local storage on 10.6.8 (and earlier) with clang-7.0

Ken Cunningham ken.cunningham.webuse at
Fri Dec 7 19:29:43 UTC 2018

As detailed in the previous post that I sent a few weeks ago (quoted below), I have clang/llvm thread_local support working nicely on 10.6.8 and earlier in my local builds.  The lack of thread_local support on these systems is becoming a common issue now.

On 10.6.8 and earlier, thread_local is implemented using the existing llvm infrastructure that is already in place for other operating systems that need emulated TLS. I enabled it and forced it to be used for macOS 10.6 and earlier. This is exactly the same emulated TLS (emutls) mechanism that gcc uses.
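
For anyone who has not looked at emulated TLS before, here is a rough sketch of the general idea: one pthread key holds a per-thread table, and each thread_local variable gets a slot in that table that is allocated and initialized on first use. This is a simplified illustration and not the compiler-rt or libgcc source. Every name starting with sketch_ or Sketch is hypothetical; the "cf." comments only point at the similarly named symbols in the nm listings in this post.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <pthread.h>
#include <vector>

// Hypothetical control block: one per emulated thread_local variable.
struct SketchControl {
    std::size_t size;     // bytes of storage the variable needs
    const void *initial;  // initial image to copy, or nullptr for zero-fill
    std::size_t index;    // 1-based slot number, 0 until first use
};

static pthread_key_t   sketch_key;                                // cf. _emutls_pthread_key
static pthread_once_t  sketch_once  = PTHREAD_ONCE_INIT;          // cf. _emutls_init_once.once
static pthread_mutex_t sketch_mutex = PTHREAD_MUTEX_INITIALIZER;  // cf. _emutls_mutex
static std::size_t     sketch_num_objects = 0;                    // cf. _emutls_num_object

using SlotTable = std::vector<void *>;

// Runs when a thread exits: frees that thread's copies (cf. _emutls_key_destructor).
static void sketch_key_destructor(void *p) {
    SlotTable *table = static_cast<SlotTable *>(p);
    for (void *slot : *table) std::free(slot);
    delete table;
}

// Creates the single shared key exactly once (cf. _emutls_init).
static void sketch_init() { pthread_key_create(&sketch_key, sketch_key_destructor); }

// Returns the calling thread's private copy of the variable described by c.
void *sketch_get_address(SketchControl *c) {
    pthread_once(&sketch_once, sketch_init);

    // Assign a slot number the first time any thread touches this variable.
    // (Locking on every call keeps the sketch short; it is not optimized.)
    pthread_mutex_lock(&sketch_mutex);
    if (c->index == 0) c->index = ++sketch_num_objects;
    std::size_t index = c->index;
    pthread_mutex_unlock(&sketch_mutex);

    // Fetch or create this thread's table, then grow it to cover the slot.
    SlotTable *table = static_cast<SlotTable *>(pthread_getspecific(sketch_key));
    if (!table) {
        table = new SlotTable();
        pthread_setspecific(sketch_key, table);
    }
    if (table->size() < index) table->resize(index, nullptr);

    // Allocate and initialize this thread's copy on first access.
    void *&slot = (*table)[index - 1];
    if (!slot) {
        slot = std::malloc(c->size);
        if (!slot) std::abort();
        if (c->initial) std::memcpy(slot, c->initial, c->size);
        else std::memset(slot, 0, c->size);
    }
    return slot;
}

// Demo variable: behaves like "thread_local int x = 42;".
static const int sketch_initial_value = 42;
static SketchControl sketch_var = {sizeof(int), &sketch_initial_value, 0};

static void *sketch_worker(void *out) {
    int *mine = static_cast<int *>(sketch_get_address(&sketch_var));
    *mine += 1;  // only this thread's copy changes
    *static_cast<int *>(out) = *mine;
    return nullptr;
}

// Starts a fresh thread and reports the value it sees after one increment.
int sketch_value_seen_by_new_thread() {
    int seen = 0;
    pthread_t t;
    pthread_create(&t, nullptr, sketch_worker, &seen);
    pthread_join(t, nullptr);
    return seen;
}
```

The point of the sketch is that every access to an emulated thread_local variable turns into a function call, which is why whatever object provides that function has to be reachable by every image that uses thread_local.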

It works perfectly well when linked against macports-libstdc++ (currently the default) as the supporting objects are built into libgcc. So from that point of view it could roll out right now, for stock users who are not using libc++.
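
As an illustration of the kind of code this enables, here is a minimal sketch of a thread_local object with a non-trivial constructor and destructor. It is not my actual test program; the type and function names are made up for the example, and std::thread is used only for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Global tallies so construction and destruction can be observed.
static std::atomic<int> g_constructed{0};
static std::atomic<int> g_destroyed{0};

// Hypothetical type with a non-trivial constructor and destructor.
struct Counter {
    int value = 0;
    Counter() { ++g_constructed; }
    ~Counter() { ++g_destroyed; }
};

// Each thread that touches this gets its own lazily constructed instance.
static thread_local Counter tls_counter;

// Starts `threads` workers; each bumps its own copy `bumps` times.
// Returns true if every worker saw exactly `bumps` in its copy.
bool each_thread_has_own_copy(int threads, int bumps) {
    std::atomic<int> ok{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&ok, bumps] {
            for (int i = 0; i < bumps; ++i) ++tls_counter.value;
            if (tls_counter.value == bumps) ++ok;
        });
    }
    for (auto &th : pool) th.join();
    return ok.load() == threads;
}

// Instances constructed but not yet destroyed. Once all workers have been
// joined this is expected to be 0, provided the per-thread destructors ran.
int live_counters() { return g_constructed.load() - g_destroyed.load(); }
```

If the workers shared one object, the increments would race and the per-thread check would generally fail. The destructor tally is there because registering and running per-thread destructors is the part that depends on the runtime library.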

That part is simple enough, and I should probably put it into a PR soon.

However, I also have thread_local working when linking against libc++, and my preference would be to fix this fully and share the solution with everyone before rolling it out.

The issue is the "___emutls_get_address" function, which is called from libc++abi.dylib, and is built into clang_rt:

$ nm /opt/local/libexec/llvm-5.0/lib/clang/5.0.1/lib/darwin/libclang_rt.10.4.a  | grep emu
0000000000000000 T ___emutls_get_address
00000000000001a0 t _emutls_init
0000000000000210 d _emutls_init_once.once
00000000000001c0 t _emutls_key_destructor
0000000000000220 d _emutls_mutex
00000000000002f8 b _emutls_num_object
0000000000000300 b _emutls_pthread_key

These objects appear to be included in the built executables, but are apparently NOT visible to libc++abi.dylib at runtime. You can see the lowercase "t" in the listing below, indicating that the symbols are local only, and a runtime error results when you try to run the executable:

$ nm 4 | grep emu
0000000100007070 t ___emutls_get_address
0000000100009480 d ___emutls_v._ZGVZL18thread_with_accessPvE23counting_function_local
0000000100009400 d ___emutls_v._ZL15counting_static
0000000100009420 d ___emutls_v._ZN12_GLOBAL__N_128counting_anonymous_namespaceE
0000000100009460 d ___emutls_v._ZZL18thread_with_accessPvE23counting_function_local
00000001000094a0 d ___emutls_v.__tls_guard
0000000100009440 d ___emutls_v.counting_extern
0000000100007200 t _emutls_init
00000001000094c0 d _emutls_init_once.once
0000000100007220 t _emutls_key_destructor
00000001000094d0 d _emutls_mutex
0000000100009538 b _emutls_num_object
0000000100009540 b _emutls_pthread_key

$ ./4
info: testing pthread_create
dyld: lazy symbol binding failed: Symbol not found: ___emutls_get_address
  Referenced from: /usr/lib/libc++abi.dylib
  Expected in: flat namespace

dyld: Symbol not found: ___emutls_get_address
  Referenced from: /usr/lib/libc++abi.dylib
  Expected in: flat namespace

Trace/BPT trap

Adding -Wl,-flat_namespace does not make the symbols visible.

If I try to fix it by forcing the symbol to be exported, I get a clue:

$ clang++ -std=c++11 -stdlib=libc++ -Wl,-exported_symbol,___emutls_get_address  -o 4 4.cpp
ld: warning: cannot export hidden symbol ___emutls_get_address from /opt/local/libexec/llvm-5.0/lib/clang/5.0.1/lib/darwin/libclang_rt.osx.a(emutls.c.o)
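
The warning suggests that emutls.c.o was compiled with hidden visibility, so the linker can use the symbol inside the image it is linked into but will not export it to other images. A minimal sketch of that distinction, using the GCC/Clang visibility attribute (the function names are hypothetical and have nothing to do with the real emutls code):

```cpp
#include <cassert>

// Hidden: usable from code linked into the same image, but not exported
// in that image's dynamic symbol table.
extern "C" __attribute__((visibility("hidden")))
int sketch_hidden_helper() { return 1; }

// Default: exported, so other images can bind to it at runtime.
extern "C" __attribute__((visibility("default")))
int sketch_exported_helper() {
    // Calling the hidden function from inside the same image is fine.
    return sketch_hidden_helper() + 1;
}
```

Within a single image both functions behave the same, which matches what I see: the executable itself can call its own copy of the function, while a separate dylib looking the name up at runtime cannot find it. On macOS, nm -m should report such symbols as "private external", which may be a quicker check than trying to export them.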

Things DO work, however, if I build emutls.c directly into libc++abi.dylib, using a slightly modified version of libcxxabi that includes the emutls.c source file in the build:

$ nm libc++abi.kenspecial.20181206.dylib | grep emutls_
0000000000020054 T ___emutls_get_address
0000000000025458 s ___emutls_t._ZN10__cxxabiv112_GLOBAL__N_15dtorsE
000000000002a270 d ___emutls_v._ZN10__cxxabiv112_GLOBAL__N_111dtors_aliveE
000000000002a290 d ___emutls_v._ZN10__cxxabiv112_GLOBAL__N_15dtorsE
000000000002a2b0 d _emutls_get_index.once
00000000000201c8 t _emutls_init
00000000000201ea t _emutls_key_destructor
000000000002a2c0 d _emutls_mutex
000000000002a360 b _emutls_num_object
000000000002a368 b _emutls_pthread_key

and with this modified libc++abi.dylib, all the symbols are found, things work correctly, and all the tests that should pass in the llvm "testit" suite do pass.

So here is my question:

Is there some simple way, which I am missing, to force the symbols from emutls.c to be made visible in the executables built by clang using libclang_rt.10.4.a? If so, I would need to change very little here beyond making those symbols visible, however that is done.

OR, if there is no way, I can roll emutls.c into libc++abi.dylib. It's a tiny bit tricky, and might well require manual steps: you have to build clang-5.0 first with thread support enabled as below, and then use that to build the new libc++abi.dylib. Once that new libc++abi.dylib is installed, everything works.


On 2018-10-05, at 10:50 AM, Ken Cunningham wrote:

> With a couple of very minor modifications, recent versions of clang will support thread_local storage, including support for complex destructors, on 10.6.8 using the same emutls.c system that gcc uses to support it. I'll attach the (amazingly simple) patches below. It picks up the emutls.c support from libgcc and libstdc++:
>
> $ nm  /opt/local/lib/libgcc/libstdc++.6.dylib | grep cxa_thread
> 0000000000001e16 T ___cxa_thread_atexit
> $ nm /opt/local/lib/libgcc/libgcc_s.1.dylib | grep emu
> 000000000000bd56 T ___emutls_get_address
> 000000000000bec7 T ___emutls_register_common
>
> This works when we use the MacPorts alternate c++11 support Marcus came up with, using
> -stdlib=macports-libstdc++
> and it seems to me to pass all the tests of how thread_local should work that I can throw at it. So this seems very usable to me at present.
>
> However, when using libc++ (my preferred setup) I'm still having a couple of issues getting cxa_thread_atexit and emutls_get_address to build into libc++abi and show up at runtime.
>
> Using RJVB's libcxx Port, I have built libc++abi more in the Linux way, with cxa_thread_atexit.cpp included (that was fairly easy), but it doesn't find the symbols it needs in
> /opt/local/libexec/llvm-7.0/lib/clang/7.0.0/lib/darwin/libclang_rt.osx.a
> and dies at runtime. I think it's possibly a visibility thing.
>
> If there is anyone out there with any skills and interest in working on this, please speak up and I'll bring you fully up to speed offline regarding my libc++/libc++abi build efforts.
>
> Best,
> Ken
>
> Patches for llvm/clang-7.0 to enable thread_local support using -stdlib=macports-libstdc++:
> ==========================
> --- a/include/llvm/ADT/Triple.h.orig	2018-10-02 17:38:10.000000000 -0700
> +++ b/include/llvm/ADT/Triple.h	2018-10-02 17:38:58.000000000 -0700
> @@ -682,7 +682,7 @@
>   /// Tests whether the target uses emulated TLS as default.
>   bool hasDefaultEmulatedTLS() const {
> -    return isAndroid() || isOSOpenBSD() || isWindowsCygwinEnvironment();
> +    return isAndroid() || isOSOpenBSD() || isWindowsCygwinEnvironment() || isMacOSXVersionLT(10, 7);
>   }
>   /// @}
> ==========================
> --- a/tools/clang/lib/CodeGen/ItaniumCXXABI.cpp.orig	2018-10-02 18:31:17.000000000 -0700
> +++ b/tools/clang/lib/CodeGen/ItaniumCXXABI.cpp	2018-10-02 18:32:35.000000000 -0700
> @@ -2255,7 +2255,7 @@
>   const char *Name = "__cxa_atexit";
>   if (TLS) {
>     const llvm::Triple &T = CGF.getTarget().getTriple();
> -    Name = T.isOSDarwin() ?  "_tlv_atexit" : "__cxa_thread_atexit";
> +    Name = (T.isOSDarwin() && !T.isMacOSXVersionLT(10, 7)) ?  "_tlv_atexit" : "__cxa_thread_atexit";
>   }
>   // We're assuming that the destructor function is something we can
> ==========================
> --- a/tools/clang/lib/Basic/Targets/OSTargets.h.orig	2018-10-02 17:14:10.000000000 -0700
> +++ b/tools/clang/lib/Basic/Targets/OSTargets.h	2018-10-02 17:14:41.000000000 -0700
> @@ -93,7 +93,7 @@
>     this->TLSSupported = false;
>     if (Triple.isMacOSX())
> -      this->TLSSupported = !Triple.isMacOSXVersionLT(10, 7);
> +      this->TLSSupported = !Triple.isMacOSXVersionLT(10, 4);
>     else if (Triple.isiOS()) {
>       // 64-bit iOS supported it from 8 onwards, 32-bit device from 9 onwards,
>       // 32-bit simulator from 10 onwards.
> ==========================
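
Taken together, the three patches above change which TLS mechanism and which destructor-registration hook clang picks for a given macOS deployment target. The sketch below restates that decision in one place. It only mirrors the version checks visible in the diffs; the struct and function names are hypothetical, and it ignores every non-macOS target.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what the patched compiler would choose.
struct SketchTlsPlan {
    bool supported;         // OSTargets.h: TLSSupported
    bool emulated;          // Triple.h: hasDefaultEmulatedTLS()
    std::string dtor_hook;  // ItaniumCXXABI.cpp: thread_local destructor hook
};

// Same meaning as isMacOSXVersionLT(want_major, want_minor) for a given target.
static bool sketch_version_lt(int major, int minor, int want_major, int want_minor) {
    return major < want_major || (major == want_major && minor < want_minor);
}

SketchTlsPlan sketch_plan_for_macos(int major, int minor) {
    SketchTlsPlan plan;
    // Patched: TLS is reported as supported from 10.4 up (previously 10.7).
    plan.supported = !sketch_version_lt(major, minor, 10, 4);
    // Patched: targets older than 10.7 default to emulated TLS.
    plan.emulated = sketch_version_lt(major, minor, 10, 7);
    // Patched: _tlv_atexit only on 10.7 and later, __cxa_thread_atexit otherwise.
    plan.dtor_hook = plan.emulated ? "__cxa_thread_atexit" : "_tlv_atexit";
    return plan;
}
```

For a 10.6 target this gives emulated TLS with __cxa_thread_atexit, which is why the runtime library has to supply both ___cxa_thread_atexit and ___emutls_get_address.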

More information about the macports-dev mailing list