thread_local storage on 10.6.8 (and earlier) with clang-7.0

Jeremy Sequoia jeremyhu at apple.com
Mon Dec 10 05:50:41 UTC 2018




> On Dec 7, 2018, at 11:29, Ken Cunningham <ken.cunningham.webuse at gmail.com> wrote:
> 
> 
> 
> As per the details in the previous post below that I sent a few weeks ago, I have clang/llvm thread_local support working nicely on 10.6.8 and earlier locally.  This is becoming a common issue now.
> 
> On 10.6.8 and earlier, thread_local is implemented using the existing llvm infrastructure that is in place for other operating systems that need emulated TLS. I enabled it and forced it to be used for 10.6 and earlier on macOS. This is exactly the same emulated TLS that gcc uses.
> 
> It works perfectly well when linked against macports-libstdc++ (currently the default) as the supporting objects are built into libgcc. So from that point of view it could roll out right now, for stock users who are not using libc++.
> 
> That is really quite simple enough, and I should probably put that into a PR and get it on the PR list soon.
> 
> 
> 
> However I also have thread_local working when linking against libc++, and it would be my preference to fully fix this and share the solution to all before rolling it out.
> 
> The issue is the "___emutls_get_address" function, which is called from libc++abi.dylib, and is built into clang_rt:
> 
> $ nm /opt/local/libexec/llvm-5.0/lib/clang/5.0.1/lib/darwin/libclang_rt.10.4.a  | grep emu
> /opt/local/libexec/llvm-5.0/lib/clang/5.0.1/lib/darwin/libclang_rt.10.4.a(emutls.c.o):
> 0000000000000000 T ___emutls_get_address
> 00000000000001a0 t _emutls_init
> 0000000000000210 d _emutls_init_once.once
> 00000000000001c0 t _emutls_key_destructor
> 0000000000000220 d _emutls_mutex
> 00000000000002f8 b _emutls_num_object
> 0000000000000300 b _emutls_pthread_key

> These objects appear to be included in the built executables, but are apparently NOT visible to libc++abi.dylib at runtime (you can see the small "t" indicating they are local only), and this generates a runtime error when you try to run the executable.

Are they intended to be exported?  If so, you should be able to add them to the exports list or adjust the symbol visibility at compile time (I forget which approach libc++abi uses).
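
For anyone following along who has not looked at emulated TLS before: the symbol names in the listing above (_emutls_pthread_key, _emutls_key_destructor, _emutls_init) suggest a scheme built on pthread keys. The following is only a minimal sketch of that general idea, not the compiler-rt emutls.c code. Every name in it (sketch::emulated_tls_int, destroy_slot, and so on) is hypothetical, and it handles a single int variable, whereas the real runtime tracks many variables.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstdlib>
#include <cstring>

namespace sketch {

// One process-wide key; each thread stores a pointer to its own copy.
pthread_key_t g_key;
pthread_once_t g_once = PTHREAD_ONCE_INIT;

// Called by the pthread library at thread exit for each thread that
// stored a non-null value under g_key.
void destroy_slot(void *p) { std::free(p); }

void make_key() {
    int rc = pthread_key_create(&g_key, destroy_slot);
    assert(rc == 0);
    (void)rc;
}

// Hypothetical stand-in for one emulated thread_local int.
// Returns the calling thread's private copy, allocating and
// initialising it with `init` on first use in that thread.
int *emulated_tls_int(int init) {
    pthread_once(&g_once, make_key);
    void *slot = pthread_getspecific(g_key);
    if (slot == nullptr) {
        slot = std::malloc(sizeof(int));
        if (slot == nullptr) std::abort();
        std::memcpy(slot, &init, sizeof(int));
        pthread_setspecific(g_key, slot);
    }
    return static_cast<int *>(slot);
}

// Thread body: bump this thread's copy and report what it saw.
void *worker(void *out) {
    int *mine = emulated_tls_int(10);
    *mine += 1;
    *static_cast<int *>(out) = *mine;
    return nullptr;
}

// Runs worker() on a fresh thread and returns the value it observed.
int run_in_new_thread() {
    pthread_t t;
    int seen = 0;
    int rc = pthread_create(&t, nullptr, worker, &seen);
    assert(rc == 0);
    (void)rc;
    pthread_join(t, nullptr);
    return seen;
}

} // namespace sketch
```

Calling sketch::emulated_tls_int(10) on the main thread, changing the value, and then calling sketch::run_in_new_thread() should show that the new thread starts from its own fresh copy and leaves the main thread's copy untouched. The point for this thread is that the lookup goes through an ordinary function call, so whichever image contains that function has to make it resolvable to every caller.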

> $ nm 4 | grep emu
> 0000000100007070 t ___emutls_get_address
> 0000000100009480 d ___emutls_v._ZGVZL18thread_with_accessPvE23counting_function_local
> 0000000100009400 d ___emutls_v._ZL15counting_static
> 0000000100009420 d ___emutls_v._ZN12_GLOBAL__N_128counting_anonymous_namespaceE
> 0000000100009460 d ___emutls_v._ZZL18thread_with_accessPvE23counting_function_local
> 00000001000094a0 d ___emutls_v.__tls_guard
> 0000000100009440 d ___emutls_v.counting_extern
> 0000000100007200 t _emutls_init
> 00000001000094c0 d _emutls_init_once.once
> 0000000100007220 t _emutls_key_destructor
> 00000001000094d0 d _emutls_mutex
> 0000000100009538 b _emutls_num_object
> 0000000100009540 b _emutls_pthread_key
> 
> 
> $ ./4
> info: testing pthread_create
> dyld: lazy symbol binding failed: Symbol not found: ___emutls_get_address
>  Referenced from: /usr/lib/libc++abi.dylib
>  Expected in: flat namespace

How is this even linking?  Are you using -undefined dynamic_lookup?  Please don’t as that masks problems.


> dyld: Symbol not found: ___emutls_get_address
>  Referenced from: /usr/lib/libc++abi.dylib
>  Expected in: flat namespace
> 
> Trace/BPT trap
> 
> 
> 
> 
> Adding -Wl,-flat_namespace does not make the symbols visible.

Yeah, please also don’t use a flat namespace as that leads to other problems.  Darwin’s two-level namespace is a great feature.  Don’t try to be clever by subverting it; you’ll be sorry ;)

> If I try to fix it like this by forcing the symbol to be exported, I get a clue:
> 
> $ clang++ -std=c++11 -stdlib=libc++ -Wl,-exported_symbol,___emutls_get_address  -o 4 4.cpp
> ld: warning: cannot export hidden symbol ___emutls_get_address from /opt/local/libexec/llvm-5.0/lib/clang/5.0.1/lib/darwin/libclang_rt.osx.a(emutls.c.o)

It is explicitly marked as hidden so as to not be exported.  Do you know why?

> Things DO work, however, if I build emutls.c directly into libc++abi.dylib, using a slightly modified version of libcxxabi that includes the emutls.c source file in the build:
> 
> $ nm libc++abi.kenspecial.20181206.dylib | grep emutls_
> 0000000000020054 T ___emutls_get_address
> 0000000000025458 s ___emutls_t._ZN10__cxxabiv112_GLOBAL__N_15dtorsE
> 000000000002a270 d ___emutls_v._ZN10__cxxabiv112_GLOBAL__N_111dtors_aliveE
> 000000000002a290 d ___emutls_v._ZN10__cxxabiv112_GLOBAL__N_15dtorsE
> 000000000002a2b0 d _emutls_get_index.once
> 00000000000201c8 t _emutls_init
> 00000000000201ea t _emutls_key_destructor
> 000000000002a2c0 d _emutls_mutex
> 000000000002a360 b _emutls_num_object
> 000000000002a368 b _emutls_pthread_key
> 
> 
> and with this modified libc++abi.dylib, all the symbols are found, things work correctly, and all the tests that should pass in the llvm "testit" suite do pass.
> 
> 
> SO here is my question:
> 
> Is there some simple way I am missing to force the symbols from emutls.c to be made visible in the executables built by clang using libclang_rt.10.4.a? If so, I would need to change very little here, other than making those symbols visible, however that is done.

Try adjusting the symbol visibility attribute or the -fvisibility command line argument.
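
As a rough illustration of the attribute route, here is a minimal sketch using the GCC/Clang visibility attribute. The macro and function names (SKETCH_EXPORT, SKETCH_HIDDEN, sketch_hidden_helper, sketch_public_entry) are made up for the example and are not taken from compiler-rt or libc++abi.

```cpp
// Hypothetical convenience macros around the GCC/Clang attribute.
#define SKETCH_EXPORT __attribute__((visibility("default")))
#define SKETCH_HIDDEN __attribute__((visibility("hidden")))

extern "C" {

// Hidden: usable from code linked into the same image, but left out of
// that image's exported symbols, so another dylib cannot bind to it.
SKETCH_HIDDEN int sketch_hidden_helper(int x) { return x + 1; }

// Default visibility: exported, so other images can resolve it at runtime.
SKETCH_EXPORT int sketch_public_entry(int x) {
    return sketch_hidden_helper(x) * 2;
}

} // extern "C"
```

On Darwin I would expect nm to show the hidden function with a lowercase "t" and the exported one with an uppercase "T", which matches the listings quoted above. Building with -fvisibility=hidden makes hidden the default for anything not explicitly marked, so under that flag only the function tagged with the default attribute would stay exported.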

> OR -- if there is no way -- I can roll emutls.c into libc++abi.dylib. It's a tiny bit tricky, and might well need manual steps. You have to build clang-5.0 first with thread_local support enabled as below, and then use that to build the new libc++abi.dylib. Once that new libc++abi.dylib is installed, everything works.

Yuck.  Sounds like typical bootstrapping hell :/. Maybe make +emutls a variant on libcxxabi.

> Ken
> 
> 
> 
> 
> 
> 
> 
>> On 2018-10-05, at 10:50 AM, Ken Cunningham wrote:
>> 
>> With a couple of very minor modifications, recent versions of clang will support thread_local storage including support for complex destructors on 10.6.8 using the same emutls.c system that gcc uses to support it. I'll attach the (amazingly simple) patches below. It picks up the emutls.c support from libgcc and libstdc++:
>> 
>> $ nm  /opt/local/lib/libgcc/libstdc++.6.dylib | grep cxa_thread
>> 0000000000001e16 T ___cxa_thread_atexit
>> 
>> $ nm /opt/local/lib/libgcc/libgcc_s.1.dylib | grep emu
>> 000000000000bd56 T ___emutls_get_address
>> 000000000000bec7 T ___emutls_register_common
>> 
>> 
>> This works when we use the MacPorts alternate c++11 support Marcus came up with, using
>> 
>> -stdlib=macports-libstdc++
>> 
>> and it seems to me to pass all the tests of how thread_local should work that I can throw at it. So this seems very usable to me at present.
>> 
>> 
>> 
>> 
>> However, when using libc++ (my preferred setup) I'm still having a couple of issues getting cxa_thread_atexit and emutls_get_address to build into libc++abi and show up at runtime.
>> 
>> Using RJVB's libcxx Port, I have built libc++abi more in the LINUX way, with cxa_thread_atexit.cpp included (that was fairly easy), but it doesn't find the symbols in 
>> 
>> /opt/local/libexec/llvm-7.0/lib/clang/7.0.0/lib/darwin/libclang_rt.osx.a
>> 
>> that it needs to find and dies at runtime. I think it's possibly a visibility thing.
>> 
>> If there is anyone out there with any skills and interest in working on this please speak up and I'll bring you fully up to speed offline regarding my libc++/libc++abi build efforts
>> 
>> Best,
>> 
>> Ken
>> 
>> 
>> Patches for llvm/clang-7.0 to enable thread_local support using -stdlib=macports-libstdc++:
>> 
>> ==========================
>> --- a/include/llvm/ADT/Triple.h.orig    2018-10-02 17:38:10.000000000 -0700
>> +++ b/include/llvm/ADT/Triple.h    2018-10-02 17:38:58.000000000 -0700
>> @@ -682,7 +682,7 @@
>> 
>>  /// Tests whether the target uses emulated TLS as default.
>>  bool hasDefaultEmulatedTLS() const {
>> -    return isAndroid() || isOSOpenBSD() || isWindowsCygwinEnvironment();
>> +    return isAndroid() || isOSOpenBSD() || isWindowsCygwinEnvironment() || isMacOSXVersionLT(10, 7);
>>  }
>> 
>>  /// @}
>> ==========================
>> --- a/tools/clang/lib/CodeGen/ItaniumCXXABI.cpp.orig    2018-10-02 18:31:17.000000000 -0700
>> +++ b/tools/clang/lib/CodeGen/ItaniumCXXABI.cpp    2018-10-02 18:32:35.000000000 -0700
>> @@ -2255,7 +2255,7 @@
>>  const char *Name = "__cxa_atexit";
>>  if (TLS) {
>>    const llvm::Triple &T = CGF.getTarget().getTriple();
>> -    Name = T.isOSDarwin() ?  "_tlv_atexit" : "__cxa_thread_atexit";
>> +    Name = (T.isOSDarwin() && !T.isMacOSXVersionLT(10, 7)) ?  "_tlv_atexit" : "__cxa_thread_atexit";
>>  }
>> 
>>  // We're assuming that the destructor function is something we can
>> ==========================
>> --- a/tools/clang/lib/Basic/Targets/OSTargets.h.orig    2018-10-02 17:14:10.000000000 -0700
>> +++ b/tools/clang/lib/Basic/Targets/OSTargets.h    2018-10-02 17:14:41.000000000 -0700
>> @@ -93,7 +93,7 @@
>>    this->TLSSupported = false;
>> 
>>    if (Triple.isMacOSX())
>> -      this->TLSSupported = !Triple.isMacOSXVersionLT(10, 7);
>> +      this->TLSSupported = !Triple.isMacOSXVersionLT(10, 4);
>>    else if (Triple.isiOS()) {
>>      // 64-bit iOS supported it from 8 onwards, 32-bit device from 9 onwards,
>>      // 32-bit simulator from 10 onwards.
>> ==========================
>> 
> 


More information about the macports-dev mailing list