[MacPorts] #70859: gmp @6.3.0: tests fail when built with clang on Intel only, but pass when assembly is disabled. Forcing ld_classic appears to fix the issue.

MacPorts noreply at macports.org
Wed Oct 2 16:57:53 UTC 2024


#70859: gmp @6.3.0: tests fail when built with clang on Intel only, but pass when
assembly is disabled. Forcing ld_classic appears to fix the issue.
-----------------------+---------------------------------
  Reporter:  haberg-1  |      Owner:  MarcusCalhoun-Lopez
      Type:  defect    |     Status:  assigned
  Priority:  Normal    |  Milestone:
 Component:  ports     |    Version:
Resolution:            |   Keywords:  ventura
      Port:  gmp       |
-----------------------+---------------------------------

Comment (by markmentovai):

 (MacPorts-specific: This is a message that I’m trying to post to
 gmp-bugs at gmplib.org, but it hasn’t landed there yet. There is nothing
 wrong with any compiler, either in Xcode or MacPorts. There is a bug in
 Apple’s new linker, and it can occur using any compiler, but it’s not a bug
 that gmp needs to suffer, and it’s possible to avoid the bug without opting
 for the deprecated linker.)

 If you read nothing else, read this:

 **gmp-6.3.0 ships libtool-2.4.6 (2015-02-16). Update to libtool-2.4.7
 (2022-03-17) to solve this problem.**

 Details:

 There does appear to be a bug in Apple’s new linker (ld-new or ld-prime)
 when targeting x86_64, producing a Mach-O dynamic library (`clang
 -dynamiclib`), and using the flat namespace option (`-flat_namespace`). I
 observed this as a variety of crashes in `make check`. I investigated
 t-bdiv raising SIGILL in particular:

 {{{
 % lldb tests/mpn/.libs/t-bdiv
 (lldb) target create "tests/mpn/.libs/t-bdiv"
 Current executable set to '…/gmp-6.3.0.build/tests/mpn/.libs/t-bdiv' (x86_64).
 (lldb) env DYLD_LIBRARY_PATH=.libs
 (lldb) run
 Process 19802 launched: '…/gmp-6.3.0.build/tests/mpn/.libs/t-bdiv' (x86_64)
 Process 19802 stopped
 * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
     frame #0: 0x00000001000de806 libgmp.10.dylib`__gmpn_sub_n + 3
 Target 0: (t-bdiv) stopped.
 (lldb) disassemble
 libgmp.10.dylib`:
     0x1000de803 <+0>: jmpq   *0x11ecf(%rip)            ; (void *)0x00000001000a1a00: __gmpn_sub_n
 (lldb) disassemble -s 0x1000de803 -e 0x1000de80f
 libgmp.10.dylib`:
     0x1000de803 <+0>: jmpq   *0x11ecf(%rip)            ; (void *)0x00000001000a1a00: __gmpn_sub_n

 libgmp.10.dylib`:
     0x1000de809 <+0>: jmpq   *0x11ed1(%rip)            ; (void *)0x00000001000a1aae: __gmpn_sub_nc
 }}}

 With the fault address at 0x1000de806 falling partway through the
 instruction at 0x1000de803, this certainly would be a bad instruction.
 This code was assembled from
 https://gmplib.org/repo/gmp-6.3/file/62abbaeaab13/mpn/x86_64/core2/aors_n.asm,
 which, at the bottom of the file, has `__gmpn_sub_nc` jumping to within
 (but not the beginning of) `__gmpn_sub_n`. Duplicating that structure in a
 reduced testcase:

 {{{
 % cat ts_x86-64.s
 .text
 .globl _F
 .p2align 4, 0x90
 _F:
   movl $1, %eax
 Lcommon:
   shll %eax
   retq

 .globl _G
 .p2align 4, 0x90
 _G:
   movl $2, %eax
   jmp Lcommon
 % cat tc.c
 int F();
 int G();

 int main(int argc, char* argv[]) {
   return G();
 }
 }}}

 The problem is easily reproduced:

 {{{
 % clang -dynamiclib -flat_namespace -o libt.dylib ts_x86-64.s
 % clang -o t tc.c libt.dylib
 % ./t
 zsh: segmentation fault  ./t
 }}}

 This dylib is small enough to observe what’s going on inside directly:

 {{{
 % objdump -d libt.dylib

 libt.dylib: file format mach-o 64-bit x86-64

 Disassembly of section __TEXT,__text:

 0000000000000f80 <_F>:
      f80: b8 01 00 00 00               movl $1, %eax
      f85: d1 e0                         shll %eax
      f87: c3                           retq
      f88: 0f 1f 84 00 00 00 00 00       nopl (%rax,%rax)

 0000000000000f90 <_G>:
      f90: b8 02 00 00 00               movl $2, %eax
      f95: e9 05 00 00 00               jmp 0xf9f

 Disassembly of section __TEXT,__stubs:

 0000000000000f9a <__stubs>:
      f9a: ff 25 60 00 00 00             jmpq *96(%rip)               ## 0x1000
 }}}

 The jump at 0xf95 is wrong: its target, 0xf9f, is not the start of any
 instruction. As before, that address lies within another instruction (in
 this case, the last byte of the instruction at 0xf9a). In fact, that’s the
 very last byte of the __stubs section:

 {{{
 % otool -l libt.dylib
 […]
 Section
   sectname __stubs
    segname __TEXT
       addr 0x0000000000000f9a
       size 0x0000000000000006
 […]
 Section
   sectname __unwind_info
    segname __TEXT
       addr 0x0000000000000fa0
       size 0x0000000000000058
 […]
 }}}

 The jump at 0xf95 should target 0xf85, or _F + 0x5 (the `Lcommon` label).
 For some reason, the linker created a stub for this jump (which itself
 shouldn’t be necessary) and then, instead of arranging for the stub to
 resolve and jump to _F + 0x5, jumped to offset 0x5 within the stub.
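
 As a rough cross-check of that arithmetic, here is a small sketch that only
 reuses the addresses and sizes from the objdump and otool output above (the
 variable names are mine). An `e9` jump is relative to the address of the
 following instruction, so the encoded displacement of 5 lands on 0xf9f,
 while reaching `Lcommon` at 0xf85 would have needed a displacement of -21.

```shell
#!/bin/sh
# Addresses and sizes copied from the objdump/otool output of libt.dylib.
jmp_addr=$((0xf95))     # address of "e9 05 00 00 00"
jmp_len=5               # opcode byte + 4-byte displacement
disp=5                  # little-endian rel32 taken from the encoding
f_addr=$((0xf80))       # _F
stub_addr=$((0xf9a))    # start of __stubs
stub_size=6             # size of __stubs

next_ip=$((jmp_addr + jmp_len))           # rel32 is relative to the next instruction
target=$((next_ip + disp))                # where the emitted jump actually goes
stub_last=$((stub_addr + stub_size - 1))  # last byte of the __stubs section
lcommon=$((f_addr + 5))                   # Lcommon follows the 5-byte movl in _F
wanted_disp=$((lcommon - next_ip))        # displacement a correct link would encode

printf 'actual target:   0x%x\n' "$target"
printf 'last stub byte:  0x%x\n' "$stub_last"
printf 'intended target: 0x%x\n' "$lcommon"
printf 'intended disp:   %d\n' "$wanted_disp"
```

 The first two values coincide, which matches the otool listing: the jump
 lands on the final byte of __stubs rather than inside _F.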

 This is a clear bug in the linker, and I’ll report it to Apple, but I
 don’t know that anyone should expect much traction.

 That doesn’t need to be the end of the story. There’s another concern
 here: this bug only occurs with -flat_namespace. gmp shouldn’t need
 -flat_namespace, and in fact it’s undesirable to enable it. It’s coming
 into this build from configure, via aclocal.m4, having been included from
 libtool.m4. In libtool-2.4.6, which gmp-6.3.0 is using, that’s
 https://git.savannah.gnu.org/cgit/libtool.git/tree/m4/libtool.m4?h=v2.4.6#n1070.
 In particular, it intends to enable -flat_namespace only on very early Mac
 OS X versions (pre-10.4, in the PowerPC-only era). But the case that we’d
 like to hit, assuming `MACOSX_DEPLOYMENT_TARGET` is unset (as it normally
 would be), doesn’t match `$host` on a modern macOS system, because the
 Darwin version has marched past 20, while the pattern only contemplates
 versions up to 19.
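
 To make the mismatch concrete, here is a simplified approximation of that
 `case` statement as it would appear in a generated configure script. It is
 written from memory of libtool.m4 in 2.4.6, with the `$wl` prefixes dropped
 and a made-up function name, so treat the linked source as authoritative.
 The first argument stands in for `$host`.

```shell
#!/bin/sh
# Approximation of the darwin branch of the libtool-2.4.6 check (not verbatim).
# lt_dar_allow_undefined_246 is a hypothetical name; $1 stands in for $host.
lt_dar_allow_undefined_246() {
  case ${MACOSX_DEPLOYMENT_TARGET-10.0},$1 in
    10.0,*86*-darwin8*|10.0,*-darwin[91]*)
      # Intended arm for modern systems: Darwin 9 and Darwin 1x only.
      echo '-undefined dynamic_lookup' ;;
    10.[012][,.]*)
      # Intended for very early Mac OS X, but also catches "10.0,<anything>".
      echo '-flat_namespace -undefined suppress' ;;
    10.*)
      echo '-undefined dynamic_lookup' ;;
  esac
}

unset MACOSX_DEPLOYMENT_TARGET
for host in x86_64-apple-darwin19.6.0 x86_64-apple-darwin23.6.0; do
  printf '%s: %s\n' "$host" "$(lt_dar_allow_undefined_246 "$host")"
done
```

 In this sketch, a Darwin 19 host matches the first arm, while a Darwin 23
 host skips it and falls into the second arm, which is where -flat_namespace
 comes from.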

 https://git.savannah.gnu.org/cgit/libtool.git/commit/m4/libtool.m4?id=9e8c882517082fe5755f2524d23efb02f1522490,
 in libtool-2.4.7, modernizes this check. With that version in use,
 -flat_namespace is no longer enabled in this situation, so upgrading
 libtool in gmp to 2.4.7 will fix this problem. I ran `autoreconf --install`
 with autoconf-2.69, automake-1.15, and libtool-2.4.7, and observed a clean
 `make check` on macOS 14.7 x86_64 (nehalem-apple-darwin23.6.0)/Xcode 15.4
 and macOS 15.0 x86_64 (nehalem-apple-darwin24.0.0)/Xcode 16.0. In both
 cases, the linker is ld-new/ld-prime (no `-ld_classic`).
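
 One quick way to tell whether a given source tree still carries the older
 check is to search its generated configure script for the pre-2.4.7 host
 pattern. This is a minimal sketch: it assumes the pattern is spelled
 `darwin[91]*` in the generated script (m4 quoting removed), the helper name
 is made up, and a miss is only a hint, not proof that -flat_namespace is
 avoided.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given configure script still contains
# the pre-2.4.7 Darwin host pattern. Assumes the generated spelling is
# "darwin[91]*"; -F makes grep treat the brackets literally.
has_old_darwin_check() {
  grep -qF 'darwin[91]*' "$1"
}

# Usage: run from the top of an unpacked source tree.
if [ -f configure ]; then
  if has_old_darwin_check configure; then
    echo 'configure carries the old Darwin check; regenerate with a newer libtool'
  else
    echo 'old Darwin check not found in configure'
  fi
fi
```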

-- 
Ticket URL: <https://trac.macports.org/ticket/70859#comment:54>
MacPorts <https://www.macports.org/>
Ports system for macOS

