[MacPorts] #70670: libiconv @1.17_0: iconv on macOS Ventura 13.6+ does not perform correct conversions

MacPorts noreply at macports.org
Sun Sep 1 09:45:45 UTC 2024


#70670: libiconv @1.17_0: iconv on macOS Ventura 13.6+ does not perform correct
conversions
---------------------------+------------------------
  Reporter:  seamusdemora  |      Owner:  ryandesign
      Type:  defect        |     Status:  closed
  Priority:  Normal        |  Milestone:
 Component:  ports         |    Version:  2.10.1
Resolution:  invalid       |   Keywords:
      Port:  libiconv      |
---------------------------+------------------------
Changes (by ryandesign):

 * status:  assigned => closed
 * resolution:   => invalid


Old description:

> I'm trying to do something that seems simple (it can be done simply on my
> Linux box):
>
> I have C&P a line from a PDF file (a French programming guide), to
> Terminal.app:
> 'print("Numéro de boucle", i)'
>
> I wanted to convert this line to ASCII before pasting into my editor. So
> I used 'iconv' as shown below. In each case, I used 'file' to check the
> "from" encoding :
>
> {{{
> % echo 'print("Numéro de boucle", i)' | file -
> /dev/stdin: Unicode text, UTF-8 text
>
> % echo 'print("Numéro de boucle", i)' | iconv -f utf-8 -t ascii//translit
> print("Num'ero de boucle", i)
>
> ?!?!?!?   I tried another example:
>
> % echo "print("Protégé. Señorita. Coup de grâce", i)" | file -
> /dev/stdin: Unicode text, UTF-8 text
>
> % echo 'Protégé Señorita Coup de grâce' | iconv -f UTF-8 -t
> ASCII//TRANSLIT
> Prot'eg'e Se~norita Coup de gr^ace
> }}}
>
> **PLEASE NOTE: I have also tried using 'utf-8-mac' and 'utf8-mac' for the
> "from" encoding; thhis had no effect on the results - they were identical
> in all cases.**
>
> As you can see, this is not correct: a single quote has been added. I'm
> not a frequent user of 'iconv', so I checked this on my Debian 'bookworm'
> Linux box:
>
> {{{
> $ echo 'print("Numéro de boucle", i)' | iconv -f utf-8 -t ascii//translit
> print("Numero de boucle", i)
> }}}
>
> I've checked to confirm that the version of 'iconv' on my macOS Ventura
> 13.6+ is one from MacPorts. I believe that it is:
>
> {{{
> % whereis iconv
> iconv: /usr/bin/iconv /opt/local/share/man/man1/iconv.1.gz
>
> % port installed requested
> The following ports are currently installed:
> ...
> libiconv @1.17_0 (active)
> ...
> %
> }}}
>
> And confirmation of my macports version:
> {{{
> % port -v
> MacPorts 2.10.1
> }}}
>
> I can accept that it's broken, and I can accept that it can't be fixed
> (if that turns out to be the case). But I surely would appreciate an
> explanation of what has gone wrong - especially if it's something that I
> am doing incorrectly!
>
> Rgds,
> ~S

New description:

 I'm trying to do something that seems simple (it can be done simply on my
 Linux box):

 I copied and pasted a line from a PDF file (a French programming guide)
 into Terminal.app:

 {{{
 print("Numéro de boucle", i)
 }}}

 I wanted to convert this line to ASCII before pasting it into my editor,
 so I used 'iconv' as shown below. In each case, I used 'file' to check the
 "from" encoding:

 {{{
 % echo 'print("Numéro de boucle", i)' | file -
 /dev/stdin: Unicode text, UTF-8 text

 % echo 'print("Numéro de boucle", i)' | iconv -f utf-8 -t ascii//translit
 print("Num'ero de boucle", i)
 }}}

 ?!?!?!?   I tried another example:

 {{{
 % echo 'print("Protégé. Señorita. Coup de grâce", i)' | file -
 /dev/stdin: Unicode text, UTF-8 text

 % echo 'Protégé Señorita Coup de grâce' | iconv -f UTF-8 -t ASCII//TRANSLIT
 Prot'eg'e Se~norita Coup de gr^ace
 }}}

 **PLEASE NOTE: I have also tried using 'utf-8-mac' and 'utf8-mac' for the
 "from" encoding; this had no effect on the results - they were identical
 in all cases.**

 As you can see, this is not correct: a single quote has been added. I'm
 not a frequent user of 'iconv', so I checked this on my Debian 'bookworm'
 Linux box:

 {{{
 $ echo 'print("Numéro de boucle", i)' | iconv -f utf-8 -t ascii//translit
 print("Numero de boucle", i)
 }}}

 I've checked to confirm that the version of 'iconv' on my macOS Ventura
 13.6+ is one from MacPorts. I believe that it is:

 {{{
 % whereis iconv
 iconv: /usr/bin/iconv /opt/local/share/man/man1/iconv.1.gz

 % port installed requested
 The following ports are currently installed:
 ...
 libiconv @1.17_0 (active)
 ...
 %
 }}}

 And confirmation of my MacPorts version:
 {{{
 % port -v
 MacPorts 2.10.1
 }}}

 I can accept that it's broken, and I can accept that it can't be fixed (if
 that turns out to be the case). But I surely would appreciate an
 explanation of what has gone wrong - especially if it's something that I
 am doing incorrectly!

 Rgds,
 ~S

--

Comment:

 I get the same conversions as you (insertion of `'` and `^` before the
 letters that carried accents, in an attempt to mimic in ASCII what those
 accents look like) regardless of whether I use /usr/bin/iconv on macOS 12
 (Apple's GNU libiconv 1.11) or /opt/local/bin/iconv (MacPorts GNU libiconv
 1.17). Therefore it is not a MacPorts bug.
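
 To repeat this comparison, it helps to know which iconv a bare `iconv`
 command resolves to, and to run the same input through each binary. The
 following is only a sketch: the helper names `iconv_on_path` and
 `transliterate` are made up for illustration, and the two candidate paths
 are the ones mentioned in this ticket, which may not exist on a given
 machine.

```python
import os
import shutil
import subprocess


def iconv_on_path():
    """Return the path of the iconv that a bare `iconv` command would run, or None."""
    return shutil.which("iconv")


def transliterate(text, iconv_path):
    """Pipe text through the given iconv binary using -t ASCII//TRANSLIT."""
    result = subprocess.run(
        [iconv_path, "-f", "UTF-8", "-t", "ASCII//TRANSLIT"],
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=False,  # a failed conversion is reported through empty output
    )
    return result.stdout.decode("ascii", errors="replace")


if __name__ == "__main__":
    print("iconv on PATH:", iconv_on_path())
    sample = 'print("Num\u00e9ro de boucle", i)\n'
    # Paths taken from this ticket; skipped when they are not present.
    for candidate in ("/usr/bin/iconv", "/opt/local/bin/iconv"):
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            print(candidate, "->", transliterate(sample, candidate), end="")
```

 If both binaries print the same line, the difference is not coming from
 the MacPorts build.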

 I believe iconv uses locale information provided by the operating system
 to guide its conversions. Therefore your bug, I suppose, is with macOS,
 although I assume the result we observe is intentional and not considered
 a bug. In particular, what we're observing is called transliteration:

 https://www.gnu.org/software/libiconv/

 > It has also some limited support for transliteration, i.e. when a
 character cannot be represented in the target character set, it can be
 approximated through one or several similarly looking characters.
 Transliteration is activated when `//TRANSLIT` is appended to the target
 encoding name.

 You have specifically requested that transliteration be enabled.
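
 The approximation seen in this ticket can be reproduced in a few lines,
 which may make it clearer what is happening. This is a hedged sketch, not
 libiconv's implementation: the `STAND_INS` table and the function name
 `mimic_translit` are invented here, and the three mappings were chosen
 only to match the outputs quoted above.

```python
import unicodedata

# Hypothetical stand-ins chosen to reproduce the outputs shown in this
# ticket; they are not copied from libiconv's own tables.
STAND_INS = {
    "\u0301": "'",  # combining acute accent
    "\u0303": "~",  # combining tilde
    "\u0302": "^",  # combining circumflex
}


def mimic_translit(text):
    """Put an ASCII stand-in before each base letter that carried a known accent."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in STAND_INS:
            base = out.pop() if out else ""
            out.append(STAND_INS[ch] + base)
        else:
            # Combining marks not listed above are passed through unchanged.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(mimic_translit("Prot\u00e9g\u00e9 Se\u00f1orita Coup de gr\u00e2ce"))
    # prints: Prot'eg'e Se~norita Coup de gr^ace
```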

 I don't know why you get different results on Linux. Presumably the
 locale information provided by Linux differs from that provided by macOS,
 but I don't know why these two OS vendors have decided to do that.
 Possibly the locale information on your Linux does not support
 transliteration, so your request to enable transliteration is being
 ignored on Linux.
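
 If the goal is simply accent-free ASCII, whatever iconv is installed, one
 option that does not depend on iconv's transliteration at all is to
 decompose the text and drop the combining marks. A minimal sketch,
 assuming Python 3 is available; `strip_accents` is a name made up for
 this example.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining marks removed after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [ch for ch in decomposed if not unicodedata.combining(ch)]
    # Anything still outside ASCII (for example a letter that has no
    # decomposition) becomes "?" rather than disappearing silently.
    return "".join(kept).encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    print(strip_accents('print("Num\u00e9ro de boucle", i)'))
    # prints: print("Numero de boucle", i)
```

 This gives the same line as the Debian output quoted in the description.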

-- 
Ticket URL: <https://trac.macports.org/ticket/70670#comment:3>
MacPorts <https://www.macports.org/>
Ports system for macOS

