[MacPorts] #71237: sed: RE error: illegal byte sequence
MacPorts
noreply at macports.org
Mon Nov 4 20:43:11 UTC 2024
#71237: sed: RE error: illegal byte sequence
---------------------------+--------------------
Reporter: ballapete | Owner: (none)
Type: defect | Status: new
Priority: Normal | Milestone:
Component: ports | Version: 2.10.2
Resolution: | Keywords:
Port: Perl modules |
---------------------------+--------------------
New description:
While trying to prove that Perl 5.38 is ready for production use, I tried
as a final step to patch all the test files that start with
`#!/usr/bin/perl` (or similar) so that they start with `#!/usr/bin/env perl` or
`${perl5.bin}`, and ran into this `sed` problem. On the command line it
gives:
{{{
pete 313 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
sed: RE error: illegal byte sequence
Exit 1
pete 314 /\ which sed
}}}
and also:
{{{
pete 312 /\ head -20 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/80_rt.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
while (<DATA>) {
if (s/^\253(\d+)\273\s*-?\s*//) {
chomp;
}}}
So it's likely that using `gsed` instead of the system's `sed` will solve
the problem.
How can I make `port` use `gsed` instead of the system's `sed`?
Another example:
{{{
pete 318 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER), "bind 4 int");
sed: RE error: illegal byte sequence
Exit 1
pete 319 /\ head -47 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-dbd-csv/p5.38-dbd-csv/work/DBD-CSV-0.60/t/42_bindparam.t | tail -3 | gsed -e 's:/usr/bin/perl:/usr/bin/env perl:'
# Now try the explicit type settings
ok ($sth->bind_param (1, " 4", &SQL_INTEGER), "bind 4 int");
ok ($sth->bind_param (2, "Andreas K\366nig"), "bind str");
}}}
Another one would be
{{{
head -55 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_perl_p5-encode/p5.38-encode/work/Encode-3.21/t/at-cn.t | tail -7 | {g,}sed -e 's:/usr/bin/perl:/usr/bin/env perl:'
}}}
where Chinese characters are somehow "encoded".
Obviously `sed` "knows" some "forbidden" characters it cannot work on, and
obviously it has problems with encodings other than 7 or 8 bit that gsed
has learned to overcome.
--
Comment (by ryandesign):
System `sed` expects input to be UTF-8 by default; you're trying to use it
on files that aren't UTF-8. Working around this by using `gsed` instead is
neither necessary nor recommended. You can still use system `sed` as long as
you set the `LC_CTYPE` environment variable either to the correct locale
or just to `C`.
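A minimal sketch of that workaround on the command line, assuming a throwaway sample file that stands in for one of the test files above. The temporary file and its contents are hypothetical; the `sed` expression is the one from the ticket. Setting `LC_CTYPE=C` for the single command makes `sed` treat its input as plain bytes rather than as UTF-8:

```shell
# Hypothetical sample: a Perl test file containing one Latin-1 byte
# (octal 366, "ö"), which is not a valid UTF-8 sequence on its own.
tmpfile=$(mktemp)
printf '#!/usr/bin/perl\nok ("Andreas K\366nig");\n' > "$tmpfile"

# Set LC_CTYPE=C only for this invocation, so sed processes raw bytes
# instead of trying to decode the input as UTF-8.
result=$(LC_CTYPE=C sed -e 's:/usr/bin/perl:/usr/bin/env perl:' "$tmpfile")

# Show the rewritten shebang line.
printf '%s\n' "$result" | head -n 1

rm -f "$tmpfile"
```

Note that if `LC_ALL` is set in the environment it takes precedence over `LC_CTYPE`, so in that case `LC_ALL=C` would be needed instead. If the file's real encoding is known, a matching locale name (check `locale -a` for what is installed) could presumably be used in place of `C`.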
--
Ticket URL: <https://trac.macports.org/ticket/71237#comment:1>
MacPorts <https://www.macports.org/>
Ports system for macOS