Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binmode is a necessity for win32 usability and must be default #116

Open
wchristian opened this issue Mar 27, 2018 · 15 comments
Open

binmode is a necessity for win32 usability and must be default #116

wchristian opened this issue Mar 27, 2018 · 15 comments
Labels
Debian Reported in Debain ticket system Feature Feature Request

Comments

@wchristian
Copy link
Collaborator

Just leaving this as a note, since i recently ran into it.

Was trying to use a library that calls an external program to generate PDF and another author had decided to use IPC::Run inside the library. Since PDFs are binary data by default, the result was of course mojibake, and due to the entirety of the situation there was no way to resolve this other than to replace IPC::Run entirely.

I cannot offhand think of a situation where enabling binmode by default would cause any serious breakage that couldn't be managed (e.g. tests that don't expect this behavior).

@mohawk2
Copy link
Collaborator

mohawk2 commented Mar 27, 2018

I tend to agree. @toddr ?

@toddr
Copy link
Member

toddr commented Mar 27, 2018

Actually we have another ticket asking us to have an option to default to utf-8

@toddr
Copy link
Member

toddr commented Mar 27, 2018

It sounds like we need a global for this or if the run() interface supports it, maybe it could be an option passed in.

@wchristian
Copy link
Collaborator Author

Defaulting to UTF8 might be acceptable too, as it disables newline mangling as well. I'd still highly recommend you actually try and see what happens if you make utf8 default in various environments when dealing with 3rd party binary producers. I expect some breakage, for example when dealing with the pure ascii windows cmd shell.

Keep in mind that making binmode doesn't add any data transformation, it disables a dangerous one. Making UTF8 default would indeed add a transformation.

@wchristian
Copy link
Collaborator Author

Btw, which ticket are you talking about?

@toddr
Copy link
Member

toddr commented Mar 29, 2018

@wchristian: Pumping UTF-8 data mangles it [rt.cpan.org #58853] #87

@toddr
Copy link
Member

toddr commented Mar 29, 2018

#87 has an example script. Would the change you're proposing here for binmode also fix it? If someone can test that then I'd be all for just setting the default. Ideally though we should probably still try to provide a way to somehow alter the mode at run time?

@toddr toddr added the Feature Feature Request label Mar 29, 2018
@wchristian
Copy link
Collaborator Author

Reading the other one now.

As for the second question:

Yes, binary mode should be the default, but allowing customization is also key, due to it being hard to configure the called code in most cases.

@wchristian
Copy link
Collaborator Author

Not sure if the binmode thing would fix anything for UTF, but i think it might help improve the situation. I've also rewritten the test script there a little to show the newline issue:

#!/usr/bin/perl
use strict;
use warnings;
use B 'perlstring';

my $report_length  = 0;                                      # set this to 1 to switch string dumps to string length
my $mode           = ":raw";
my @lines          = ( "ab", "\n", "\r", "\r\n", "\n\r" );
my $display_length = 8;                                      # set this to something reasonable

sub fm { sprintf "%-${display_length}s", $report_length ? length $_[0] : perlstring $_[0] }

if ( defined $ENV{IPC_SUB_PROCESS} ) {
    binmode STDIN,  $mode;
    binmode STDERR, $mode;
    binmode STDOUT, $mode;
    my $in = do { local $/; <STDIN> };
    my $out = $lines[ $ENV{IPC_SUB_INDEX} ];
    print STDERR fm( $in ) . " | " . fm( $out ) . " out became: ";
    print $out;
    exit;
}

$ENV{IPC_SUB_PROCESS} = 1;
use IPC::Run ();
for my $i ( 0 .. $#lines ) {
    print fm( $lines[$i] ) . " in became: ";
    $ENV{IPC_SUB_INDEX} = $i;
    IPC::Run::run( [ "perl", __FILE__ ], "<", \$lines[$i], ">", \my $out );
    print fm( $out ) . "\n";
}

__END__

D:\>perl ipc_run.pl
"ab"     in became: "ab"     | "ab"     out became: "ab"
"\n"     in became: "\r\n"   | "\n"     out became: "\n"
"\r"     in became: "\r"     | "\r"     out became: "\r"
"\r\n"   in became: "\r\n"   | "\r\n"   out became: "\n"
"\n\r"   in became: "\r\n\r" | "\n\r"   out became: "\n\r"

@toddr
Copy link
Member

toddr commented Apr 2, 2018

@toddr toddr added the Debian Reported in Debain ticket system label Apr 2, 2018
@wchristian
Copy link
Collaborator Author

Oh god, yeah, implicit charset munging will get different results depending on locale. Even more power to the argument of making binmode default.

@toddr
Copy link
Member

toddr commented Apr 3, 2018

Ok. sounds like we need a patch. I'm overloaded this week so won't be able to look at it myself.

@mohawk2
Copy link
Collaborator

mohawk2 commented Apr 3, 2018

@wchristian Do you feel like constructing a little bit of .t that will capture the charset munging? I can sellotape that plus your script above into a test and then have a go at the code to fix it.

@wchristian
Copy link
Collaborator Author

Sent tests in #120.

@toddr
Copy link
Member

toddr commented Apr 5, 2018

Tests are in and presumably failing on windows. we need to get the windows smoker working.

djerius pushed a commit to djerius/IPC-Run that referenced this issue Apr 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Debian Reported in Debain ticket system Feature Feature Request
Projects
None yet
Development

No branches or pull requests

3 participants