--strip-ansi=always left two classes of ANSI escape sequence in the output:
8-bit C1 introducers (U+0090, U+0098, U+009B, U+009D, U+009E, U+009F). On terminals that interpret 8-bit C1 in UTF-8 (kitty, by default; older xterm and VTE configurations), these are the single-codepoint equivalents of ESC P, ESC X, ESC [, ESC ], ESC ^, and ESC _. They introduce DCS, SOS, CSI, OSC, PM, and APC sequences, respectively. Bat's parser treated them as text.
DCS / SOS / PM / APC bodies. Even when introduced by the 7-bit ESC P/X/^/_, the body up to the string terminator was emitted as text and survived strip_ansi.
This patch teaches EscapeSequenceOffsetsIterator to recognise the 8-bit introducers as their 7-bit equivalents, and to consume string-terminated bodies (via either form of introducer) as a single opaque Unknown segment. Both then drop out of strip_ansi along with the existing CSI/OSC handling.
A small unit test in src/preprocessor.rs covers the three new cases (8-bit CSI, 7-bit DCS body, 8-bit DCS body with 8-bit ST).
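For readers who want to see the idea in isolation, here is a minimal standalone sketch of the approach described above. It is not the code in this patch: `Intro`, `intro_7bit`, `intro_8bit`, and `strip_escapes` are hypothetical names, and the sketch makes simplifying assumptions (any other `ESC x` pair is dropped as a two-character escape, BEL terminates only OSC, and an unterminated body swallows the rest of the input).

```rust
/// Kind of sequence that a 7-bit `ESC x` pair or an 8-bit C1 codepoint introduces.
/// (Hypothetical type for this sketch; bat's real parser is structured differently.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Intro {
    Csi,
    Osc,
    /// DCS, SOS, PM, or APC: an opaque body that runs to the string terminator.
    StringBody,
}

/// Map the character that follows ESC to the kind of sequence it starts.
fn intro_7bit(c: char) -> Option<Intro> {
    match c {
        '[' => Some(Intro::Csi),
        ']' => Some(Intro::Osc),
        'P' | 'X' | '^' | '_' => Some(Intro::StringBody),
        _ => None,
    }
}

/// Map an 8-bit C1 codepoint to the same kind as its 7-bit equivalent.
fn intro_8bit(c: char) -> Option<Intro> {
    match c {
        '\u{9b}' => Some(Intro::Csi),
        '\u{9d}' => Some(Intro::Osc),
        '\u{90}' | '\u{98}' | '\u{9e}' | '\u{9f}' => Some(Intro::StringBody),
        _ => None,
    }
}

/// Return `input` with CSI, OSC, and string-terminated sequences removed.
fn strip_escapes(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        let kind = if c == '\u{1b}' {
            // 7-bit form: ESC plus one more character.
            match chars.next() {
                Some(next) => match intro_7bit(next) {
                    Some(kind) => kind,
                    // Assumption: treat anything else as a two-character escape.
                    None => continue,
                },
                None => break, // lone trailing ESC
            }
        } else if let Some(kind) = intro_8bit(c) {
            kind
        } else {
            out.push(c); // ordinary text
            continue;
        };
        match kind {
            Intro::Csi => {
                // Skip parameter and intermediate bytes up to the final byte (0x40..=0x7E).
                while let Some(p) = chars.next() {
                    if ('\u{40}'..='\u{7e}').contains(&p) {
                        break;
                    }
                }
            }
            Intro::Osc | Intro::StringBody => {
                // Skip the body up to ST (8-bit U+009C or 7-bit ESC \); OSC also ends at BEL.
                while let Some(b) = chars.next() {
                    if b == '\u{9c}' || (b == '\u{07}' && kind == Intro::Osc) {
                        break;
                    }
                    if b == '\u{1b}' && chars.peek() == Some(&'\\') {
                        chars.next();
                        break;
                    }
                }
            }
        }
    }
    out
}
```

Under these assumptions, `strip_escapes("a\u{9b}31mb")` yields `"ab"`, and a DCS body is dropped whole whether it is written as `ESC P … ESC \` or as `U+0090 … U+009C`. Routing both introducer forms through one enum is what lets the 8-bit cases reuse the 7-bit handling instead of duplicating it.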
Note:
--strip-ansi=always is bypassed entirely when bat selects SimplePrinter (i.e. when stdout is piped or --color=never is set), so bat --strip-ansi=always file | grep … returns the raw escape sequences.
The existing strip_ansi_does_not_affect_simple_printer test seems to lock this in deliberately. Is that still the intended scope, or should this be fixed? strip-ansi=always implies that it always filters, but that is not true.
I believe the intention is to leave the output unchanged when piping in scripts (for example, when cat is an alias for bat and strip-ansi comes from the config file instead of the command line).
Bat could probably benefit from a refactor so that command-line arguments are always honored, as I think this is quite a common problem, but maybe it makes sense to leave it as it is for this PR. What do you think?