fix(cli): sanitize newlines in min-text output format#3425
Conversation
|
Hi @Karnav018! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks! |
There was a problem hiding this comment.
Pull request overview
This PR addresses the --output-format min-text contract of one error per line by replacing embedded newlines in msg_header with spaces before formatting in both the plain (write_line) and colorized (print_colors) min-text code paths. Verbose, JSON, and GitHub Actions outputs are left unchanged. A unit test is added that constructs an error with an embedded newline and asserts the resulting min-text output collapses to a single line.
Changes:
- Sanitize
\n→ space inError::write_linenon-verbose branch. - Mirror the same sanitization in
Error::print_colorsnon-verbose branch. - Add
test_mintext_format_sanitizes_newlinesunit test.
Comments suppressed due to low confidence (1)
pyrefly/lib/error/error.rs:133
- The identical sanitization (
self.msg_header.replace('\n', " ")) is duplicated here and inwrite_lineat line 101. Consider extracting it into a small helper (e.g., asanitized_header()method onError, or sanitizing once at construction inError::new) so the two code paths can't drift apart, and so any future addition (such as also stripping\r) only has to be made in one place.
let sanitized_msg = self.msg_header.replace('\n', " ");
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "second part of message should be present" | ||
| ); | ||
| // Verify the format is still min-text: "SEVERITY path:range: message [kind]" | ||
| assert!(error_line.starts_with("ERROR test.py:1:1-11:"), "output should start with 'ERROR test.py:1:1-11:'"); |
There was a problem hiding this comment.
Thanks for catching this. We removed test_mintext_format_sanitizes_newlines entirely when moving to the header invariant approach, so this assertion is no longer present. Marking this resolved.
| // In min-text format, sanitize the message to ensure one error per line | ||
| // by replacing newlines with spaces. | ||
| let sanitized_msg = self.msg_header.replace('\n', " "); |
There was a problem hiding this comment.
We removed the min-text sanitization entirely and now enforce the invariant at construction: headers must not contain \n or \r. Error::new asserts !header.contains(['\n', '\r']), and test_error_header_rejects_newlines covers CRLF explicitly. This prevents CR from ever reaching output (including the spot you called out around line 133) and keeps min-text single-line without post-processing.
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
rchen152
left a comment
There was a problem hiding this comment.
Thanks for the PR! I'd actually prefer to fix this issue in a different way, by fixing specific problematic error messages rather than sanitizing them after the fact. The reason is that the sanitization approach will cause us to end up with really long messages that are technically single-line but in practice are long and hard-to-read.
Pyrefly error messages include a "header" line and zero or more "detail" lines. As I understand it, the problem is that we sometimes put the entirety of a multi-line error message in the header. So the approach I'd recommend is (1) adding an assertion when building a new error that the header does not contain any newlines, and (2) for any existing error messages that violate this assertion (running the tests should flush them out), changing the code that logs the error to properly split the message between header and details.
Why: min-text output requires one error per line, but some errors built multiline headers. What: assert headers contain no newline/CR and split missing-attribute union failures into header plus details. Why it works: invalid headers are rejected at construction and details carry the extra lines.
rchen152
left a comment
There was a problem hiding this comment.
Thanks! Could you (1) fix the test failures caused by the attribute error message changing and (2) add the error kind to the assert! message? Once you do that, I can import the PR and patch the mypy_primer failures internally, since the fixes should be pretty mechanical at that point.
| let display_range = module.display_range(range); | ||
| assert!( | ||
| !header.contains(['\n', '\r']), | ||
| "error header must not contain newlines" |
There was a problem hiding this comment.
We should include the error kind in this message, so that assertion failures are easier to debug.
Summary
This PR addresses an issue where the
--output-format min-textformat breaks its single-line execution contract when handling multi-line error descriptions (such as evaluation errors across union types).Because
min-textis specifically designed for predictable line-by-line parsing by downstream automated scripts, internal newlines (\n) present within error messages break external tools.Changes Made
Modified
pyrefly/lib/error/error.rsacross two methods handling output serialization:Error::write_line(): Sanitizes themsg_headerusing.replace('\n', " ")whenverboseexecution is disabled (representingmin-textmode).Error::print_colors(): Mirrors the sanitization mapping to maintain terminal output consistency across all stdout/stderr channels.Verbose mode, JSON mode, and GitHub actions format remain entirely untouched, preserving their native structured data or human-oriented pretty-printing layouts.
Fixes the issue where multi-line errors broke the one-error-per-line parsing agreement in
min-text.Test Plan
Automated Coverage
Added a new dedicated integration unit test inside
pyrefly/lib/error/error.rs:test_mintext_format_sanitizes_newlines(). This test:verbose=falseto emulate themin-textexecution logic.write_line()squashes the payload into a singular string line while retaining text integrity.Execution Results
Verified locally across the local environment suite: