fix(peek): detect XML encoding without BOM #45399
Open
yeelam-gordon wants to merge 1 commit into main
Fixes #30515

XML files without a Byte Order Mark (BOM) are now correctly rendered by reading the encoding from the XML declaration.

Changes:
- Added XmlEncodingDetector helper
- Checks for BOM first, then XML declaration
- Supports UTF-8, UTF-16, and other common encodings
- Falls back to UTF-8 if detection fails
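The detection order listed above (BOM, then XML declaration, then a UTF-8 fallback) can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's C# `XmlEncodingDetector`; the name `detect_encoding`, the regex, and the 1024-byte prefix are assumptions made for the example.

```python
import codecs
import re

# BOM signatures, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Hypothetical pattern: encoding="..." inside an ASCII-compatible declaration.
_DECL = re.compile(
    rb"""<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def detect_encoding(data: bytes) -> str:
    """Return a codec name for raw XML bytes: BOM, then declaration, then UTF-8."""
    # 1. A BOM, when present, decides the encoding.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Otherwise look for an encoding declaration in a bounded prefix.
    match = _DECL.match(data[:1024])
    if match:
        declared = match.group(1).decode("ascii")
        try:
            return codecs.lookup(declared).name  # normalised codec name
        except LookupError:
            pass  # unknown encoding name: fall through to the default

    # 3. Fall back to UTF-8 if detection fails.
    return "utf-8"
```

An unknown or misspelled encoding name takes the same fallback path as a missing declaration, so the caller always receives a usable codec name.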
Pull request overview
This PR attempts to fix issue #30515 where XML files without a BOM (Byte Order Mark) are not rendered correctly in the Peek preview window, showing replacement characters instead of proper syntax highlighting. The issue occurs because the current encoding detection mechanism fails to properly identify the encoding of XML files that lack a BOM but declare their encoding in the XML declaration (e.g., <?xml version="1.0" encoding="UTF-8"?>).
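A small illustration of the failure mode, assuming a file whose bytes are not valid UTF-8 but which gets decoded as UTF-8 anyway. The sample document and the choice of windows-1252 are made up for this sketch and are not taken from the issue; Python is used only for brevity.

```python
# A BOM-less XML document that declares windows-1252 (hypothetical sample).
xml_text = '<?xml version="1.0" encoding="windows-1252"?><name>café</name>'

# What would be on disk: no BOM, and "é" stored as the single byte 0xE9.
raw = xml_text.encode("cp1252")

# Ignoring the declaration and assuming UTF-8: 0xE9 starts a multi-byte
# sequence that never completes, so the decoder emits U+FFFD instead.
garbled = raw.decode("utf-8", errors="replace")

# Honouring the declared encoding recovers the original text.
correct = raw.decode("cp1252")
```

Here `garbled` contains the replacement character where `é` should be, while `correct` equals the original string.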
Changes:
- Adds a new `XmlEncodingDetector` helper class to detect encoding from XML file declarations when BOM is absent
src/modules/peek/Peek.FilePreviewer/Previewers/Helpers/XmlEncodingDetector.cs
yeelam-gordon (Contributor, Author) commented on Feb 5, 2026
Automated PR Review - Findings (Severity >= Medium)
This PR adds XML encoding detection for Peek. Several medium-severity issues were identified that should be addressed before merging.
| Issue | Severity | Line(s) |
|---|---|---|
| Duplicate functionality - existing CharsetDetector | Medium | All |
| Code not integrated (dead code) | Medium | All |
| Regex ReDoS risk on untrusted input | Medium | 17-18 |
| Single ReadLine may miss split declarations | Medium | 51 |
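Two of the findings above (regex backtracking on untrusted input, and a single `ReadLine` missing a declaration that spans lines) could in principle be addressed by scanning a bounded byte prefix up to the `?>` terminator with plain substring searches. The sketch below is one possible shape for that idea in Python, not the PR's code; `declared_encoding` and `MAX_PROLOG_BYTES` are hypothetical names, and the 1024-byte cap is an assumption.

```python
from typing import Optional

MAX_PROLOG_BYTES = 1024  # assumed cap on how much of the file is inspected


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the declared encoding name, or None if no usable declaration."""
    head = data[:MAX_PROLOG_BYTES]

    # The declaration must open the file; "<?xml-stylesheet" does not count.
    if not head.startswith(b"<?xml") or not head[5:6].isspace():
        return None

    # Search for the terminator instead of reading one line, so a
    # declaration split across several lines is still seen in full.
    end = head.find(b"?>")
    if end == -1:
        return None
    decl = head[5:end]

    key = decl.find(b"encoding")
    if key == -1:
        return None

    # Expect: optional whitespace, "=", optional whitespace, quoted value.
    rest = decl[key + len(b"encoding"):].lstrip()
    if not rest.startswith(b"="):
        return None
    rest = rest[1:].lstrip()

    quote = rest[:1]
    if quote not in (b'"', b"'"):
        return None
    close = rest.find(quote, 1)
    if close == -1:
        return None

    try:
        return rest[1:close].decode("ascii")
    except UnicodeDecodeError:
        return None
```

Because every step is a linear search over at most `MAX_PROLOG_BYTES` bytes, the worst-case cost is bounded regardless of the file's contents.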
Summary of the Pull Request
Fixes Peek displaying garbled characters when previewing XML files that specify encoding in their declaration but lack a BOM (Byte Order Mark). The XML encoding declaration is now parsed and used for correct text rendering.
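One subtlety with BOM-less files: when the file is UTF-16, the declaration itself is stored in two-byte units, so a byte-level search for `encoding` will not find it. Appendix F of the XML 1.0 specification describes sniffing the first bytes of `<?` to pick an encoding family before reading the declaration. The Python sketch below shows that step only; `sniff_family` is a hypothetical name, and whether the PR's C# helper does anything similar is not stated here.

```python
def sniff_family(data: bytes) -> str:
    """Guess the encoding family of BOM-less XML from the bytes of '<?'."""
    head = data[:4]
    if head == b"<\x00?\x00":
        return "utf-16-le"  # 3C 00 3F 00
    if head == b"\x00<\x00?":
        return "utf-16-be"  # 00 3C 00 3F
    # Anything else is treated as ASCII-compatible (UTF-8, Latin-1, ...),
    # where the declaration can be read byte-for-byte.
    return "ascii-compatible"


# Usage: a UTF-16 LE document written without a BOM (hypothetical sample).
sample = '<?xml version="1.0" encoding="UTF-16"?><a/>'.encode("utf-16-le")
family = sniff_family(sample)
prolog = sample[:200].decode(family, errors="replace")
```

After decoding the prefix with the sniffed family, `prolog` contains the readable declaration even though the raw bytes never contain the ASCII sequence `encoding`.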
PR Checklist
Detailed Description of the Pull Request / Additional comments
Problem
XML files without a BOM but with an encoding declaration (e.g., `<?xml version="1.0" encoding="UTF-16"?>`) displayed as garbled text in Peek because the default encoding was used instead of the declared one.
Solution
Added `XmlEncodingDetector.cs` in `src/modules/peek/Peek.FilePreviewer/Previewers/Helpers/` that:
- Parses the `encoding` attribute using regex
- Returns the `System.Text.Encoding` for the declared encoding

Validation Steps Performed