fix: preserve BOM for UTF-8 encoded files by michaelhthomas · Pull Request #31 · dprint/dprint-plugin-roslyn

michaelhthomas · 2026-06-08T18:38:18Z

Issue

When formatting a file with a UTF-8 BOM, the plugin stripped it. This was causing a large number of diffs in my organization's repository, as it seems Visual Studio defaults to outputting files in UTF-8 with a BOM (for no apparent reason). This PR attempts to address this and preserve the same encoding as the input.

The root cause was in how SourceText was serialized back to bytes: SourceText.From(stream, encoding: null) correctly detected the BOM and set sourceText.Encoding to a BOM-emitting UTF8Encoding, but the formatters discarded that information by calling Encoding.GetBytes(string), which never writes a preamble regardless of the encoding's configuration (see the remarks section here for more information).

Fix

There are a few changes here, not all of which are strictly necessary:

I decided to move where text encoding is performed, from the individual formatters and into CodeFormatters. So ICodeFormatter.FormatText now returns SourceText instead of byte[]. If there's any particular opposition to this, I'm happy to do it differently.
Added a new extension SourceTextExtensions.GetBytes() which serializes a SourceText to bytes using StreamWriter(stream, sourceText.Encoding). Unlike Encoding.GetBytes(string), StreamWriter writes the encoding's preamble (the BOM) at the start of the stream when the encoding specifies one.
CodeFormatters.FormatCode now calls result.GetBytes() on the returned SourceText to get the final byte output.

The result is that files with a UTF-8 BOM format have the BOM preserved, while files without one are unaffected.

michaelhthomas · 2026-06-08T18:40:14Z

I'm also willing to make this behavior configurable, #22 seems to indicate that it might be desired behavior in some cases.

fix: preserve BOM for UTF-8 encoded files

ee9aeae

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: preserve BOM for UTF-8 encoded files#31

fix: preserve BOM for UTF-8 encoded files#31
michaelhthomas wants to merge 1 commit into
dprint:mainfrom
michaelhthomas:fix/bom

michaelhthomas commented Jun 8, 2026

Uh oh!

michaelhthomas commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

michaelhthomas commented Jun 8, 2026

Uh oh!

michaelhthomas commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant