MNE-BIDS should not include a BOM when writing TSV files #1530

@scott-huberty

From #1528: in #625, MNE-BIDS started writing text files with the encoding "utf-8-sig". However, the BIDS spec only requires "utf-8" for tabular files:

Haha funny, so I initially didn't want to have a BOM but then we decided otherwise :)

I did a bit of digging online and it seems that NOT having the BOM is the "correct" way of doing things. And newer versions of Excel (at least in M365) read non-BOM'd TSV files correctly! In fact, a quick test actually shows that the situation has sort of reversed: not having a BOM is BETTER for Excel, at least when reading TSVs, as can be seen here:

# test.py
#
# /// script
# requires-python = ">=3.14"
# dependencies = [
#     "pandas>=3.0.1",
# ]
# ///
import pandas as pd

data = pd.DataFrame.from_dict({
    "Foo": [1, 2, 3],
    "Bar": [4, 5, 6],
    "Baz": [7, 8, 9]
})

data.to_csv("data.tsv", index=False, encoding="utf-8", sep="\t")
data.to_csv("data-bom.tsv", index=False, encoding="utf-8-sig", sep="\t")

Run via:

uv run test.py

No BOM – loads correctly upon a double-click on the file:
image

With BOM – does not load correctly:
image

So I would say we drop the BOM. How to handle this potentially breaking change (then again, who would really be affected?) is something I leave up to @sappelhoff to decide 😃

Originally posted by @hoechenberger in #1528 (comment)
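
For anyone who wants to check whether their existing TSV files or readers would be affected, here is a minimal sketch using only the Python standard library. The helper names (`has_utf8_bom`, `strip_utf8_bom`, `read_tsv_header`) are hypothetical and not part of MNE-BIDS; this only illustrates the byte-level difference between "utf-8" and "utf-8-sig", not how MNE-BIDS reads or writes files.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file without a leading BOM; return True if one was removed."""
    raw = Path(path).read_bytes()
    if not raw.startswith(codecs.BOM_UTF8):
        return False
    Path(path).write_bytes(raw[len(codecs.BOM_UTF8):])
    return True


def read_tsv_header(path):
    """Read the header row of a TSV file.

    The "utf-8-sig" codec drops a leading BOM if present and otherwise
    behaves like "utf-8", so it decodes both kinds of file.
    """
    with open(path, encoding="utf-8-sig", newline="") as f:
        return f.readline().rstrip("\r\n").split("\t")
```

Reading with "utf-8-sig" is the tolerant choice here: with a plain Python `open(..., encoding="utf-8")`, a BOM'd file would yield a first column name prefixed with `\ufeff`, whereas "utf-8-sig" handles files written either way.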
