Document that .feature files are assumed to be UTF-8 encoded #398

@stof

The Java and Perl implementations support an encoding pragma to define the encoding of a file (probably assuming an ASCII-compatible encoding so that they can read the pragma as ASCII).
They will then convert the string from that encoding to UTF-8 before creating the Source message and parsing it.
Support for this pragma is not covered by the shared testdata.
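To make the behaviour described above concrete, here is a minimal sketch in Python. It is not taken from the Java or Perl code. It assumes the pragma is a leading comment of the form `# encoding: <name>`, and that it is only honoured before the first non-comment line. The function names `detect_encoding` and `read_feature_as_utf8` are hypothetical.

```python
import re

# Assumption: the pragma is a leading comment such as "# encoding: iso-8859-1".
# It is matched on raw bytes, which only works for ASCII-compatible encodings.
ENCODING_PRAGMA = re.compile(
    rb"^\s*#\s*encoding\s*:\s*([0-9A-Za-z_.\-]+)", re.IGNORECASE
)
COMMENT_OR_BLANK = re.compile(rb"^\s*(#.*)?$")


def detect_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named by a leading pragma comment, else the default."""
    for line in data.splitlines():
        match = ENCODING_PRAGMA.match(line)
        if match:
            return match.group(1).decode("ascii")
        if not COMMENT_OR_BLANK.match(line):
            break  # first non-comment, non-blank line: stop looking
    return default


def read_feature_as_utf8(data: bytes) -> bytes:
    """Decode with the detected encoding and re-encode as UTF-8 for the parser."""
    text = data.decode(detect_encoding(data))
    return text.encode("utf-8")
```

An implementation without pragma support would skip `detect_encoding` and decode the bytes as UTF-8 directly, so the same file would yield different text (or a decoding error) depending on the parser used.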

I did not find an implementation of this encoding pragma in other languages. I am confident about the PHP implementation, whose code I reviewed for this, and a GitHub search returned no results that would correspond to such a feature in C, C++, Dart, .NET, Elixir, Go, JavaScript, Objective-C, Python or Ruby.

Is this feature meant to be interoperable?

Note that I'm not interested in using the feature myself (I would keep using UTF-8 in my projects anyway). I came here as one of the maintainers of the Behat PHP library. We are currently working on improving our behat/gherkin parser to reach parity, to give us the option to switch to the cucumber/gherkin PHP parser in a future major version of Behat. This case was identified as a parity gap (probably by comparing to the Java implementation), but we might reconsider this given that the PHP parser does not support it: Behat/Gherkin#221

Labels: 📖 documentation (Improvements or additions to documentation)
