Document that `.feature` files are assumed to be UTF-8 encoded

The Java and Perl implementations support an encoding pragma to define the encoding of a file (probably assuming an ASCII-compatible encoding so that they can read the pragma as ASCII).
They will then convert the string from that encoding to UTF-8 before creating the Source message and parsing it.
Support for this pragma is not covered by the shared testdata.
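
To make the behaviour described above concrete, here is a minimal Python sketch of it. It is not taken from the Java or Perl code: the function name `read_feature_source`, the exact pragma pattern (a comment line such as `# encoding: iso-8859-1`), and the UTF-8 fallback are all assumptions made for illustration.

```python
import re

# Assumed pragma shape: a comment line like "# encoding: iso-8859-1".
# The pattern is matched against raw bytes, which only works if the file
# uses an ASCII-compatible encoding (the assumption mentioned above).
ENCODING_PRAGMA = re.compile(
    rb"^[ \t]*#[ \t]*encoding[ \t]*:[ \t]*([0-9A-Za-z_-]+)",
    re.IGNORECASE | re.MULTILINE,
)


def read_feature_source(raw: bytes, default: str = "utf-8") -> str:
    """Decode the raw bytes of a .feature file into text.

    Hypothetical helper: uses the encoding named by the pragma if one is
    found, otherwise falls back to `default`.
    """
    match = ENCODING_PRAGMA.search(raw)
    encoding = match.group(1).decode("ascii") if match else default
    return raw.decode(encoding)
```

An implementation without pragma support would effectively always take the `default` branch, which is the behaviour this issue proposes to document.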

I did not find an implementation of this encoding pragma in other languages. I'm confident about the PHP implementation, whose code I reviewed for this, and a GitHub search did not turn up anything corresponding to such a feature in C, C++, Dart, .NET, Elixir, Go, JavaScript, Objective-C, Python or Ruby.

Is this feature meant to be interoperable?

Note that I'm not interested in using the feature myself (I would keep using UTF-8 in my projects anyway). I came here as one of the maintainers of the Behat PHP library. We are currently working on improving our `behat/gherkin` parser to reach parity, to give us the option to switch to the `cucumber/gherkin` PHP parser in a future major version of Behat. This case was identified as a parity gap (probably by comparing to the Java implementation), but we might reconsider this given that the PHP parser does not support it: https://github.com/Behat/Gherkin/issues/221
