Description
The Java and Perl implementations support an encoding pragma to define the encoding of a file (probably assuming an ASCII-compatible encoding so that they can read the pragma as ASCII).
They will then convert the string from that encoding to UTF-8 before creating the Source message and parsing it.
Support for this pragma is not covered by the shared testdata.
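To make the behavior I am describing concrete, here is a minimal Python sketch of how I understand it. It is not the actual Java or Perl code: the function name `read_feature_source`, the exact pragma pattern, and the rule of only scanning leading comment lines are my assumptions for illustration.

```python
import re

# Assumed pragma shape: a leading comment such as "# encoding: iso-8859-1".
ENCODING_PRAGMA = re.compile(r"^\s*#\s*encoding\s*:\s*([0-9A-Za-z-]+)", re.IGNORECASE)


def read_feature_source(raw: bytes) -> str:
    """Return the feature text as a str, honoring an encoding pragma if present."""
    # Latin-1 maps every byte to a code point, so this decode never fails and
    # keeps an ASCII pragma readable for any ASCII-compatible encoding.
    for line in raw.decode("latin-1").splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            break  # assumption: the pragma only appears in leading comment lines
        match = ENCODING_PRAGMA.match(line)
        if match:
            # Re-decode the whole file with the declared encoding.
            return raw.decode(match.group(1))
    # No pragma found: treat the file as UTF-8.
    return raw.decode("utf-8")
```

The resulting string is what would then be placed in the Source message, so the parser itself never sees the original encoding.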
I did not find an implementation of this encoding pragma in other languages. I am confident about the PHP implementation, whose code I reviewed for this, and a GitHub search did not return results corresponding to such a feature in C, C++, Dart, .NET, Elixir, Go, JavaScript, Objective-C, Python or Ruby.
Is this feature meant to be interoperable?
Note that I'm not interested in using the feature myself (I would keep using UTF-8 in my projects anyway). I came here as one of the maintainers of the Behat PHP library. We are currently working on improving our behat/gherkin parser to reach parity, to give us the option to switch to the cucumber/gherkin PHP parser in a future major version of Behat. This case was identified as a parity gap (probably by comparing to the Java implementation), but we might reconsider this given that the PHP parser does not support it: Behat/Gherkin#221