fix: Parse tags correctly in gherkin-32 compatibility mode #400
acoulton merged 3 commits into Behat:master
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #400 +/- ##
============================================
+ Coverage 95.60% 95.66% +0.06%
- Complexity 675 682 +7
============================================
Files 44 45 +1
Lines 1980 2009 +29
============================================
+ Hits 1893 1922 +29
Misses 87 87
$tag = str_replace('@', '', trim($tag));
We no longer need to remove the @ from the tag filter expression, because the tags we're comparing to will now always have their leading @.
$tags = preg_split('/(?=@)/u', $line);
assert($tags !== false);
// Remove the empty content before the first tag prefix
array_shift($tags);

// Note: checking for whitespace in tags is done in the Parser to fit with existing logic
$token['tags'] = array_map(trim(...), $tags);
This broadly follows the logic in cucumber/gherkin except I have used a lookahead regex to save having to add the @ back onto the split parts. It also omits checking for whitespace here as it seemed more logical to put that alongside the existing deprecation in the Parser, at least until we remove that.
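To illustrate the lookahead split described above, here is a minimal standalone sketch. The function name `splitTagLine` is hypothetical and exists only for this example; it is not part of the lexer, and the real code writes into `$token['tags']` rather than returning a list.

```php
<?php

// Hypothetical helper for illustration only; not part of Behat\Gherkin.
function splitTagLine(string $line): array
{
    // Split *before* every "@" with a zero-width lookahead, so each part keeps its prefix.
    $parts = preg_split('/(?=@)/u', trim($line));
    if ($parts === false) {
        throw new RuntimeException('preg_split failed');
    }

    // The first element is whatever precedes the first "@"
    // (an empty string for a well-formed tag line), so discard it.
    array_shift($parts);

    // Trim the separating whitespace left at the end of each part.
    return array_map('trim', $parts);
}
```

With this sketch, `splitTagLine('@smoke @slow')` gives `['@smoke', '@slow']`, while `splitTagLine('@some tag @values')` gives `['@some tag', '@values']`, which is why whitespace inside a tag still has to be checked separately.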
  trigger_error(
-     sprintf('Whitespace in tags is deprecated, found "%s"', $tag),
+     sprintf('Whitespace in tags is deprecated, found "%s" in %s', $tag, $this->file ?? 'unknown file'),
I noticed the deprecation message did not include the name of the file (if available) - I've added that to help users trace any of these before enabling the new mode.
tests/Cucumber/NDJsonAstParser.php
   * @return list<FeatureNode>
   */
- public function load(string $resource): array
+ public function load(string $resource, GherkinCompatibilityMode $compatibilityMode): array
Should we use a setter, as we do for the Parser, to avoid having to pass it around everywhere?
Good plan, that's much cleaner. It took me a moment to notice that although load is an instance method, the rest of the implementation it calls was all in static methods.
In theory for our current usage of the class I could have used a static setter & static property to hold the mode, but that felt like a bit of a hack. So I've converted all the static methods / calls to instance methods (almost all of them are involved in a call chain that ultimately needs to call getTags). That'll make it cleaner / easier to access the mode from other parts of the loader in future if we need to.
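As a rough sketch of the shape this refactor takes (not the actual `NDJsonAstParser` code), the mode is set once on the instance and read by instance methods further down the call chain. The names `CompatMode`, `SketchAstLoader` and `setCompatibilityMode` are assumptions made for this example, and `load` takes a plain list of tags here rather than a resource path.

```php
<?php

// Hypothetical stand-ins for illustration; these are not the real test classes.
enum CompatMode
{
    case Legacy;
    case Gherkin32;
}

final class SketchAstLoader
{
    private CompatMode $mode = CompatMode::Legacy;

    // Configure the mode once instead of threading it through every call.
    public function setCompatibilityMode(CompatMode $mode): void
    {
        $this->mode = $mode;
    }

    /**
     * @param list<string> $rawTags tag names as written in the source, e.g. "@wip"
     *
     * @return list<string>
     */
    public function load(array $rawTags): array
    {
        return $this->getTags($rawTags);
    }

    // An instance method rather than a static one, so it can read $this->mode directly.
    private function getTags(array $rawTags): array
    {
        if ($this->mode === CompatMode::Gherkin32) {
            // New mode: keep tags exactly as written, including the leading "@".
            return $rawTags;
        }

        // Legacy mode: strip the leading "@".
        return array_map(static fn (string $tag): string => ltrim($tag, '@'), $rawTags);
    }
}
```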
6cc9cf2 to 501fbc4
cucumber/gherkin parses tags exactly as they appear in the file, including the leading `@` which we have historically stripped. Make our parsing the same as cucumber's when in gherkin-32 compatibility mode.
When parsing tags, we (and cucumber/gherkin) split on the `@` character. So `@some tag @values` will parse as `['@some tag', '@values']`. This may be unexpected if e.g. a user has simply missed a `@` character. Therefore cucumber/gherkin will throw if a tag contains any whitespace. Historically we have accepted this, although we have triggered a deprecation since v4.9.0. Update the parser to throw if in `gherkin-32` compatibility mode.
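A minimal sketch of that mode-dependent check, for illustration only. The enum `ParseMode`, the function `checkTag` and the exception message are assumptions made for this example, and `InvalidArgumentException` stands in for the parser's own exception type; only the deprecation message format is taken from the diff in this PR.

```php
<?php

// Hypothetical stand-in for the compatibility mode; not the library's enum.
enum ParseMode
{
    case Legacy;
    case Gherkin32;
}

// Hypothetical helper: returns the tag unchanged, complaining if it contains whitespace.
function checkTag(string $tag, ParseMode $mode, ?string $file = null): string
{
    if (preg_match('/\s/u', $tag) === 1) {
        if ($mode === ParseMode::Gherkin32) {
            // New mode: fail hard, as cucumber/gherkin does (message is illustrative).
            throw new InvalidArgumentException(
                sprintf('Tag "%s" contains whitespace in %s', $tag, $file ?? 'unknown file')
            );
        }

        // Legacy mode: keep accepting the tag, but warn the user.
        trigger_error(
            sprintf('Whitespace in tags is deprecated, found "%s" in %s', $tag, $file ?? 'unknown file'),
            E_USER_DEPRECATED
        );
    }

    return $tag;
}
```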
Update the filtering to match tags regardless of the gherkin parsing mode.
501fbc4 to 2972973
When parsing in `gherkin-32` compatibility mode:

- The leading `@` will be retained in the tag value in the parsed nodes (in legacy mode, we remove this).
- The parser will throw a `ParserException` if a tag contains whitespace (in legacy mode, we trigger a deprecation).

To accommodate this, TagFilter will now match tags with or without the leading `@` so that it works as expected in either parsing mode.

This may produce a behavioural change (for users who have opted-in to the new parsing mode) if they are interacting with the tags array directly. There is no change to behaviour in legacy mode.
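One way to get prefix-insensitive matching is to normalise both sides before comparing. The sketch below shows that idea only; `tagMatches` is a hypothetical helper, and the real TagFilter may normalise differently (for example by adding the `@` rather than stripping it).

```php
<?php

// Hypothetical helper for illustration; not TagFilter's actual implementation.
function tagMatches(string $filterTag, array $nodeTags): bool
{
    // Compare on the bare name so "@wip" and "wip" are treated as the same tag.
    $normalise = static fn (string $tag): string => ltrim(trim($tag), '@');

    return in_array($normalise($filterTag), array_map($normalise, $nodeTags), true);
}
```

Under this sketch, `tagMatches('@wip', ['wip'])` and `tagMatches('wip', ['@wip'])` are both true, so the same filter expression works against nodes parsed in either mode.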
Fixes #336.