Test: code point escape #145
Open
jitsedesmet wants to merge 6 commits into
- Use codePointAt() instead of charCodeAt() per unicorn/prefer-code-point
- Move inline comments to separate lines per line-comment-position rule
- Shorten long test line per max-len rule
- Add unterminated short-string test to maintain 100% branch coverage

Note: The 5 failing W3C live tests (codepoint-esc-01/02/06/07/08) are old positive tests that PR #346 explicitly removes from the manifest. Implementing PR #346's restriction on codepoint escape placement is inherently incompatible with those old tests; they will be dropped when the PR is merged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
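As a small illustration of why the `unicorn/prefer-code-point` rule matters for code dealing with escapes (this snippet is not taken from the PR): `charCodeAt()` returns a single UTF-16 code unit, so for characters outside the Basic Multilingual Plane it yields only the high surrogate, while `codePointAt()` returns the full code point.

```typescript
// Illustrative only; variable names are made up for this example.
// U+1F600 lies outside the BMP, so JavaScript stores it as a surrogate pair.
const grinning: string = String.fromCodePoint(0x1F600);

// charCodeAt() sees only the first UTF-16 code unit (the high surrogate).
const codeUnit: number = grinning.charCodeAt(0);

// codePointAt() combines the surrogate pair into the real code point.
const codePoint: number | undefined = grinning.codePointAt(0);

console.log(codeUnit.toString(16)); // d83d
console.log(codePoint?.toString(16)); // 1f600
```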
Per W3C PR #383/#384, UCHAR escapes (\uXXXX / \UXXXXXXXX) are no longer processed by a global query pre-processor. Instead they are handled at the grammar level, inside string literal and IRI reference tokens only.

Changes:
- Add UCHAR-aware lexer tokens (iriRef, stringLiteral1/2, long1/2) to sparql12LexerBuilder that replace the 1.1 variants
- Add codepointEscape() to SparqlContext; default implementation (sparql12CodepointEscape) rejects all surrogate code points (U+D800–U+DFFF), including surrogate pairs
- Override string grammar rule with two-pass decode: UCHAR first, then ECHAR; this correctly rejects \\u0041 (two backslashes → \A → invalid ECHAR) as required by codepoint-esc-bad-03
- Override iriFull grammar rule to apply codepointEscape to IRI content
- Remove queryPreProcessor from Parser.ts; patch string and iriFull rules
- Add lexResult.errors check in parserBuilder so queries with bare backslashes outside strings/IRIs throw rather than silently recover
- Fix comment token pattern (no required trailing newline) so queries ending with a comment but no newline are accepted
- Skip 5 dawgt:Proposed W3C tests that encode pre-PR-#383 behaviour and contradict the new grammar-level restriction
- Regenerate source-tracked AST snapshots for codepoint-esc-05/06/07 (UCHAR sequences now remain unexpanded in lexed tokens)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add two negative test cases that exercise the error paths introduced by the two-pass UCHAR+ECHAR string decode in the SPARQL 1.2 grammar:
- codepoint-esc-05-bad: \u005C decodes to \ (backslash), leaving a trailing unpaired \ at the end of the string literal → error
- codepoint-esc-06-bad: \u005Cx decodes to \x, where x is not a valid ECHAR character → error

Both paths were previously uncovered; coverage is back to 100%.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
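The two-pass decode described in the commit messages above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `decodeUchar`, `decodeEchar`, and `decodeStringLiteral` are hypothetical, and the function operates on the body of a string literal with the surrounding quotes already stripped.

```typescript
// Illustrative sketch only: helper names are hypothetical, not the PR's API.

// ECHAR escapes permitted after a backslash in a SPARQL string literal.
const ECHAR_MAP = new Map<string, string>([
  ['t', '\t'], ['b', '\b'], ['n', '\n'], ['r', '\r'], ['f', '\f'],
  ['"', '"'], ["'", "'"], ['\\', '\\'],
]);

// Pass 1: expand \uXXXX and \UXXXXXXXX, rejecting surrogates (U+D800 to U+DFFF).
function decodeUchar(input: string): string {
  return input.replace(
    /\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})/g,
    (_match: string, short: string | undefined, long: string | undefined): string => {
      const codePoint = parseInt(short ?? long ?? '', 16);
      if (codePoint >= 0xD800 && codePoint <= 0xDFFF) {
        throw new Error(`Surrogate code point U+${codePoint.toString(16).toUpperCase()} is not allowed`);
      }
      if (codePoint > 0x10FFFF) {
        throw new Error('Code point is outside the Unicode range');
      }
      return String.fromCodePoint(codePoint);
    },
  );
}

// Pass 2: resolve ECHAR escapes; any other backslash sequence is an error.
function decodeEchar(input: string): string {
  let out = '';
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i);
    if (ch !== '\\') {
      out += ch;
      continue;
    }
    if (i + 1 >= input.length) {
      throw new Error('Trailing unpaired backslash in string literal');
    }
    const next = input.charAt(i + 1);
    const mapped = ECHAR_MAP.get(next);
    if (mapped === undefined) {
      throw new Error(`Invalid escape sequence \\${next}`);
    }
    out += mapped;
    // Skip the character consumed by the escape.
    i++;
  }
  return out;
}

// UCHAR first, then ECHAR, as the commit message describes.
function decodeStringLiteral(body: string): string {
  return decodeEchar(decodeUchar(body));
}
```

Under these assumptions, a body of `\u0041` decodes to `A`, while `\\u0041`, `\u005C`, and `\u005Cx` all throw in the second pass, which matches the error cases listed in the commits. A surrogate such as `\uD83D` already throws in the first pass.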
Member
Author
@rubensworks Would it make sense to release this as a new minor version? It might 'break' some systems on SPARQL 1.2 (in the sense that the escaping has changed). I am hesitant to make it a major version because our API, in a way, did not break. We promise to implement a SPARQL 1.2 parser, but that parser is not 'stable' since the spec is not final...
Member
I wouldn't worry too much about it. I'd just patch it. It's a relatively small change in any case, and only for 1.2. (If it changed something in 1.1, that would be something else IMO.)

I just skimmed the diff, and some tests are skipped in the package.json. Intentional?
Member
Author
That should be removed; it was needed before the spec test PR was merged.
Add tests added in: w3c/rdf-tests#346
Following updated spec: w3c/sparql-query#383 and w3c/sparql-query#384