Description
I am working on a Python implementation of RFC 9535 and I just discovered these compliance tests.
I seem to be failing many of the quoting tests and I'm not sure why. I am able to properly match string literals that represent name selectors and verify that they are properly formatted with regard to embedded quotes and escapes. I know that I must unescape the strings before using them as name selectors. But I am failing tests such as this one from name-selector.json:
{
  "name": "name, double quotes, contains single quote",
  "selector": "$[\"a'\"]",
  "document": {
    "a'": "A",
    "b": "B"
  },
  "result": [
    "A"
  ],
  "result_paths": [
    "$['a\\'']"
  ]
}
Expected :[["$['a\'']"]]
Actual :["$['a'']"]
My question (I think) is about setting up the test comparison. The document represents a string-serialized version of a JSON value. In Python I'm converting the document to a Python dict with the standard library function json.loads(document). This creates the in-memory Python version of the JSON object.
Since the entire JSON text of the test case is processed by json.loads(), the selector is processed as well, and I have verified that the json package unescapes the double quotes. The in-memory version of the selector is now $["a'"], which will match the first key in the document when evaluated.
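To make this step concrete, here is a minimal sketch of what json.loads() does to the test case above. The variable names are just for illustration; the test case text is copied from name-selector.json as quoted earlier.

```python
import json

# Raw string so the backslashes reach json.loads() exactly as they
# appear in the test file.
test_case_text = r'''
{
  "name": "name, double quotes, contains single quote",
  "selector": "$[\"a'\"]",
  "document": {"a'": "A", "b": "B"},
  "result": ["A"],
  "result_paths": ["$['a\\'']"]
}
'''

test_case = json.loads(test_case_text)

# JSON-level escapes are gone after loading.
selector = test_case["selector"]               # $["a'"]
expected_path = test_case["result_paths"][0]   # $['a\'']

print(selector)
print(expected_path)
```

Note that after loading, the expected path still contains one backslash. The JSON escape \\ has been decoded to a single backslash, and what remains is the escape that belongs to the path string itself.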
The in-memory value of the path (in my implementation) is actually just $[a'], because the quotes are not part of the key name. When I want to look up the child node of $ named a', that's the string I use, not a version that includes quotes around it (i.e., not 'a'' nor "a'", just a').
At this stage, the path string does not need to be escaped because it's just an in-memory string and not a string literal. If I had to write this string out to a text file, for example when serializing to JSON text, then I would have to make sure it was properly escaped. But if I wanted to use the path to access the value at that node, I would not want to escape it. Right? If I escaped this string and then looked for it in the document dict, I would not find the key a\' (because the path reference now includes a backslash but the name key of the dict item does not).
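The distinction between the in-memory key and a serialized path can be sketched as follows. This assumes the expected result_paths are meant to be normalized paths in the sense of RFC 9535 Section 2.7 (single-quoted names, with ' and \ escaped); the helpers escape_name and normalized_path are hypothetical names for illustration, not part of any library.

```python
def escape_name(name: str) -> str:
    """Escape a member name for use inside a single-quoted path segment.

    Assumption: follows the normalized path escaping of RFC 9535 Section 2.7.
    """
    escapes = {
        "\b": "\\b", "\f": "\\f", "\n": "\\n", "\r": "\\r", "\t": "\\t",
        "'": "\\'", "\\": "\\\\",
    }
    out = []
    for ch in name:
        if ch in escapes:
            out.append(escapes[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters use lowercase \u00xx escapes.
            out.append(f"\\u{ord(ch):04x}")
        else:
            out.append(ch)
    return "".join(out)


def normalized_path(keys) -> str:
    """Serialize a list of raw keys/indexes (hypothetical in-memory form) as a path string."""
    parts = ["$"]
    for key in keys:
        if isinstance(key, int):
            parts.append(f"[{key}]")
        else:
            parts.append(f"['{escape_name(key)}']")
    return "".join(parts)


document = {"a'": "A", "b": "B"}

key = "a'"                        # raw key: used for lookups, never escaped
value = document[key]             # "A"
path = normalized_path([key])     # serialized form, with the quote escaped

print(value)
print(path)
```

Under that assumption, the raw key is what gets used to index into the document, while the escaped form only appears when the path is written out as a string for comparison or display.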
So, where am I going wrong? Is my result path (Actual above) supposed to include the escapes? If so then a user of this path would have to unescape it before trying to resolve it to access the node it represents. This doesn't seem like it should be the case as that is more work for the user.
Is my issue that the test case is assuming I will re-encode the result path before testing it against the Expected result?
TL;DR: my basic question is, do I have a flaw in my RFC 9535 implementation, or just in the way I am comparing a result path to the expected test case result path?
Thanks!