Description
I have started writing a collection of schemas that violate some part of JSON Schema's specifications or are in some way vexatious to an implementation. This is a topic that has come up a number of times over the years, and again in the past couple of days. My work on this is at a very early stage: I am just starting to integrate it with my own implementation's tests, but it is fleshed out enough to share for discussion.
I have a readme started here: https://github.com/notEthan/JSON-Schema-Vex-Suite/pull/1/files
The few schemas I've put together so far are here: https://github.com/notEthan/JSON-Schema-Vex-Suite/pull/2/files
and a schema describing those files here: https://github.com/notEthan/JSON-Schema-Vex-Suite/pull/3/files
My goal is to have at least one nonconformant schema for every MUST/MUST NOT that applies to schemas in the specs my own implementation supports. Other schemas will also fit: where a spec states that behavior is undefined or implementation-defined, or where behavior is not forbidden but also not specified (e.g. `pattern: "("`, `$ref: "/nowhere"`).
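To illustrate why a schema like `pattern: "("` is vexatious, here is a minimal Python sketch. The value is a string, so it passes any type requirement on the keyword, but it fails to compile as a regular expression. The helper name `pattern_compiles` is hypothetical, and Python's `re` dialect is only a stand-in for whatever regular expression engine an implementation actually uses.

```python
import re


def pattern_compiles(schema):
    """Return True if the schema's `pattern` (if present) compiles with Python's `re`.

    Assumption: `schema` is a dict and `pattern`, when present, is a string.
    """
    pattern = schema.get("pattern")
    if pattern is None:
        return True
    try:
        re.compile(pattern)
    except re.error:
        # A string, but not a usable regular expression.
        return False
    return True


vexing = {"pattern": "("}
print(pattern_compiles(vexing))
```

What an implementation should do at this point (raise when the schema is loaded, raise when validating, or ignore the keyword) is exactly the kind of behavior such a schema is meant to probe.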
This may be something that could be adopted by json-schema-org, or integrated into JSON-Schema-Test-Suite. It generally shares motivation and principles with the test suite, but I made no attempt to follow the format or structure here. Others will have opinions on decisions I made writing this collection; I'll describe decisions that seem relevant.
I wrote it in YAML, not JSON. As a human I find it nicer to read and SO much nicer to write. Most people whose opinions I'm aware of agree, but that is certainly not universal. I'd prefer to keep it YAML, but it's not a hill I'd die on. There will be an automated check that all of the YAML is limited to JSON's data model.
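As a rough idea of what such a check could look like, the sketch below walks an already-parsed document and reports anything outside JSON's data model: non-string mapping keys, non-finite numbers, and types such as dates or sets that a YAML loader can produce. It is written in Python purely for illustration; the function name `json_model_violations` is hypothetical, and parsing the YAML itself is assumed to happen elsewhere.

```python
import math


def json_model_violations(value, path="$"):
    """Yield (path, reason) for each part of `value` outside JSON's data model."""
    if value is None or isinstance(value, (bool, str, int)):
        return
    if isinstance(value, float):
        # JSON has no representation for NaN or infinities.
        if not math.isfinite(value):
            yield path, "non-finite number"
        return
    if isinstance(value, list):
        for index, item in enumerate(value):
            yield from json_model_violations(item, f"{path}[{index}]")
        return
    if isinstance(value, dict):
        for key, item in value.items():
            # JSON object property names are always strings.
            if not isinstance(key, str):
                yield path, f"non-string key {key!r}"
            yield from json_model_violations(item, f"{path}[{key!r}]")
        return
    # Anything else (dates, sets, bytes, ...) has no JSON equivalent.
    yield path, f"unsupported type {type(value).__name__}"
```

An empty result means the document could be round-tripped through JSON without loss. This sketch does not guard against cyclic structures, which YAML anchors and aliases can create.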
The structure is pretty different from JSON-Schema-Test-Suite. I have, at the moment:
- One flat directory, no per-spec/draft directories. Each file indicates which spec drafts its cases apply to. This eliminates a lot of the repetition in JSON-Schema-Test-Suite.
- Each file is intended to express restrictions on one thing (e.g. the value of a schema's id), which may have multiple directives (e.g. The value for this keyword MUST be a string, and MUST be a valid URI). These directives are stated in the file, for humans to understand.
- Each file has a number of schemas which violate these directives.
- The file lists each specification which states these directives - the listed schemas are not conformant to these specifications. This is inside an array called `cases`; when the directives vary by specification (e.g. the keyword being `id` or `$id`), the `cases` array allows multiple combinations of specifications and applicable schemas.
- Nonconformant schemas are uniquely identified by their property name in the `schemas` object. This is a dot-separated string, with each part indicating some scope, broadest to narrowest. This part feels a little underdeveloped, but the intention is that implementations can filter using these property names to group similar behaviors. e.g. `keyword.$schema.type.null` and `keyword.$schema.type.integer` uniquely identify schemas, but `keyword.$schema.type.*` would likely all be handled the same.
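The filtering idea from the list above can be sketched as follows, in Python for illustration only. Only `cases`, `schemas`, and the two dotted names come from the description; the `specifications` field name, the schema bodies, and the `select_schemas` helper are assumptions made for the example and may not match the actual files.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for one parsed file; the real layout may differ.
vex_file = {
    "cases": [
        {
            # Assumed field name: the specs these schemas do not conform to.
            "specifications": ["http://json-schema.org/draft-04/schema#"],
            "schemas": {
                # Illustrative bodies: `$schema` values that are not strings.
                "keyword.$schema.type.null": {"$schema": None},
                "keyword.$schema.type.integer": {"$schema": 0},
            },
        },
    ],
}


def select_schemas(vex_file, spec, name_pattern):
    """Return {name: schema} for cases listing `spec` whose name matches the glob."""
    selected = {}
    for case in vex_file["cases"]:
        if spec not in case["specifications"]:
            continue
        for name, schema in case["schemas"].items():
            if fnmatchcase(name, name_pattern):
                selected[name] = schema
    return selected


draft4 = "http://json-schema.org/draft-04/schema#"
for name in select_schemas(vex_file, draft4, "keyword.$schema.type.*"):
    print(name)
```

Here `*` is a shell-style wildcard that also matches across dots, so a broader pattern such as `keyword.*` would select every keyword-scoped schema. Whether the names should support this kind of matching, or something stricter per segment, is an open design question.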
I am interested in feedback and thoughts on how people see this in relation to the JSON Schema organization.