feat(rest): expose canonicalization endpoints #8336
Add two new ways to retrieve canonicalized schema content:
1. POST /apis/registry/v3/system/canonicalize?artifactType=<type>
Accepts arbitrary schema content and returns its canonicalized form.
Supports all artifact types (JSON, OpenAPI, Avro, Protobuf, etc.).
2. GET .../versions/{versionExpression}/content?canonical=true
New query parameter on the existing content retrieval endpoint that
returns stored artifact content in its canonical form.
Closes #2473
Signed-off-by: Carles Arnal <carnalca@redhat.com>
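As a rough illustration of the two calls described in this commit message, the sketch below builds (but does not send) both requests with the JDK HTTP client. The base URL, group ID, artifact ID, and version are placeholder assumptions, and the POST path is the one named in this commit message; a later commit in this thread moves it to /content/canonicalize.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class CanonicalizeRequests {
    // Assumed local deployment; replace with your registry's base URL.
    static final String BASE = "http://localhost:8080/apis/registry/v3";

    /** Builds the POST that submits raw schema content for canonicalization. */
    static HttpRequest canonicalizeRequest(String artifactType, String content) {
        String type = URLEncoder.encode(artifactType, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE + "/system/canonicalize?artifactType=" + type))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(content, StandardCharsets.UTF_8))
                .build();
    }

    /** Builds the GET that asks for stored content in canonical form. */
    static HttpRequest canonicalContentRequest(String groupId, String artifactId, String version) {
        // Path segments are assumed to be URL-safe already; no escaping is done here.
        String path = "/groups/" + groupId + "/artifacts/" + artifactId
                + "/versions/" + version + "/content?canonical=true";
        return HttpRequest.newBuilder().uri(URI.create(BASE + path)).GET().build();
    }

    public static void main(String[] args) {
        HttpRequest post = canonicalizeRequest("JSON", "{\"b\":1,\"a\":2}");
        HttpRequest get = canonicalContentRequest("default", "my-schema", "1.0.0");
        System.out.println(post.method() + " " + post.uri());
        System.out.println(get.method() + " " + get.uri());
    }
}
```

Either request could then be passed to `HttpClient.newHttpClient().send(...)` against a running registry.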
PR auto-accepted (trusted author). Full test suite will run. A maintainer can use
The test suite was cancelled for commit c6cfd78. See the workflow run. Use
The test suite failed for commit c6cfd78. @carlesarnal, please check the workflow run and push a fix.
@EricWittmann is there a better place than system for an endpoint that takes content and returns it canonicalized? admin? Anything else?
This PR has been inactive for 7 days. @carlesarnal, please update or comment
I don't think I would use
I think the right home for this is
Map<String, TypedContent> resolvedReferences = RegistryContentUtils
        .recursivelyResolveReferences(artifactCell.get().getReferences(),
                storage::getContentByReference);
contentToReturn = contentUtils.canonicalizeContent(metaData.getArtifactType(),
Canonicalization runs after handleContentReferences() here — so the canonical output depends on the references query parameter. Do we need a test that exercises both at the same time (e.g. ?references=DEREFERENCE&canonical=true) to make sure the interaction produces the expected result?
@Override
@Authorized(style = AuthorizedStyle.None, level = AuthorizedLevel.Read)
public Response canonicalizeContent(String artifactType, InputStream data) {
There is no validation that artifactType is a known type before passing it to canonicalizeContent(). If someone sends artifactType=GARBAGE, the behavior depends on how the artifact type provider lookup handles unknown types. Should we validate early and return a clear 400?
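One way to picture the early check suggested above is the minimal sketch below. The fixed KNOWN_TYPES set and the BadRequest exception are hypothetical stand-ins: a real implementation would presumably consult the registry's artifact type providers and throw the JAX-RS BadRequestException already used in this endpoint.

```java
import java.util.Set;

final class ArtifactTypeValidation {
    /** Hypothetical stand-in for an exception that maps to HTTP 400. */
    static final class BadRequest extends RuntimeException {
        BadRequest(String message) {
            super(message);
        }
    }

    // Assumed, illustrative list; not read from the registry's real providers.
    static final Set<String> KNOWN_TYPES = Set.of("JSON", "OPENAPI", "AVRO", "PROTOBUF");

    /** Returns the type unchanged if it is known, otherwise fails fast. */
    static String requireKnownType(String artifactType) {
        // The null check comes first because Set.of(...).contains(null) throws.
        if (artifactType == null || !KNOWN_TYPES.contains(artifactType)) {
            throw new BadRequest("Unknown artifact type: " + artifactType);
        }
        return artifactType;
    }

    public static void main(String[] args) {
        System.out.println(requireKnownType("AVRO"));
        try {
            requireKnownType("GARBAGE");
        } catch (BadRequest e) {
            System.out.println("400: " + e.getMessage());
        }
    }
}
```

Failing before any content is read keeps the error message independent of how the provider lookup treats unknown types.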
ContentHandle content = ContentHandle.create(data);
if (content.bytes().length == 0) {
    throw new BadRequestException("Empty content is not allowed.");
}
Using @Context HttpServletRequest to get the content type feels inconsistent with the rest of the codebase — other endpoints get it from the generated JAX-RS interface (e.g. @HeaderParam or @Consumes). This also couples the implementation to the Servlet API. Could the content type come through the generated interface instead?
if (references == null) {
    // Check if admin has configured a default reference handling behavior
    java.util.Optional<String> configuredDefault = restConfig.getDefaultReferenceHandling();
A few existing comments were removed in this method (the "Check if admin has configured..." comment here, the "No configuration..." comment below, and the "Throw 404..." comment further down). Was this intentional cleanup or accidental?
}
@Test
public void testCanonicalizeAvroContent() {
The OpenAPI and Avro tests only assert that certain fields exist in the response (e.g. components is not null, openapi equals 3.0.2) — they don't verify that canonicalization actually happened (e.g. keys are sorted, whitespace is normalized). The JSON test does this well with an exact string comparison. Should these tests have similarly strong assertions to confirm the output is actually in canonical form?
Yes, they must do it.
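To make the difference between the two assertion styles concrete, here is a toy sketch. The canonicalize method below only sorts the keys of a flat string map and is purely illustrative; it is not the registry's canonicalizer, and real Avro or OpenAPI canonical forms follow their own rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class CanonicalAssertionDemo {
    /** Renders a flat string map as compact JSON in iteration order (no escaping). */
    static String render(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
                .collect(Collectors.joining(",", "{", "}"));
    }

    /** Toy canonicalizer: sorts keys alphabetically before rendering. */
    static String canonicalize(Map<String, String> fields) {
        return render(new TreeMap<>(fields));
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("type", "record");
        schema.put("name", "User");

        String raw = render(schema);
        String canonical = canonicalize(schema);
        String expected = "{\"name\":\"User\",\"type\":\"record\"}";

        // Weak check: a field-presence assertion passes for both forms.
        System.out.println(raw.contains("\"name\"") && canonical.contains("\"name\"")); // true
        // Strong check: only the canonical form equals the expected string.
        System.out.println(canonical.equals(expected)); // true
        System.out.println(raw.equals(expected));       // false
    }
}
```

A field-presence assertion would still pass if the endpoint returned its input untouched, which is why the exact comparison is the stronger test.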
- Validate artifactType early and return 400 for unknown types
- Replace @Context HttpServletRequest with @Context HttpHeaders (JAX-RS)
- Strengthen OpenAPI and Avro test assertions with exact string comparison
- Add test for unknown artifact type returning 400
- Add test for references + canonical query param interaction
Signed-off-by: Carles Arnal <carnalca@redhat.com>
Move the canonicalize endpoint to POST /content/canonicalize as suggested in review; it's a stateless content utility that fits alongside the existing /content/references endpoint.
- Move implementation from SystemResourceImpl to ContentResourceImpl
- Move tests from SystemResourceTest to ContentResourceTest
- Restore SystemResourceImpl to its original state
- Regenerate Go SDK
Signed-off-by: Carles Arnal <carnalca@redhat.com>
Verify — ✅ passed (run)
Summary
Expose schema canonicalization as a first-class REST API feature. Users can now retrieve the canonical form of schemas for comparison purposes — both for arbitrary content and for stored artifact versions.
Fixes #2473
Root Cause
Canonicalization was used only internally for content hash matching (e.g. the canonical query parameter on POST /groups/{groupId}/artifacts for ifExists logic). There was no way to GET the canonicalized form of a schema, which users need for comparing schemas produced by different tools or environments.
Changes
- common/src/main/resources/META-INF/openapi.json: Added POST /system/canonicalize endpoint with required artifactType query parameter. Added canonical boolean query parameter to GET /groups/{groupId}/artifacts/{artifactId}/versions/{versionExpression}/content.
- app/.../rest/v3/impl/SystemResourceImpl.java: Implemented canonicalizeContent(): reads the request body, delegates to RegistryStorageContentUtils.canonicalizeContent(), and returns the canonical form. Auth level: Read.
- app/.../rest/v3/impl/GroupsResourceImpl.java: Extended getArtifactVersionContent() with the new Boolean canonical parameter. When true, resolves references from storage and canonicalizes the content before returning it.
- app/src/test/.../noprofile/rest/v3/SystemResourceTest.java: Added 6 new tests covering JSON, OpenAPI, and Avro canonicalization; empty body error case; stored content canonical retrieval; and a regression test confirming non-canonical retrieval is unchanged.
Test plan
- POST /system/canonicalize with JSON content (keys reordered)
- POST /system/canonicalize with OpenAPI content
- POST /system/canonicalize with Avro content
- POST /system/canonicalize with empty body → 400
- GET .../content?canonical=true for stored JSON artifact
- GET .../content without canonical param (regression: unchanged behavior)