[feature] Expose eXist serializer extensions through the standard output: namespace #6447

Open
joewiz wants to merge 2 commits into
eXist-db:develop from
joewiz:feature/serialization-output-extensions-core

Conversation

joewiz commented Jun 6, 2026

Member

[This PR was co-authored with Claude Code. -Joe]

Summary

Makes eXist's implementation-specific serialization parameters (EXistOutputKeys: expand-xincludes, highlight-matches, add-exist-id, process-xsl-pi, jsonp, insert-final-newline, etc.) settable uniformly through the standard W3C serialization namespace (http://www.w3.org/2010/xslt-xquery-serialization, prefix output) and as plain map keys, mirroring BaseX. This applies across fn:serialize (both the options map(*) and the output:serialization-parameters element forms), file:serialize, and the declare option output:… prolog.

Until now these extensions could only be set through eXist's own exist: namespace (or the legacy declare option exist:serialize). The most important one, expand-xincludes, was therefore unreachable through the output: mechanisms clients actually use, so an open → edit → save round-trip via fn:serialize would silently expand and destroy <xi:include> elements. This change unblocks safe XInclude-preserving reads in existdb-openapi (/api/db/resource's JSON-envelope path) and makes it possible to close the same latent data-loss exposure in eXide.

This is the first of three PRs; a follow-up covers the REST interface (?_expand-xincludes=…) and a third covers RESTXQ %output:… annotations.

What changed

exist-core/.../xquery/util/SerializerUtils.java:

  • fn:serialize map form — eXist extension parameters are now resolved from a serialization options map under three key forms, in order of preference: the standard output: namespace (Q{…xslt-xquery-serialization}expand-xincludes), a plain string key ("expand-xincludes"), then the legacy exist: namespace (deprecated). Standard and extension parameters can therefore be mixed in a single string-keyed map without special-casing.
  • output:serialization-parameters element form — a child in the output: namespace whose local name is a known eXist extension is now accepted (previously rejected with SEPM0017).
  • The map type-check is relaxed so an xs:string value (e.g. "no") is accepted for an extension parameter, and boolean string parsing is aligned with BaseX (1/true/yes/on, case-insensitive).
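The order of preference for the three key forms can be illustrated with a minimal, self-contained sketch. It is not the code in SerializerUtils: the class and the `resolve` helper are hypothetical, and the options map is simplified to EQName-style string keys, whereas a real XQuery map distinguishes xs:QName keys from xs:string keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class ExtensionKeyLookup {
    static final String OUTPUT_NS = "http://www.w3.org/2010/xslt-xquery-serialization";
    static final String EXIST_NS = "http://exist.sourceforge.net/NS/exist";

    // Hypothetical helper: look up an extension parameter by local name,
    // trying the three key forms in the order of preference listed above.
    static Optional<String> resolve(final Map<String, String> options, final String localName) {
        final String[] candidates = {
            "Q{" + OUTPUT_NS + "}" + localName, // 1. standard output: namespace
            localName,                           // 2. plain string key
            "Q{" + EXIST_NS + "}" + localName    // 3. legacy exist: namespace (deprecated)
        };
        for (final String key : candidates) {
            if (options.containsKey(key)) {
                return Optional.ofNullable(options.get(key));
            }
        }
        return Optional.empty();
    }

    public static void main(final String[] args) {
        // Standard and extension parameters mixed in one string-keyed map.
        final Map<String, String> options = new LinkedHashMap<>();
        options.put("method", "xml");
        options.put("expand-xincludes", "no");
        options.put("Q{" + EXIST_NS + "}expand-xincludes", "yes");
        System.out.println(resolve(options, "expand-xincludes").orElse("(unset)"));
    }
}
```

Here the plain string key outranks the legacy exist: key, so the sketch prints `no`.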

The declare option output:… prolog path already threaded any output:-namespaced option through to the result serializer, so it needed no code change — it is now covered by tests.

Backward compatibility

  • The legacy exist: namespace forms (map QName keys, exist:-namespaced element children) and declare option exist:serialize remain fully supported, now marked deprecated in favor of the uniform output: form (in preparation for a future major-version removal).
  • With no serialization parameters supplied, output is byte-identical to before across every facility — the conf.xml <serializer> defaults are unchanged. Verified: a no-parameter fn:serialize still expands XIncludes by default, exactly as today.

Namespace policy (following BaseX)

The W3C spec asks implementations to define non-official parameters in a non-null namespace. BaseX deliberately waives this and accepts its extensions in the standard output: namespace for convenience across the many ways parameters are supplied; this PR follows BaseX. eXist's own exist: namespace remains accepted (deprecated), so nothing that relied on it breaks.

Test plan

  • serialize.xql (XQSuite) — 12 new cases: output: map QName key, plain string key (boolean and string value), output:serialization-parameters element child, a mixed standard + extension map asserted against exact expected output (proving method + indent + omit-xml-declaration + expand-xincludes all take effect together), add-exist-id via a plain key, and a default-expands guard. All existing cases still pass (131 total).
  • SerializationTest (JUnit, local + remote XML:DB) — declare option output:expand-xincludes yes/no on a stored XInclude document. Full class green (22).
  • file module serialize.xqm — an eXist extension (insert-final-newline) supplied through the output: namespace in file:serialize. Full FileTests green (54).
  • Confirmed on clean develop that none of the new forms worked (the element form even errored with SEPM0017); with this change all are honored and the default is unchanged.

Spec references

Downstream (unblocked by this PR)

…put: namespace

eXist's implementation-specific serialization parameters (EXistOutputKeys:
expand-xincludes, highlight-matches, add-exist-id, process-xsl-pi, jsonp, etc.)
could previously be set only through the eXist-specific exist: namespace (and the
legacy `declare option exist:serialize`). The highest-value of these,
expand-xincludes, was therefore unreachable through the normal output: mechanisms
used by clients, so an open->edit->save round-trip through fn:serialize could
silently expand and destroy `<xi:include>` elements.

Mirroring BaseX, make the eXist extensions settable uniformly through the standard
W3C serialization namespace (http://www.w3.org/2010/xslt-xquery-serialization) and
as plain map keys, alongside the W3C parameters, across the fn:serialize options
map and output:serialization-parameters element forms, file:serialize, and the
`declare option output:...` prolog path. Standard and extension parameters can now
be mixed in a single string-keyed map (e.g. method + indent + omit-xml-declaration
+ expand-xincludes) without special-casing.

The legacy exist: namespace forms (map QName keys, exist:-namespaced element
children, and `declare option exist:serialize`) remain supported for backward
compatibility, now marked deprecated in favor of the uniform output: form.

With no serialization parameters supplied, output is byte-identical to before
across every facility (the conf.xml <serializer> defaults are unchanged).

Changes:
- SerializerUtils: accept eXist extension parameters by local name in the W3C
  output: namespace and as plain string map keys; relax the map type check so an
  xs:string value (e.g. "no") is accepted for extension parameters; align boolean
  string parsing with BaseX (1/true/yes/on, case-insensitive).
- The `declare option output:...` prolog path already threaded any output:-namespaced
  option through to the result serializer, so it required no change; covered by tests.
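
For illustration only, the lenient boolean parsing described in the first item might look like the sketch below. The class and method names are hypothetical, and treating every other string as false is an assumption: the actual code may recognise an explicit set of false strings or raise an error instead.

```java
import java.util.Locale;
import java.util.Set;

public class LenientBoolean {
    // Strings treated as true, per the list in the change description.
    private static final Set<String> TRUE_VALUES = Set.of("1", "true", "yes", "on");

    // Assumption: anything outside TRUE_VALUES (including null) counts as false.
    static boolean parse(final String value) {
        return value != null && TRUE_VALUES.contains(value.toLowerCase(Locale.ROOT));
    }

    public static void main(final String[] args) {
        for (final String s : new String[] {"yes", "ON", "True", "1", "no", "0"}) {
            System.out.println(s + " -> " + parse(s));
        }
    }
}
```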

Tests:
- serialize.xql: output: map QName key, plain string key (boolean and string value),
  output:serialization-parameters element child, a mixed standard+extension map with
  exact expected output, add-exist-id via plain key, and a default-expands guard.
- SerializationTest: `declare option output:expand-xincludes` yes/no on a stored
  XInclude document (local and remote XML:DB).
- file module serialize.xqm: an eXist extension (insert-final-newline) via the
  output: namespace in file:serialize.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +90 to +95
private static final Set<String> ExistParameterConventionLocalNames = new HashSet<>();
static {
    for (final ExistParameterConvention existParameterConvention : ExistParameterConvention.values()) {
        ExistParameterConventionLocalNames.add(existParameterConvention.getLocalParameterName());
    }
}

Member

Using stream(), map() and collect() will yield an unmodifiable set instead of a modifiable HashSet instance.

Suggested change
- private static final Set<String> ExistParameterConventionLocalNames = new HashSet<>();
- static {
-     for (final ExistParameterConvention existParameterConvention : ExistParameterConvention.values()) {
-         ExistParameterConventionLocalNames.add(existParameterConvention.getLocalParameterName());
-     }
- }
+ private static final Set<String> ExistParameterConventionLocalNames = ExistParameterConvention.values()
+         .stream()
+         .map(ExistParameterConvention::getLocalParameterName)
+         .collect(Collectors.toUnmodifiableSet());

Also, the variable name should be all uppercase, per the Java naming convention for static constants.

Member Author

[This response was co-authored with Claude Code. -Joe]

Thanks — done in 7c6ccf9. The constant is now built with stream().map(...).collect(Collectors.toUnmodifiableSet()) and renamed to EXIST_PARAMETER_CONVENTION_LOCAL_NAMES.

One small deviation from the suggestion: ExistParameterConvention.values() returns an array, which has no .stream(), so I used Arrays.stream(ExistParameterConvention.values()). Result is the same unmodifiable set.
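
For readers following along, here is a minimal standalone sketch of that pattern. DemoConvention is a hypothetical two-entry stand-in for ExistParameterConvention, and the class name is made up; only Arrays.stream, map and Collectors.toUnmodifiableSet() (Java 10+) mirror what the reply describes.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

public class ConventionNames {
    // Hypothetical stand-in for ExistParameterConvention with only two entries.
    enum DemoConvention {
        EXPAND_XINCLUDES("expand-xincludes"),
        ADD_EXIST_ID("add-exist-id");

        private final String localParameterName;

        DemoConvention(final String localParameterName) {
            this.localParameterName = localParameterName;
        }

        String getLocalParameterName() {
            return localParameterName;
        }
    }

    // values() returns an array, so Arrays.stream(...) is needed to get a Stream.
    static final Set<String> LOCAL_NAMES = Arrays.stream(DemoConvention.values())
            .map(DemoConvention::getLocalParameterName)
            .collect(Collectors.toUnmodifiableSet());

    public static void main(final String[] args) {
        System.out.println(LOCAL_NAMES.contains("expand-xincludes"));
        try {
            LOCAL_NAMES.add("jsonp"); // rejected: the set cannot be modified
        } catch (final UnsupportedOperationException e) {
            System.out.println("set is unmodifiable");
        }
    }
}
```

Running it prints `true` followed by `set is unmodifiable`, since any attempt to add to the collected set throws UnsupportedOperationException.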

…iable set

Address review feedback on eXist-db#6447: use a stream + Collectors.toUnmodifiableSet()
instead of populating a mutable HashSet in a static initializer, and rename the
constant to UPPER_SNAKE_CASE per Java naming conventions for static finals.

Note: ExistParameterConvention.values() returns an array, so Arrays.stream(...)
is used rather than .values().stream() from the review suggestion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2 participants