feat: schema based XML #1047

Open
dayaffe wants to merge 45 commits into epic/sbs from day/schema-xml

@dayaffe dayaffe commented Apr 2, 2026

Issue #

Description of changes

Scope

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

dayaffe added 7 commits March 30, 2026 19:33
- Add XML traits (XmlName, XmlFlattened, XmlAttribute, XmlNamespace) to Smithy module
- Create SmithyRestXML module with Serializer, Deserializer, Codec, HTTPClientProtocol, Plugin, BaseError
- Add SmithyRestXMLTypes.kt and RestXMLPlugin.kt for Kotlin codegen
- Update SerdeUtils to include RestXML in useSchemaBased()
- Update RestXmlCustomizations with renderClientProtocol and plugins
- Update RestXmlProtocolGenerator to remove schema-based middlewares
- Add renderClientProtocol and plugins to mock customizations
- Remove OperationInputBodyMiddleware and DeserializeMiddleware in mock
- Fixes 4 pre-existing test failures
- Add containerType: .structure to BaseError member schemas for correct memberName resolution
- Fix xmlElementName fallback to check id.member before id.name
- Remove unused XmlNameTrait import from HTTPClientProtocol
Instead of building XML via string concatenation, the Serializer now
delegates to the existing SmithyXML.Writer which uses libxml2 for
correct XML generation (encoding, entities, namespaces, etc).

Three internal serializer types handle different contexts:
- Serializer: top-level entry point, creates root Writer
- MemberSerializer: writes struct members as child elements
- ValueSerializer: writes values into list/map element nodes
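
To illustrate why building a node tree is safer than string concatenation, here is a minimal sketch. `SketchWriter` is a hypothetical stand-in written for this example; it is not the SmithyXML.Writer API and does not use libxml2. It only shows that escaping happens in one place when values are written into nodes.

```swift
// Hypothetical stand-in for a tree-based XML writer; not the SmithyXML.Writer API.
final class SketchWriter {
    let name: String
    private(set) var text: String?
    private(set) var children: [SketchWriter] = []

    init(name: String) { self.name = name }

    // Appends a child element and returns it so callers can keep nesting.
    @discardableResult
    func child(_ name: String) -> SketchWriter {
        let node = SketchWriter(name: name)
        children.append(node)
        return node
    }

    // Stores the raw (unescaped) character data for this element.
    func write(_ value: String) { text = value }

    // Escapes the characters that are unsafe in XML character data.
    private static func escape(_ value: String) -> String {
        var out = ""
        for ch in value {
            switch ch {
            case "&": out += "&amp;"
            case "<": out += "&lt;"
            case ">": out += "&gt;"
            default: out.append(ch)
            }
        }
        return out
    }

    // Renders this element and its subtree; escaping is applied exactly once here.
    func render() -> String {
        let escapedText = text.map(SketchWriter.escape) ?? ""
        let body = escapedText + children.map { $0.render() }.joined()
        return "<\(name)>\(body)</\(name)>"
    }
}

let root = SketchWriter(name: "Input")
root.child("Note").write("a < b & c")
let rendered = root.render()
```

Concatenating the same value directly between `<Note>` tags would produce a document that is not well-formed, which is the class of bug the delegation avoids.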
@dayaffe dayaffe requested a review from a team as a code owner April 2, 2026 18:12
dayaffe added 22 commits April 2, 2026 18:28
- Delete 18 XML serde test files in protocolspecificserde/xml/ that
  tested +Write.swift/+Read.swift codegen output no longer generated
  by schema-based codegen
- Change pagination.smithy, pagination-truncation.smithy, waiters.smithy,
  and waiters-none.smithy from @restXml to @restJson1 since those tests
  don't test protocol-specific behavior
Inline single-expression function bodies onto signature line to satisfy
standard:function-signature rule.

- RestXmlCustomizations.kt:18 (renderClientProtocol)
- MockHTTPRestXMLProtocolGenerator.kt:27 (renderClientProtocol)
- SerdeUtils.kt:11 (useSchemaBased delegating overload)
Resolve SerdeUtils.kt conflict by unioning the trait-check lists:
schema-based applies to Rpcv2Cbor, AwsJson1_0, AwsJson1_1, and RestXml.
Keep the epic/sbs two-overload signature (settings+model / private service).
RestXML is a REST-style protocol that uses HTTP URI/header/query
bindings, unlike AwsJson and RpcV2Cbor. The generated client's
OperationInputUrlPathMiddleware, OperationInputHeadersMiddleware, and
OperationInputQueryItemMiddleware reference Input.urlPathProvider(_:),
headerProvider(_:), and queryItemProvider(_:), so those static funcs
must still be emitted on Input extensions even when serde is schema-
based. Extend the guard to keep emitting these providers for RestXML.
The Deserializer was using Data("<empty/>".utf8) as a substitute for
empty response bodies, then parsing it via libxml2. For tiny payloads
Swift may use inline Data storage, creating a stack-pointer lifetime
issue with xmlBufferCreateStatic that causes xmlReadMemory to fail with
'The XML could not be parsed'.

Return an empty Reader directly for empty data instead. This resolves
~16 RestXML protocol tests where responses have no body (headers-only
outputs, EmptyInputAndEmptyOutput, NoInputAndNoOutput, etc.).

Expose Reader's default init as public under the existing
SmithyReadWrite SPI so Deserializer (in SmithyRestXML) can construct
an empty Reader.
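
The empty-body short-circuit described above could look roughly like the following sketch. `SketchReader`, `SketchParseError`, and `makeReader(from:parse:)` are hypothetical names for illustration; the `parse` closure stands in for the libxml2-backed parser, which this example never reaches for empty input.

```swift
import Foundation

// Hypothetical stand-in for SmithyXML's Reader; only what the sketch needs.
struct SketchReader {
    var name: String = ""
    var content: String? = nil
    var children: [SketchReader] = []
}

enum SketchParseError: Error { case unparseable }

// `parse` is a hypothetical hook for the real XML parser.
func makeReader(from data: Data, parse: (Data) throws -> SketchReader) throws -> SketchReader {
    // Empty body: skip the parser entirely and return an empty reader,
    // rather than substituting a placeholder document such as "<empty/>".
    guard !data.isEmpty else { return SketchReader() }
    return try parse(data)
}

var parserCalls = 0
let emptyReader = try? makeReader(from: Data()) { _ in
    parserCalls += 1
    throw SketchParseError.unparseable
}
```

Because the guard returns before `parse` runs, a headers-only response never depends on how the parser handles a tiny synthetic buffer.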
…eserialization

Schema-based RestXML is the first schema-based protocol that needs HTTP
response bindings (headers, prefix headers, response code, httpPayload).
Earlier schema-based protocols (AwsJson, CBOR) are RPC-style and never
need them. The deserialize path previously passed only the body bytes
through the XML Reader, silently leaving header-/status-/payload-bound
output members unset (or throwing when a non-XML raw payload hit the
parser).

Add the HTTP binding trait types that the Swift schema needs:
- HttpHeaderTrait (string value: header name)
- HttpPrefixHeadersTrait (string value: header name prefix)
- HttpResponseCodeTrait (marker)
- HttpPayloadTrait (marker)

Register them in AllSupportedTraits so Swift SchemasCodegen emits them
into generated schemas.

Extend Deserializer to optionally carry the HTTPResponse + raw body
data. In readStruct, route each member through httpBindingDeserializer
which inspects the member's schema traits and, if HTTP-bound, returns a
synthetic child Deserializer sourced from the header value, status
code, or raw body bytes (as applicable) instead of the XML reader.
Normal body-bound members keep the existing XML element lookup path
untouched.

Expose Reader.init(content:) and Reader.addChild publicly under the
existing SmithyReadWrite SPI so the Deserializer can synthesize a
Reader wrapping a single header value or a list of split values.

HTTPClientProtocol.deserializeResponse now constructs a binding-aware
Deserializer directly (passing through the HTTPResponse + bodyData)
instead of going through codec.makeDeserializer which only accepts body
data. The error path is unchanged.
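
A rough sketch of the routing idea in readStruct follows. Every type and function here (`SketchBinding`, `SketchResponse`, `source(for:memberName:response:xmlLookup:)`) is hypothetical and simplified: it returns the raw text a member would be read from, whereas the change described above returns a synthetic child Deserializer.

```swift
// Hypothetical model of a member's HTTP binding; names are illustrative only.
enum SketchBinding {
    case header(String)          // HttpHeaderTrait: header name
    case prefixHeaders(String)   // HttpPrefixHeadersTrait: header name prefix
    case responseCode            // HttpResponseCodeTrait
    case payload                 // HttpPayloadTrait
    case body                    // no HTTP binding: read from the XML document
}

struct SketchResponse {
    var statusCode: Int
    var headers: [String: String]
    var body: String
}

// Returns the raw source text that a member would be deserialized from.
func source(for binding: SketchBinding, memberName: String,
            response: SketchResponse, xmlLookup: (String) -> String?) -> String? {
    switch binding {
    case .header(let name):
        // HTTP header names are case-insensitive.
        return response.headers.first(where: { $0.key.lowercased() == name.lowercased() })?.value
    case .prefixHeaders:
        // Maps need a synthesized entry/key/value tree; omitted in this sketch.
        return nil
    case .responseCode:
        return String(response.statusCode)
    case .payload:
        return response.body
    case .body:
        // Body-bound members keep the normal XML element lookup.
        return xmlLookup(memberName)
    }
}

let response = SketchResponse(statusCode: 201,
                              headers: ["X-Request-Id": "abc"],
                              body: "<Out><name>n</name></Out>")
let xmlElements = ["name": "n"]
let fromHeader = source(for: .header("x-request-id"), memberName: "requestId",
                        response: response) { xmlElements[$0] }
let fromStatus = source(for: .responseCode, memberName: "status",
                        response: response) { xmlElements[$0] }
let fromBody = source(for: .body, memberName: "name",
                      response: response) { xmlElements[$0] }
```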
…P headers

readList filters reader.children by XML element name, which correctly
matches XML-sourced lists but returns empty for synthetic list Readers
built from split header values (children have no XML element name).

Add an isHeaderList flag so readList can enumerate all children directly
when the list came from an HTTP header.
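
The effect of the flag can be sketched as below. `SketchNode` and `listElements(children:memberName:isHeaderList:)` are hypothetical; the naive comma split is also an assumption of this example and ignores quoted header values.

```swift
import Foundation

// Hypothetical reader child: XML-sourced nodes carry an element name,
// synthetic nodes built from a header value do not.
struct SketchNode {
    var name: String
    var content: String
}

func listElements(children: [SketchNode], memberName: String, isHeaderList: Bool) -> [String] {
    if isHeaderList {
        // Header-sourced list: every child is an element, so take them all.
        return children.map { $0.content }
    }
    // XML-sourced list: keep only children whose element name matches.
    return children.filter { $0.name == memberName }.map { $0.content }
}

// Synthetic, unnamed children built from a split header value.
let headerChildren = "a, b, c".split(separator: ",").map {
    SketchNode(name: "", content: $0.trimmingCharacters(in: .whitespaces))
}

let filteredByName = listElements(children: headerChildren, memberName: "member", isHeaderList: false)
let enumeratedDirectly = listElements(children: headerChildren, memberName: "member", isHeaderList: true)
```

Filtering by name yields an empty list for the synthetic children, which matches the failure described above; enumerating directly recovers all three values.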
1. HTTP header timestamps default to http-date per Smithy spec (vs .dateTime
   for XML body). Route through an isFromHttpHeader flag on Deserializer
   so readTimestamp picks the right default when no @timestampFormat trait
   is present.

2. HttpPrefixHeadersTrait: replace TODO with real map construction.
   Synthesize a Reader tree of entry/key/value nodes matching what the
   existing readMap logic already consumes. Header names that match the
   prefix (case-insensitive) yield map entries with the stripped suffix as
   key and comma-joined values.

Adds a public Reader(nodeInfo:content:) initializer under the existing
SmithyReadWrite SPI to support synthesizing named Reader nodes.
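
The prefix-header matching in item 2 can be sketched as a plain function. `prefixHeaderMap(headers:prefix:)` is a hypothetical helper that returns a dictionary instead of a Reader tree, and joining with "," (no space) is an assumption of this example.

```swift
// Builds a map from headers whose names start with `prefix` (case-insensitive).
// Keys are the header names with the prefix stripped; repeated headers are comma-joined.
func prefixHeaderMap(headers: [(name: String, value: String)], prefix: String) -> [String: String] {
    let lowerPrefix = prefix.lowercased()
    var grouped: [String: [String]] = [:]
    for header in headers {
        guard header.name.lowercased().hasPrefix(lowerPrefix) else { continue }
        // Header names are ASCII, so dropping `prefix.count` characters strips the prefix.
        let key = String(header.name.dropFirst(prefix.count))
        grouped[key, default: []].append(header.value)
    }
    return grouped.mapValues { $0.joined(separator: ",") }
}

let sampleHeaders: [(name: String, value: String)] = [
    ("X-Foo-A", "1"),
    ("x-foo-B", "2"),
    ("X-Foo-A", "3"),
    ("Content-Type", "text/xml"),
]
let prefixMap = prefixHeaderMap(headers: sampleHeaders, prefix: "X-Foo-")
```

In this sketch the suffix keeps the casing it had on the wire, so two spellings of the same header name that differ only in case would produce two keys.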
dayaffe and others added 16 commits April 22, 2026 15:49
…d improve deserialization detection

Two fixes:

1. Serialization: The Serializer's PayloadMemberSerializer did not implement
   writeDataStream(), so streaming blob payloads (e.g. S3 PutObject body)
   were silently dropped and the request body was empty. Added writeDataStream()
   to capture the ByteStream, and updated serializeRequest() to use it as the
   request body directly instead of the serialized XML data.

2. Deserialization: The hasStreamingPayload check now also checks
   member.hasTrait(StreamingTrait.self) in addition to
   member.target?.hasTrait(StreamingTrait.self), since SchemasCodegen merges
   target traits into member schemas.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
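
The widened streaming check from item 2 can be sketched as follows. `SketchSchema` is a hypothetical schema type keyed by trait ID strings; the change described above uses `hasTrait(StreamingTrait.self)` on the real schema type instead.

```swift
// Hypothetical schema node: a member may carry traits itself and may point at a target.
final class SketchSchema {
    let traits: Set<String>
    let target: SketchSchema?

    init(traits: Set<String> = [], target: SketchSchema? = nil) {
        self.traits = traits
        self.target = target
    }

    func hasTrait(_ id: String) -> Bool { traits.contains(id) }
}

let streamingID = "smithy.api#streaming"

// True when the trait is on the member itself (merged by codegen) or on its target.
func hasStreamingPayload(_ member: SketchSchema) -> Bool {
    member.hasTrait(streamingID) || (member.target?.hasTrait(streamingID) ?? false)
}

let mergedMember = SketchSchema(traits: [streamingID])
let targetOnlyMember = SketchSchema(target: SketchSchema(traits: [streamingID]))
let plainMember = SketchSchema(target: SketchSchema())
```

Checking both locations keeps the detection correct whether or not target traits have been merged into the member schema.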