feat: schema based XML #1047
Open
dayaffe wants to merge 45 commits into
- Add XML traits (XmlName, XmlFlattened, XmlAttribute, XmlNamespace) to Smithy module
- Create SmithyRestXML module with Serializer, Deserializer, Codec, HTTPClientProtocol, Plugin, BaseError
- Add SmithyRestXMLTypes.kt and RestXMLPlugin.kt for Kotlin codegen
- Update SerdeUtils to include RestXML in useSchemaBased()
- Update RestXmlCustomizations with renderClientProtocol and plugins
- Update RestXmlProtocolGenerator to remove schema-based middlewares
- Add renderClientProtocol and plugins to mock customizations
- Remove OperationInputBodyMiddleware and DeserializeMiddleware in mock
- Fixes 4 pre-existing test failures
- Add containerType: .structure to BaseError member schemas for correct memberName resolution
- Fix xmlElementName fallback to check id.member before id.name
- Remove unused XmlNameTrait import from HTTPClientProtocol
Instead of building XML via string concatenation, the Serializer now delegates to the existing SmithyXML.Writer, which uses libxml2 for correct XML generation (encoding, entities, namespaces, etc).

Three internal serializer types handle different contexts:
- Serializer: top-level entry point, creates root Writer
- MemberSerializer: writes struct members as child elements
- ValueSerializer: writes values into list/map element nodes
- Delete 18 XML serde test files in protocolspecificserde/xml/ that tested +Write.swift/+Read.swift codegen output no longer generated by schema-based codegen
- Change pagination.smithy, pagination-truncation.smithy, waiters.smithy, and waiters-none.smithy from @restXml to @restJson1 since those tests don't test protocol-specific behavior
Inline single-expression function bodies onto signature line to satisfy standard:function-signature rule.
- RestXmlCustomizations.kt:18 (renderClientProtocol)
- MockHTTPRestXMLProtocolGenerator.kt:27 (renderClientProtocol)
- SerdeUtils.kt:11 (useSchemaBased delegating overload)
Resolve SerdeUtils.kt conflict by unioning the trait-check lists: schema-based applies to Rpcv2Cbor, AwsJson1_0, AwsJson1_1, and RestXml. Keep the epic/sbs two-overload signature (settings+model / private service).
RestXML is a REST-style protocol that uses HTTP URI/header/query bindings, unlike AwsJson and RpcV2Cbor. The generated client's OperationInputUrlPathMiddleware, OperationInputHeadersMiddleware, and OperationInputQueryItemMiddleware reference Input.urlPathProvider(_:), headerProvider(_:), and queryItemProvider(_:), so those static funcs must still be emitted on Input extensions even when serde is schema-based.

Extend the guard to keep emitting these providers for RestXML.
The Deserializer was using Data("<empty/>".utf8) as a substitute for
empty response bodies, then parsing it via libxml2. For tiny payloads
Swift may use inline Data storage, creating a stack-pointer lifetime
issue with xmlBufferCreateStatic that causes xmlReadMemory to fail with
'The XML could not be parsed'.
Return an empty Reader directly for empty data instead. This resolves
~16 RestXML protocol tests where responses have no body (headers-only
outputs, EmptyInputAndEmptyOutput, NoInputAndNoOutput, etc.).
Expose Reader's default init as public under the existing
SmithyReadWrite SPI so Deserializer (in SmithyRestXML) can construct
an empty Reader.
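
The empty-body short-circuit described above can be sketched roughly as follows. This is an illustration only: `SketchReader` and `makeReader(from:parse:)` are hypothetical stand-ins, not the real SmithyXML `Reader` or Deserializer API, and the `parse` closure stands in for the libxml2-backed parser.

```swift
import Foundation

// Hypothetical stand-in for the XML reader node; not the real SmithyXML.Reader.
struct SketchReader {
    var name: String = ""
    var content: String? = nil
    var children: [SketchReader] = []
}

// Hypothetical entry point. For empty data it returns an empty reader
// directly, so the parser is never handed a placeholder document.
func makeReader(from data: Data, parse: (Data) -> SketchReader?) -> SketchReader? {
    if data.isEmpty {
        return SketchReader() // no name, no content, no children
    }
    return parse(data)
}
```

The point of the shape is that the parser is not invoked at all for an empty body, so any lifetime concerns around tiny `Data` buffers never come into play on that path.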
…eserialization

Schema-based RestXML is the first schema-based protocol that needs HTTP response bindings (headers, prefix headers, response code, httpPayload). Earlier schema-based protocols (AwsJson, CBOR) are RPC-style and never need them. The deserialize path previously passed only the body bytes through the XML Reader, silently leaving header-/status-/payload-bound output members unset (or throwing when a non-XML raw payload hit the parser).

Add the HTTP binding trait types that the Swift schema needs:
- HttpHeaderTrait (string value: header name)
- HttpPrefixHeadersTrait (string value: header name prefix)
- HttpResponseCodeTrait (marker)
- HttpPayloadTrait (marker)

Register them in AllSupportedTraits so Swift SchemasCodegen emits them into generated schemas.

Extend Deserializer to optionally carry the HTTPResponse + raw body data. In readStruct, route each member through httpBindingDeserializer, which inspects the member's schema traits and, if HTTP-bound, returns a synthetic child Deserializer sourced from the header value, status code, or raw body bytes (as applicable) instead of the XML reader. Normal body-bound members keep the existing XML element lookup path untouched.

Expose Reader.init(content:) and Reader.addChild publicly under the existing SmithyReadWrite SPI so the Deserializer can synthesize a Reader wrapping a single header value or a list of split values.

HTTPClientProtocol.deserializeResponse now constructs a binding-aware Deserializer directly (passing through the HTTPResponse + bodyData) instead of going through codec.makeDeserializer, which only accepts body data. The error path is unchanged.
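
The per-member routing described in this commit might look something like the sketch below. All names here (`SketchBinding`, `SketchResponse`, `SketchSource`, `source(for:in:)`) are hypothetical and exist only for illustration; the real code works with schema traits and child Deserializers rather than an enum, and prefix headers are left out to keep the example short.

```swift
// Hypothetical model of the HTTP binding found on a member's schema.
enum SketchBinding {
    case header(String) // header name
    case responseCode
    case payload
}

// Hypothetical, simplified HTTP response.
struct SketchResponse {
    var statusCode: Int
    var headers: [String: String]
    var body: [UInt8]
}

// Where the member's value should be read from.
enum SketchSource: Equatable {
    case headerValue(String?)
    case status(Int)
    case rawBody([UInt8])
    case xmlBody
}

// Members without an HTTP binding fall through to the XML body path.
func source(for binding: SketchBinding?, in response: SketchResponse) -> SketchSource {
    guard let binding = binding else { return .xmlBody }
    switch binding {
    case .header(let name):
        // Header names are compared case-insensitively.
        let match = response.headers.first { $0.key.lowercased() == name.lowercased() }
        return .headerValue(match?.value)
    case .responseCode:
        return .status(response.statusCode)
    case .payload:
        return .rawBody(response.body)
    }
}
```

For example, a member bound to header `x-foo` would resolve to the header's value, while an unbound member keeps using the XML element lookup.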
…P headers

readList filters reader.children by XML element name, which correctly matches XML-sourced lists but returns empty for synthetic list Readers built from split header values (children have no XML element name). Add an isHeaderList flag so readList can enumerate all children directly when the list came from an HTTP header.
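
A minimal sketch of why the flag is needed, using hypothetical types (`SketchNode`, `listNode(fromHeader:)`, `listItems(of:memberName:isHeaderList:)`) rather than the real Reader and readList. The comma split here is deliberately naive and assumes values contain no quoted strings or embedded commas.

```swift
import Foundation

// Hypothetical reader node; `name` is empty for nodes synthesized from headers.
struct SketchNode {
    var name: String
    var content: String
    var children: [SketchNode] = []
}

// Wrap each comma-separated piece of a header value in an unnamed child node.
func listNode(fromHeader value: String) -> SketchNode {
    let parts = value.split(separator: ",").map { part in
        String(part).trimmingCharacters(in: .whitespaces)
    }
    let children = parts.map { SketchNode(name: "", content: $0) }
    return SketchNode(name: "", content: "", children: children)
}

// XML-sourced lists are filtered by element name; header-sourced lists
// have unnamed children, so every child is taken as-is.
func listItems(of node: SketchNode, memberName: String, isHeaderList: Bool) -> [String] {
    let items = isHeaderList
        ? node.children
        : node.children.filter { $0.name == memberName }
    return items.map { $0.content }
}
```

With a header value such as `a, b,c`, the name-filtered path finds nothing because the synthesized children carry no element name, while the `isHeaderList` path enumerates all three values.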
1. HTTP header timestamps default to http-date per Smithy spec (vs .dateTime for XML body). Route through an isFromHttpHeader flag on Deserializer so readTimestamp picks the right default when no @timestampFormat trait is present.
2. HttpPrefixHeadersTrait: replace TODO with real map construction. Synthesize a Reader tree of entry/key/value nodes matching what the existing readMap logic already consumes. Header names that match the prefix (case-insensitive) yield map entries with the stripped suffix as key and comma-joined values.

Adds a public Reader(nodeInfo:content:) initializer under the existing SmithyReadWrite SPI to support synthesizing named Reader nodes.
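
Both fixes can be illustrated with plain functions. This is a sketch under assumptions: `SketchTimestampFormat`, `timestampFormat(explicit:isFromHttpHeader:)`, and `prefixHeaderMap(prefix:headers:)` are hypothetical names, the result is a plain dictionary rather than a synthesized Reader tree, and the `","` join separator and the ASCII-only header names are assumptions.

```swift
// Hypothetical subset of timestamp formats.
enum SketchTimestampFormat {
    case dateTime, httpDate, epochSeconds
}

// An explicit @timestampFormat wins; otherwise the default depends on
// whether the value came from an HTTP header or from the XML body.
func timestampFormat(explicit: SketchTimestampFormat?, isFromHttpHeader: Bool) -> SketchTimestampFormat {
    if let explicit = explicit { return explicit }
    return isFromHttpHeader ? .httpDate : .dateTime
}

// Collect headers whose name starts with `prefix` (case-insensitive),
// keyed by the remaining suffix, with multiple values comma-joined.
// Assumes ASCII header names, so lowercasing does not change lengths.
func prefixHeaderMap(prefix: String, headers: [String: [String]]) -> [String: String] {
    var result: [String: String] = [:]
    let lowerPrefix = prefix.lowercased()
    for (name, values) in headers {
        guard name.lowercased().hasPrefix(lowerPrefix) else { continue }
        let key = String(name.dropFirst(prefix.count))
        result[key] = values.joined(separator: ",")
    }
    return result
}
```

In the real change the resulting entries are materialized as entry/key/value Reader nodes so that the existing readMap logic can consume them unchanged.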
…erializer/deserializer
…apes for correct XML root element
… empty list/map serialization
… HTTPClientProtocol
…d improve deserialization detection

Two fixes:
1. Serialization: The Serializer's PayloadMemberSerializer did not implement writeDataStream(), so streaming blob payloads (e.g. S3 PutObject body) were silently dropped and the request body was empty. Added writeDataStream() to capture the ByteStream, and updated serializeRequest() to use it as the request body directly instead of the serialized XML data.
2. Deserialization: The hasStreamingPayload check now also checks member.hasTrait(StreamingTrait.self) in addition to member.target?.hasTrait(StreamingTrait.self), since SchemasCodegen merges target traits into member schemas.
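
The widened streaming check in the second fix can be sketched as below. `SketchSchema` and `isStreaming(_:)` are hypothetical; traits are modeled as string IDs here, whereas the real code checks `StreamingTrait.self` on schema objects.

```swift
// Hypothetical schema node: a member may carry traits itself and may
// also point at a target shape with its own traits.
final class SketchSchema {
    let traits: Set<String>
    let target: SketchSchema?

    init(traits: Set<String> = [], target: SketchSchema? = nil) {
        self.traits = traits
        self.target = target
    }

    func hasTrait(_ id: String) -> Bool {
        return traits.contains(id)
    }
}

let streamingTraitID = "smithy.api#streaming"

// A member counts as streaming if the trait is on the member schema
// (merged in by codegen) or on the target shape.
func isStreaming(_ member: SketchSchema) -> Bool {
    return member.hasTrait(streamingTraitID)
        || member.target?.hasTrait(streamingTraitID) == true
}
```

Checking only the target would miss members whose streaming trait was merged onto the member schema itself, which is the case the commit describes.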
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Issue #
Description of changes
Scope
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.