Evolving a Struct into a Union #968

@evelyno-rigetti

Description

I'd like to evolve a Struct to a union of Structs without breaking users of the original Struct. As far as I can tell, this is not supported by the existing schema evolution mechanism. I propose to do this by allowing Structs to define tag=msgspec.UNSET.

Example

Let's say I have a service that accepts a Request of the following shape:

import msgspec

class Request(msgspec.Struct):
    a: int

# server decoding logic
req = msgspec.json.decode(buf, type=Request)

Now I want to allow this service to accept a union of two different request types, so I modify my service to look like:

class RequestV0(msgspec.Struct, tag=True):
    a: int

class RequestV1(msgspec.Struct, tag=True):
    b: str
    c: str

# server decoding logic
req = msgspec.json.decode(buf, type=RequestV0 | RequestV1)

What I would like is the ability to decode the old Request as the new RequestV0. It's possible to discriminate correctly, because every old Request message lacks the tag field (`type` by default), while every message from a new union member has it set.

buf = msgspec.json.encode(Request(a=4))
req_v0 = msgspec.json.decode(buf, type=RequestV0 | RequestV1)
# raises msgspec.ValidationError: Object missing required field `type`

Existing solutions

One way to avoid this situation today is to defensively set tag=True on every Request, so that it can be evolved in the future. That's fine in this small example, but if a message contains something like a long list of Structs, every element carries its own tag field and the wire size is needlessly inflated.

As far as I can tell, dec_hook is not called in this situation, so it can't be used to fill in the missing tag.

Proposal

My proposed feature is to allow tag=msgspec.UNSET. The new server logic looks like:

class RequestV0(msgspec.Struct, tag=msgspec.UNSET):   # today, raises TypeError: `tag` must be a `str` or an `int`
    a: int

class RequestV1(msgspec.Struct, tag=True):
    b: str
    c: str

# server decoding logic
req = msgspec.json.decode(buf, type=RequestV0 | RequestV1)

Now the server's decoding logic, upon receiving JSON without the type field set, knows to decode it as a RequestV0.

buf = msgspec.json.encode(Request(a=4))
req_v0 = msgspec.json.decode(buf, type=RequestV0 | RequestV1)
# successfully returns a RequestV0

Given the discussion on #720, I would propose making tag=msgspec.UNSET mutually exclusive with array_like=True. Depending on the approach chosen there, it may be possible to support both options by dispatching on the length of the array.
