Skip to content

feat: add sync stream context, deprecate GetMetadata#183

Merged
toddbaert merged 2 commits into
mainfrom
feat/sync-context-and-type
Mar 19, 2025
Merged

feat: add sync stream context, deprecate GetMetadata#183
toddbaert merged 2 commits into
mainfrom
feat/sync-context-and-type

Conversation

@toddbaert
Copy link
Copy Markdown
Member

@toddbaert toddbaert commented Mar 9, 2025

This PR updates the sync schema. It:

  • deprecates the GetMetadata rpc and its messages
  • includes a new, optional sync_context field which conveys the same data as the now-deprecated GetMetadata rpc
    • optional makes it easy for providers to absorb this change in a non-breaking way, since it generates code which allows us to check if the the field is explicitly set or not, and if not, fall back to the old GetMetadata
  • includes a new, required type field, exactly equivalent to the type field in the EventStream
    • this field must contain provider_ready on the first stream payload, and configuration_change in subsequent payloads
    • this field is required, meaning this is a breaking change for servers (they will need to be updated to check this field)
    • this allows providers to commonize more logic between sync/event stream implementations, and remove some state maintenance associated with sync stream implementations

Relates to: open-feature/flagd#1584
Relates to: open-feature/flagd#1585

In general, I'm wondering what people think of the name sync_context for the new field carrying the static context from flagd. My argument for sync_context is simple, in that it carries the auxiliary context from server and it's delivered via the sync stream. static_context might also work, but it might not be "static" in some implementations and could change in a new stream payload (though it won't in flagd). GetMetadata was poorly named, and I'm hoping this is clearer.

@cupofcat I believe you have your own server implementation, so I'm interested in your thoughts on both these changes.

Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
@tangenti
Copy link
Copy Markdown

Just a side note here - required has been completely removed from proto3 and everything is optional (either explicit or implicit).

More details in the Specifying Field Cardinality section in https://protobuf.dev/programming-guides/proto3/

@toddbaert
Copy link
Copy Markdown
Member Author

Just a side note here - required has been completely removed from proto3 and everything is optional (either explicit or implicit).

More details in the Specifying Field Cardinality section in https://protobuf.dev/programming-guides/proto3/

@tangenti

Yes; I meant that it's logically required for implementations (as in, our provider will not work properly if no data is sent in that field, if this proposal is merged).

@cupofcat
Copy link
Copy Markdown

Wouldn't it be valid for that field to contain empty data? If yes (empty data is fine), shouldn't we make it completely optional and treat missing as empty?

@toddbaert
Copy link
Copy Markdown
Member Author

toddbaert commented Mar 10, 2025

Wouldn't it be valid for that field to contain empty data? If yes (empty data is fine), shouldn't we make it completely optional and treat missing as empty?

Not logically in providers. Currently, our RPC mode uses the type field to know whether to emit READY or CONFIGURATION_CHANGED. If it's neither no events are emitted, it's unhanded, like this. We would be implementing the same logic in the sync.stream going forward if this PR is merged, which would allow us to remove some state in our provider which tracks whether this is the first message or not.

@toddbaert
Copy link
Copy Markdown
Member Author

toddbaert commented Mar 10, 2025

Wouldn't it be valid for that field to contain empty data? If yes (empty data is fine), shouldn't we make it completely optional and treat missing as empty?

Not logically in providers. Currently, our RPC mode uses the type field to know whether to emit READY or CONFIGURATION_CHANGED. If it's neither no events are emitted, it's unhanded, like this. We would be implementing the same logic in the sync.stream going forward if this PR is merged, which would allow us to remove some state in our provider which tracks whether this is the first message or not.

We could always emit both as a default case in our providers, actually. Which would be non-breaking for servers that don't implement it and similar to the current behavior.

cc @aepfli on this one to verify my thinking.

@cupofcat
Copy link
Copy Markdown

I would like to understand a bit better the semantic of the sync_context field. It's not 100% clear to me what the proposal is:

  1. The client looks first in the sync_context field of the FetchAllFlagsResponse / SyncFlagsResponse. If it's not there, then it issues GetMetadata request and expects it to return something?
  2. Both GetMetadata and sync_context are fully optional from the client perspective and, if both are missing, empty metadata is assumed (so the client can still cascade through both but treat missing as empty)

I think (2) would make sense. To me, if metadata is OK to be empty, then it should be also OK to omit it entirely. WDYT?

@toddbaert
Copy link
Copy Markdown
Member Author

toddbaert commented Mar 11, 2025

I would like to understand a bit better the semantic of the sync_context field. It's not 100% clear to me what the proposal is:

  1. The client looks first in the sync_context field of the FetchAllFlagsResponse / SyncFlagsResponse. If it's not there, then it issues GetMetadata request and expects it to return something?
  2. Both GetMetadata and sync_context are fully optional from the client perspective and, if both are missing, empty metadata is assumed (so the client can still cascade through both but treat missing as empty)

I think (2) would make sense. To me, if metadata is OK to be empty, then it should be also OK to omit it entirely. WDYT?

@cupofcat

2 is the proposal, going forward. Eventually the GetMetadata RPC will be completely deprecated. Up to now, it's been expected, and we treated errors with it as exceptional.

To sum it up, my proposal is:

  • For a few provider versions*, to maintain behavioral backwards compatibility, we will check the new sync_context field, and also the GetMetadata RPC, in that order, and fail if we can't get the data we want from at least one. The only exception would be if the new syncMetadataDisabled is set to true, in which case we don't make the GetMetadata call.
    • we will print some warning messages if we see GetMetadata returns data while the sync_context is empty
  • after a few provider versions* we will completely remove the GetMetadata call, as well as the syncMetadataDisabled option and purely rely on the optional sync_context. If that's empty, we will assume the server has no important context to add to evaluations and proceed as normal

I'd prefer not to immediately ignore a failure to successfully call GetMetadata, since doing so would possibly compromise the evaluations of existing implementations. We could possibly look for some specific error codes (ie: NOT_IMPLEMENTED) and ignore them as an alternative.

How does that sound?

Additionally, do you have any concerns about the type field? Specifically, my comment here.

@cupofcat
Copy link
Copy Markdown

I think the plan for sync_context and GetMetadata sounds OK.

Re: type field: I don't know enough about how the rpc provider works to have an educated opinion here. I don't fully understand why we need to differentitate between provider ready and configuration changes. I would expect that, if the client needs to do something once, when the connection is first established, it could just treat the very first event specially (so maintain state on the client side). However, from the other comments, I see that your goal is explicitly to remove state from the client and have the server maintain state for each client. I wonder if you could expand on the tradeoffs here?

Lastly, I would like to add that we probably should figure out a better process for breaking API/proto changes (e.g. have concrete defintions for "a few provider version", etc). WDYT?

@aepfli
Copy link
Copy Markdown
Member

aepfli commented Mar 12, 2025

the tradeoff is a simple one, either we manage state in each language sdk, or we manage state in flagd. We are firing events, and for the first connect it is an ready event, each configuration change after that is a configuration change event. If we do reach STALE or ERROR state we again start with a ready event, and afterwards with a configuration change event.

Implementing the same logic over and over again, just adds to the technical debt and complexity, where as if flagd controls this approach, we do not need to think about this state, we can just react to the proper events.

Another option would be to add this behavior as general behavior to the openfeature sdks, to handle the state for all providers, so none of the provider sdks need to tackle this topic. but i think it is easier to normalize this behavior for rpc and sync - and normalize the implementations regarding this topic.

@toddbaert
Copy link
Copy Markdown
Member Author

toddbaert commented Mar 12, 2025

Lastly, I would like to add that we probably should figure out a better process for breaking API/proto changes (e.g. have concrete defintions for "a few provider version", etc). WDYT?

We actually have some pretty rigorous policies for breaking changes, but they mostly apply to 1.0+ artifacts (which flagd and it's providers are not yet). Once flagd is 1.0, this sort of thing will be more rigorous. Personally, I'd like to be 1.0 this summer, but when you 1.0 is a tough question due to "unknown unknowns".

Re: type field: I don't know enough about how the rpc provider works to have an educated opinion here. I don't fully understand why we need to differentitate between provider ready and configuration changes. I would expect that, if the client needs to do something once, when the connection is first established, it could just treat the very first event specially (so maintain state on the client side). However, from the other comments, I see that your goal is explicitly to remove state from the client and have the server maintain state for each client. I wonder if you could expand on the tradeoffs here?

You are exactly right. The tradeoff is simple - more state in client vs more state in server, basically, as @aepfli says:

the tradeoff is a simple one, either we manage state in each language sdk, or we manage state in flagd. We are firing events, and for the first connect it is an ready event, each configuration change after that is a configuration change event. If we do reach STALE or ERROR state we again start with a ready event, and afterwards with a configuration change event.

I'm open to going either direction. We just need to decide. As of now, I think there's a very limited number of server implementations (flagd, our internal Dynatrace implementation, and yours @cupofcat) so all the stakeholders are probably in this thread.

@tangenti
Copy link
Copy Markdown

Do we have a roadmap for 1.0?

@toddbaert
Copy link
Copy Markdown
Member Author

toddbaert commented Mar 12, 2025

Do we have a roadmap for 1.0?

No but we could add a parent issue here if we agree it's a medium-term goal.

Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
@toddbaert
Copy link
Copy Markdown
Member Author

toddbaert commented Mar 13, 2025

@tangenti @cupofcat

I spoke with @aepfli and we are OK with dropping the changes around type. We will not change the current behavior with respect to that. So this change now just amounts to deprecating the GetMetadata call and adding a fully optional property containing the same payload, which can be safely omitted from responses if the server doesn't want the provider to add in any additional context.

I've updated the PR to only include the sync_context changes. If you folks are OK with this, please approve.

@toddbaert toddbaert changed the title feat!: add sync stream type and context feat: add sync stream type and context Mar 13, 2025
@toddbaert
Copy link
Copy Markdown
Member Author

@cupofcat @tangenti do you have any objections to merging this as is?

Copy link
Copy Markdown

@cupofcat cupofcat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay here! LGTM

@toddbaert toddbaert merged commit 9b0ee43 into main Mar 19, 2025
@toddbaert toddbaert deleted the feat/sync-context-and-type branch March 19, 2025 19:10
toddbaert pushed a commit that referenced this pull request May 29, 2025
🤖 I have created a release *beep* *boop*
---


<details><summary>protobuf: 0.6.4</summary>

##
[0.6.4](protobuf-v0.6.3...protobuf-v0.6.4)
(2025-04-10)


### ✨ New Features

* add sync stream type and context
([#183](#183))
([9b0ee43](9b0ee43))


### 📚 Documentation

* fix url for flags.json
([#182](#182))
([e840a03](e840a03))
</details>

<details><summary>json/json-schema: 0.2.11</summary>

##
[0.2.11](json/json-schema-v0.2.10...json/json-schema-v0.2.11)
(2025-04-10)


### 📚 Documentation

* clarify that bucketing keys are strings
([#185](#185))
([c707f56](c707f56))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Signed-off-by: OpenFeature Bot <109696520+openfeaturebot@users.noreply.github.com>
@toddbaert toddbaert changed the title feat: add sync stream type and context feat: add sync stream context, deprecate GetMetadata May 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants