Skip to content

Conversation

slinkydeveloper
Copy link
Contributor

Addressing the issues described here #3144

* Add `breaking` flag. This is needed when you want people to avoid users overwriting by mistake a deployment, but they still need to allow breaking changes (e.g. changing service type). `force` remains same as before, with same behavior.
* Change default for `force` to `false`.
* When registering the same service, without any flags set, 200 is returned to signal the request was idempotent (no change done). When force is set, 200 means overwrite.
* These changes are part of `AdminAPIVersion::V3` (to make sure CLI still behaves as before with /v2 prefix)
* Encapsulated the business logic of "semantic equality" between two deployments inside `Deployment::semantic_eq` and `DeploymentMetadata::semantic_eq`
* Apply the new experience to the CLI.
* This allows adding arbitrary metadata to deployments during registration, for observability.
* Auto-resolve the environment info for github actions. This should be enough to link the deployment back to the repository and related commit.
@muhamadazmy muhamadazmy self-requested a review October 6, 2025 07:58
This redesign is meant to cover a bunch of use cases/user flows:
* The common scenario of token/assume_role arn changed/expired, and all the invocations to a deployment are stuck. With this API, we can expose in the UI an edit box to update the deployment, changing the additional_headers/assume_role_arn
* The less common scenario when a user wants to update the address of an endpoint. This used to be your "only way out" using Lambda with stuck invocations before 1.5. From 1.5, with pause, we can resume on a different endpoint, so this user flow is now less relevant but we still support it here.
* The probably useless scenario where the user wants to forcefully overwrite the schema of an old deployment. This is still possible, but behind an opt-in flag.
@slinkydeveloper slinkydeveloper changed the title [WIP] Deployments improvements Deployments improvements Oct 6, 2025
@slinkydeveloper
Copy link
Contributor Author

@slinkydeveloper
Copy link
Contributor Author

Tested manually, all good

@slinkydeveloper
Copy link
Contributor Author

The e2e test rightfully fails, because it's depending on the old behavior of update_deployment. Will update that.

@slinkydeveloper
Copy link
Contributor Author

Fixing the e2e tests here restatedev/e2e#395

Copy link
Contributor

@muhamadazmy muhamadazmy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @slinkydeveloper for this PR. I left few questions

}

/// This returns true if the two deployments are to be considered the "same".
pub fn semantic_eq_with_address_and_headers(&self, other_addess: &DeploymentAddress) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two types DeploymentAddress and DeploymentType are suspiciously similar. Is this part of the planned clean up you are intending to do?

Also maybe a goo idea if DeploymentType has method address() which returns DeploymentAddress then implementation basically can use the same implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two types DeploymentAddress and DeploymentType are suspiciously similar. Is this part of the planned clean up you are intending to do?

They are similar, not exactly the same. But indeed I want to clean this up in the next iteration of cleanups with the upcoming PRs.

Ok(schema_updater.into_inner())
}

pub fn update_and_return<T, E>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully get the reason for this pattern. Why instead of passing an update_fn we instead do something like

let updater = schema.into_updater();
// do update operations
// here ...  
let schema = updater.commit()?; 
// .end() or .finish() but it only return an updated version of the schema with correct version.

Is there any particular reason it's done this way ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the lambda approach as it's more self contained, and exposes less methods outside. The .commit() approach to me makes sense if a commit actually happens at the end, in this case no commit happens, we just return the updated data structure.

Will hide this anyway behind SchemaRegistry, once i finish the next PR.

&service_name,
&new_service,
previous_service_revision,
RemovedHandlerBehavior::Warn,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's okay to change the service type here or remove a handler but not when doing add_deployment ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all a matter of defaulting and optimizing the common use case:

  • 99% of people using the update_deployment API won't use overwrite: true in the first place, as they just need to update the api token, or the assume_role. And the default reflects that: overwrite = false.
  • That 1% of people using that, will use it as an escape hatch in case you make some mistakes between registrations etc. In that case, it makes sense to do overwrite = true and you just take your risks.

In add_deployment, defaults are again leaning on safety first:

  • Most people doing properly immutable deployments want these safety checks -> by default those are enabled
  • When the safety check doesn't pass, and the user knows what is doing, they can provide --breaking
  • When the user does mutable deployments, they take their risks and go with --force

/// Declared SDK during discovery
pub sdk_version: Option<String>,
pub created_at: MillisSinceEpoch,
pub metadata: HashMap<String, String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find Metadata to have a field metadata is confusing. Maybe call it extra or user_defined ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to flatten this DeploymentMetadata into Deployment directly in the subsequent PRs. Also naming here is meant to align with service metadata and handler metadata.

@slinkydeveloper slinkydeveloper mentioned this pull request Oct 6, 2025
3 tasks
@slinkydeveloper
Copy link
Contributor Author

@muhamadazmy I applied the feedback, PTAL

@slinkydeveloper slinkydeveloper merged commit fafc03f into restatedev:main Oct 7, 2025
25 checks passed
@slinkydeveloper slinkydeveloper deleted the issues/3144 branch October 7, 2025 07:50
@github-actions github-actions bot locked and limited conversation to collaborators Oct 7, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants