AhmedSoliman
Contributor

@AhmedSoliman AhmedSoliman commented Oct 15, 2025

Adds bound_addresses and advertised_addresses to GetIdent responses to enable tools and automations to extract that information for future use.


Stack created with Sapling. Best reviewed with ReviewStack.


github-actions bot commented Oct 15, 2025

Test Results

  7 files  +  2    7 suites  +2   4m 36s ⏱️ + 3m 17s
 49 tests + 15   49 ✅ + 15  0 💤 ±0  0 ❌ ±0 
210 runs  +158  210 ✅ +158  0 💤 ±0  0 ❌ ±0 

Results for commit cd926b7. ± Comparison against base commit 0dab81b.

This pull request removes 34 and adds 49 tests. Note that renamed tests count towards both.
dev.restate.sdktesting.tests.AwakeableIngressEndpointTest ‑ completeWithFailure(Client)
dev.restate.sdktesting.tests.AwakeableIngressEndpointTest ‑ completeWithSuccess(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$NewVersion ‑ createAwakeable(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$NewVersion ‑ startOneWayProxyCall(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$NewVersion ‑ startProxyCall(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$NewVersion ‑ startRetryableOperation(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$OldVersion ‑ completeAwakeable(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$OldVersion ‑ completeRetryableOperation(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$OldVersion ‑ proxyCallShouldBeDone(Client)
dev.restate.sdktesting.tests.BackCompatibilityTest$OldVersion ‑ proxyOneWayCallShouldBeDone(Client)
…
dev.restate.sdktesting.tests.CallOrdering ‑ ordering(boolean[], Client)[1]
dev.restate.sdktesting.tests.CallOrdering ‑ ordering(boolean[], Client)[2]
dev.restate.sdktesting.tests.CallOrdering ‑ ordering(boolean[], Client)[3]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromAdminAPI(BlockingOperation, Client, URI)[1]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromAdminAPI(BlockingOperation, Client, URI)[2]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromAdminAPI(BlockingOperation, Client, URI)[3]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromContext(BlockingOperation, Client)[1]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromContext(BlockingOperation, Client)[2]
dev.restate.sdktesting.tests.Cancellation ‑ cancelFromContext(BlockingOperation, Client)[3]
dev.restate.sdktesting.tests.Combinators ‑ awakeableOrTimeoutUsingAwaitAny(Client)
…

♻️ This comment has been updated with latest results.

This change reworks how we define and manage network ports, and achieves a bunch of goals in one go, at the cost of being quite big, of course.

## Unix-socket all the things

Restate server now supports listening on unix-sockets for all services (fabric, admin, ingress, and tokio-console). Even better, we listen on *both* the inet socket and the unix-socket by default. Unix sockets are created automatically as `restate-data/*.sock` and are cleaned up on shutdown (if that does not happen, they are cleaned up on the next start). Unix-socket support includes the restatectl and restate CLIs: `restatectl -s unix:restate-data/admin.sock status` works, and setting env variables like `RESTATE_ADMIN_URL=unix:restate-data/admin.sock` works for `restate svc status`. You can also use `curl` to call ingress, like `curl --unix-socket restate-data/ingress.sock http://local/Counter/123/add --silent --json 1` (note that the hostname in the URL is ignored when connecting with unix-sockets).
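To illustrate the "cleaned up on the next start" behaviour, here is a minimal sketch using only the Rust standard library. `bind_unix_listener` is a hypothetical helper written for this example, not the code in this PR; it assumes that no other live process still owns the socket path.

```rust
use std::fs;
use std::io;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Binds a unix-socket listener at `path`.
///
/// Hypothetical helper (not Restate's implementation): a socket file left
/// behind by a previous run is removed first, because binding to an
/// existing path fails with `AddrInUse`.
/// Assumption: no other live process is still listening on `path`.
pub fn bind_unix_listener(path: &Path) -> io::Result<UnixListener> {
    match fs::remove_file(path) {
        // A stale socket file existed and was removed.
        Ok(()) => {}
        // Nothing to clean up.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        // Anything else (e.g. permissions) is a real error.
        Err(e) => return Err(e),
    }
    UnixListener::bind(path)
}
```

Dropping a `UnixListener` does not delete its socket file, which is why the cleanup step on startup is needed in addition to cleanup on shutdown.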

Listening on unix-sockets can be disabled for all ports or for specific services with a new option, `listen-mode`, which can be supplied via env variable, config file, or `restate-server --listen-mode=tcp`. The supported listen modes are `tcp`, `unix`, and `all` (default). When using `unix`, we’ll only listen on unix-sockets, and all advertised addresses will automatically be derived to show the unix-socket address.
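The three listen modes can be modelled roughly as follows. This is a sketch under assumptions: the `ListenMode` type, its method names, and the error type are made up for illustration; only the values `tcp`, `unix`, and `all` (the default) come from the description above.

```rust
use std::str::FromStr;

/// Hypothetical model of the `listen-mode` option.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ListenMode {
    /// Listen on TCP sockets only.
    Tcp,
    /// Listen on unix-sockets only.
    Unix,
    /// Listen on both (the default).
    #[default]
    All,
}

impl ListenMode {
    pub fn listens_on_tcp(self) -> bool {
        matches!(self, Self::Tcp | Self::All)
    }

    pub fn listens_on_unix(self) -> bool {
        matches!(self, Self::Unix | Self::All)
    }
}

impl FromStr for ListenMode {
    type Err = String;

    /// Parses the value as it would appear in an env variable or CLI flag.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "tcp" => Ok(Self::Tcp),
            "unix" => Ok(Self::Unix),
            "all" => Ok(Self::All),
            other => Err(format!("unknown listen mode: {other}")),
        }
    }
}
```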

As a result of this change, all unit tests now use unix-sockets: no more port conflicts with your locally running services, and potentially less flaky tests on CI. I’ve updated (and simplified) the local-cluster-runner utility to make use of it. There is still room for more improvements there.

## Let’s talk about ports

**Random Ports**

Restate can now select random ports on startup via `restate-server --use-random-ports=true` or `RESTATE_USE_RANDOM_PORTS=true`. Those are conflict-free (OS-selected), and because unix-sockets are also created, users (in the future) can use restate/restatectl by pointing them to the unix-socket until they figure out the ports. We print the advertised addresses for all services on startup, and with the new `restate-server --no-logo`, the advertised addresses will be the first thing printed on stdout.
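OS-selected ports work by binding to port `0` and then asking the socket which port it received. The sketch below shows the general technique with the standard library; `bind_random_port` is a hypothetical name, and binding to localhost is an assumption made to keep the example simple.

```rust
use std::io;
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Binds a TCP listener on an OS-selected port and returns the listener
/// together with the address it actually got.
///
/// Hypothetical helper: port 0 asks the OS to pick a free port, so two
/// listeners bound this way never conflict with each other.
pub fn bind_random_port() -> io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind((Ipv4Addr::LOCALHOST, 0))?;
    // The concrete port is only known after the bind succeeded.
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}
```

Because the port is only known after binding, printing the resulting addresses on startup is what lets users and tools discover them.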

**Socket Activation**

Another cool feature is the support for LISTEN_FD/systemd-compatible file-descriptor passing for listener sockets from the parent process to restate-server. A parent process can open the tcp listeners and even the unix-socket listeners (except for the fabric port) and pass the file descriptors to restate-server (e.g. via systemd socket activation, or a utility like `systemfd`).

For instance, `systemfd --no-pid -s http::9000 -- restate-server`, where `9000` becomes the ingress port. You can pass multiple ports, and restate assigns them to services in a defined order.
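For readers unfamiliar with the protocol, here is a rough sketch of how a process can discover inherited listeners. It follows the systemd convention (`LISTEN_FDS` holds the count, descriptors start at 3, `LISTEN_PID` names the intended receiver). The function names are hypothetical, and accepting a missing `LISTEN_PID` (as happens with `systemfd --no-pid`) is an assumption of this sketch, not a statement about how restate-server behaves.

```rust
use std::net::TcpListener;
use std::os::unix::io::{FromRawFd, RawFd};

/// First inherited descriptor in the systemd socket-activation protocol.
const SD_LISTEN_FDS_START: RawFd = 3;

/// Computes the inherited descriptors from the raw env values.
/// Assumption: a missing LISTEN_PID is accepted; a mismatching one is not.
pub fn inherited_fds(
    listen_pid: Option<&str>,
    listen_fds: Option<&str>,
    own_pid: u32,
) -> Vec<RawFd> {
    if let Some(pid) = listen_pid {
        // The descriptors were meant for a different process.
        if pid.parse::<u32>().ok() != Some(own_pid) {
            return Vec::new();
        }
    }
    let count = listen_fds
        .and_then(|v| v.parse::<RawFd>().ok())
        .unwrap_or(0)
        .max(0);
    (SD_LISTEN_FDS_START..SD_LISTEN_FDS_START.saturating_add(count)).collect()
}

/// Reads LISTEN_PID / LISTEN_FDS from the process environment.
pub fn inherited_fds_from_env() -> Vec<RawFd> {
    let pid = std::env::var("LISTEN_PID").ok();
    let fds = std::env::var("LISTEN_FDS").ok();
    inherited_fds(pid.as_deref(), fds.as_deref(), std::process::id())
}

/// Wraps an inherited descriptor as a TCP listener.
///
/// # Safety
/// `fd` must be an open listening TCP socket that nothing else owns.
pub unsafe fn tcp_listener_from_fd(fd: RawFd) -> TcpListener {
    unsafe { TcpListener::from_raw_fd(fd) }
}
```

How the descriptors map to services (which one becomes ingress, admin, and so on) is decided by restate-server and is not modelled here.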

*What does this bring to the table?*

1. Restarting restate-server without losing the socket listeners (ingress is the biggest winner), so clients will not observe connection errors during restart or upgrades.
2. Test harnesses, wrappers, or even our own tools can pre-allocate the tcp ports and listen on them before starting restate. Those external wrappers don’t need to wait any more for restate-server to start before they try to connect to it (no connection retries needed). **This unlocks embedding restate in tests or shipping a restate-lite version that only listens on anonymous unix sockets**. In fact, we are a couple of steps away from making it possible to fully embed restate for use-cases that don’t need a server. The only small thing that’s missing is the invoker using a pre-supplied file-descriptor and/or supporting unix-sockets to connect to deployments.
3. Restate server will now attempt to bind on all required ports and unix-sockets very early in its startup, before starting any roles or opening the database. This reduces the downtime window, and allows us to centralize port assignment (for random) and gives us a nice place to print all addresses.

## Advertised Addresses

The PR unifies how we manage and configure advertised addresses for all services, and it deprecates some of the old, inconsistently named configuration keys (admin and ingress advertised addresses). Most importantly, restate-server will now attempt to detect a reasonable value for the advertised address. If the listen mode is `unix` (only), it’ll now print `unix:/` advertised addresses, and if it’s tcp, it’ll try to detect the public routable IP address of the node instead of using `127.0.0.1`. This makes docker deployments much nicer while maintaining the ability to override all of them as needed. There is also a new option to override only the hostname part of this address without interfering with random ports: `RESTATE_ADVERTISED_HOST=my_host.com` (or via cli, config) for a global override, and it can also be applied per service (`RESTATE_ADMIN__ADVERTISED_HOST=`). In fact, all new options can be overridden per service.
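The override order described above can be sketched as a small pure function. Everything here is illustrative: `advertised_address` is a hypothetical name, the `http://` scheme is an assumption, and IP detection itself is out of scope (the detected IP is simply passed in).

```rust
use std::net::{IpAddr, SocketAddr};

/// Derives the advertised TCP address for one service.
///
/// Hypothetical sketch of the precedence:
/// per-service host override > global host override > detected IP.
/// The port always comes from the bound listener, so a host override
/// does not interfere with random ports.
pub fn advertised_address(
    service_host: Option<&str>,
    global_host: Option<&str>,
    detected_ip: IpAddr,
    bound_port: u16,
) -> String {
    match service_host.or(global_host) {
        Some(host) => format!("http://{host}:{bound_port}"),
        // SocketAddr's Display adds the brackets that IPv6 needs.
        None => format!("http://{}", SocketAddr::new(detected_ip, bound_port)),
    }
}
```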

Additionally, all addresses and ports are now managed by a new component `AddressBook` that’s available via task-center. The address book is what powers handing off listeners down to services and it provides an interface to query all bound addresses such that we can return them in future `GetIdent` responses (not implemented yet).

A related improvement is how we configure the `metadata-client`’s `addresses`. Nodes no longer need to supply their own advertised address in `addresses`. They only need to know about one or more of their peers, and we'll now automatically include our own node if it's running a metadata server, thanks to early port binding and the `AddressBook`. This makes the `addresses` field in the config completely optional; for single-node setups, it's now empty by default.

This opens the door (not implemented) to adding support for `restate-server --peer=<address>`, which would allow restate nodes to join a cluster and bootstrap completely by connecting to a known peer address. This would let a node figure out its own correct advertised address and its metadata configuration without being passed a configuration file.

## Misc

- New config options (global and with per-service override) `bind-port`, `bind-ip`, `advertised-host`, `use-random-ports` and `listen-mode`
- `restate-hyper-uds` crate to support using unix-socket with hyper clients, we should have a **config-gated** option to allow invoking deployments via unix-sockets too (any takers?)
- Port numbers, unix socket names, and service names are defined in a set of zero-cost types in `restate-types::net::address`
- Type-checked usage of addresses across the codebase to denote which service they refer to. This avoids confusion where a type like `AdvertisedAddress` didn't make it clear which service it belonged to. You’ll see types like `AdvertisedAddress<AdminPort>` everywhere now.
- Documentation and configuration json schema express the service name and defaults according to the `ListenerPort` type parameter.
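To give a feel for the typed-address idea, here is a simplified sketch using zero-sized marker types. Only the names `AdvertisedAddress`, `AdminPort`, and `ListenerPort` appear in this PR description; the trait members, the `IngressPort` marker, the default port values, and the helper functions below are assumptions made for the example and may differ from `restate-types::net::address`.

```rust
use std::marker::PhantomData;

/// Per-service constants attached to a zero-sized marker type.
/// The member names are hypothetical.
pub trait ListenerPort {
    const NAME: &'static str;
    const DEFAULT_PORT: u16;
    const UDS_NAME: &'static str;
}

pub struct AdminPort;
pub struct IngressPort;

impl ListenerPort for AdminPort {
    const NAME: &'static str = "admin";
    const DEFAULT_PORT: u16 = 9070; // assumed default
    const UDS_NAME: &'static str = "admin.sock";
}

impl ListenerPort for IngressPort {
    const NAME: &'static str = "ingress";
    const DEFAULT_PORT: u16 = 8080; // assumed default
    const UDS_NAME: &'static str = "ingress.sock";
}

/// An address tagged with the service it belongs to. The tag exists only
/// at compile time and takes no space at runtime.
pub struct AdvertisedAddress<P: ListenerPort> {
    host: String,
    port: u16,
    _service: PhantomData<P>,
}

impl<P: ListenerPort> AdvertisedAddress<P> {
    pub fn new(host: impl Into<String>) -> Self {
        Self { host: host.into(), port: P::DEFAULT_PORT, _service: PhantomData }
    }

    pub fn describe(&self) -> String {
        format!("{} at {}:{}", P::NAME, self.host, self.port)
    }
}

/// Accepts only admin addresses; passing an
/// `AdvertisedAddress<IngressPort>` is a compile-time error.
pub fn connect_admin(addr: &AdvertisedAddress<AdminPort>) -> String {
    addr.describe()
}
```

The point of the design is that mixing up services becomes a type error rather than a misconfiguration discovered at runtime.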

## What did we lose?

- For simplicity and to reduce confusion, unix-socket paths are no longer configurable through `bind-address`. They are now always created under the `restate-data` directory (`fabric.sock`, `ingress.sock`, `admin.sock`, and `tokio.sock` if enabled), and the socket files are deleted on process shutdown. The benefit is that their locations are predictable for tools, users, and system operators, e.g. `restatectl -s unix:restate-data/admin.sock status`.
- Unix socket paths are limited to ~108 bytes on most unix systems, which limits the path length of restate-data. I've included a small optimization that converts the path to a relative one if the CWD is a prefix of the data-dir, but this is not a guaranteed solution. I'd say we evaluate how much of a problem this is in practice; we can provide a configurable base directory for unix-sockets via config and env variable as needed. It's literally a single variable in `AddressBook`.
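The relative-path optimization can be sketched like this. The function names and the constant are hypothetical; the 108-byte figure is the Linux `sun_path` size (some other systems use 104), and one byte is reserved for the trailing NUL.

```rust
use std::path::{Path, PathBuf};

/// Size of `sun_path` on Linux; other unix systems may use a smaller value.
pub const SUN_PATH_LIMIT: usize = 108;

/// Builds the socket path, made relative to `cwd` when the data directory
/// lives underneath it, which shortens the path passed to `bind`.
pub fn socket_path(cwd: &Path, data_dir: &Path, name: &str) -> PathBuf {
    // Fall back to the original path when `cwd` is not a prefix.
    let base = data_dir.strip_prefix(cwd).unwrap_or(data_dir);
    base.join(name)
}

/// Checks whether the path fits, leaving room for the NUL terminator.
pub fn fits_in_sun_path(path: &Path) -> bool {
    path.as_os_str().len() < SUN_PATH_LIMIT
}
```

When the data directory is outside the working directory, the path stays absolute and can still exceed the limit, which is why this is not a guaranteed solution.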