@AhmedSoliman AhmedSoliman commented Oct 16, 2025

A developer-focused version of restate that's embedded into the restate CLI. It starts restate in an ephemeral temporary directory that's auto-deleted after Ctrl+C.

  1. Supports `--use-random-ports`
  2. Emits very clean output: it doesn't show the server log, just a table of addresses.
  3. Opens the admin UI in the browser automatically on startup
  4. Runs the Counter service on a random port and auto-registers it by default, so you can play with the UI immediately with that service.
  5. Supports `--retain` to persist the temporary directory (meant for debugging). Choosing your own directory is not supported yet.

Stack created with Sapling. Best reviewed with ReviewStack.

This change reworks how we define and manage network ports, and achieves a bunch of goals in one go, at the cost of being quite big, of course.

## Unix-socket all the things

Restate server now supports listening on unix-sockets on all services (fabric, admin, ingress, and tokio-console). Even better, we listen on *both* the inet socket and the unix-socket by default. Unix sockets are automatically created under `restate-data/*.sock` and are cleaned up on shutdown (and if not, they are cleaned up on the next start). The unix-socket support includes the restatectl and restate CLIs: `restatectl -s unix:restate-data/admin.sock status` works, and setting env variables like `RESTATE_ADMIN_URL=unix:restate-data/admin.sock` works for `restate svc status`. You can also use `curl` to call ingress, e.g. `curl --unix-socket restate-data/ingress.sock http://local/Counter/123/add --silent --json 1` (note that the hostname in the URL is ignored when connecting over unix-sockets).

Listening on unix-sockets can be disabled for all ports or for specific services with a new option, `listen-mode`, which can be supplied via env variable, config file, or `restate-server --listen-mode=tcp`. Supported listen modes are `tcp`, `unix`, and `all` (default). When using `unix`, we’ll only listen on unix-sockets, and all advertised addresses will automatically be derived to show the unix-socket address.

As a result of this change, all unit tests now use unix-sockets: no more port conflicts with your locally running services, and potentially less flaky tests on CI. I’ve updated (and simplified) the local-cluster-runner utility to make use of it. There is still room for more improvements there.

## Let’s talk about ports

**Random Ports**

Restate can now select random ports on startup via `restate-server --use-random-ports=true` or `RESTATE_USE_RANDOM_PORTS=true`. Those are conflict-free (OS-selected), and because unix-sockets are also created, users (in the future) can use restate/restatectl by pointing them at the unix-socket until they figure out the ports. We print the advertised addresses for all services on startup, and with the new `restate-server --no-logo` the advertised addresses will be the first thing printed on stdout.
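For readers unfamiliar with OS-selected ports, here is a minimal sketch of the general technique, assuming nothing about restate's internals: bind to port `0`, then ask the listener which port the kernel assigned. The helper name `bind_random_port` is made up for this example.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener};

/// Binds to port 0 so the OS picks a free port, and returns the listener
/// together with the address it actually bound to.
/// Hypothetical helper, not restate's implementation.
fn bind_random_port(ip: &str) -> io::Result<(TcpListener, SocketAddr)> {
    // Port 0 means "any free port"; the kernel guarantees no conflict with
    // sockets that are already bound.
    let listener = TcpListener::bind((ip, 0))?;
    // The real port is only known after binding.
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}
```

Keeping the returned listener alive is what reserves the port; the address can then be printed or handed to other components.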

**Socket Activation**

Another cool feature is the support for `LISTEN_FDS`/systemd-compatible file-descriptor passing for listener sockets from the parent process to restate-server. A parent process can open the tcp listeners and even the unix-socket listeners (except for the fabric port) and pass the file descriptors to restate-server (e.g. via systemd socket activation, or a utility like `systemfd`).

For instance, `systemfd --no-pid -s http::9000 -- restate-server`, where `9000` becomes the ingress port. You can pass multiple ports; restate assigns them to services in a fixed order.
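A rough sketch of the receiving side of the systemd convention: `LISTEN_FDS` holds the number of passed descriptors, they start at fd 3, and `LISTEN_PID` (when present) must match the current process. The function names below are hypothetical, and the order in which restate-server maps descriptors to services is not shown here.

```rust
use std::net::TcpListener;
use std::os::unix::io::{FromRawFd, RawFd};

/// First inherited descriptor in the systemd socket-activation convention.
const SD_LISTEN_FDS_START: RawFd = 3;

/// Computes which descriptors were handed over, given the raw values of
/// LISTEN_PID and LISTEN_FDS. A missing LISTEN_PID is accepted, which is
/// the situation `systemfd --no-pid` creates.
fn inherited_fds(listen_pid: Option<&str>, listen_fds: Option<&str>, my_pid: u32) -> Vec<RawFd> {
    if let Some(pid) = listen_pid {
        // The descriptors are meant for another process; ignore them.
        if pid.parse::<u32>().ok() != Some(my_pid) {
            return Vec::new();
        }
    }
    let count = listen_fds
        .and_then(|v| v.parse::<i32>().ok())
        .unwrap_or(0)
        .max(0);
    (SD_LISTEN_FDS_START..SD_LISTEN_FDS_START + count).collect()
}

/// Adopts every inherited descriptor as a TCP listener.
/// Assumption: the parent passed valid, listening TCP sockets only.
fn adopt_tcp_listeners() -> Vec<TcpListener> {
    let pid = std::env::var("LISTEN_PID").ok();
    let fds = std::env::var("LISTEN_FDS").ok();
    inherited_fds(pid.as_deref(), fds.as_deref(), std::process::id())
        .into_iter()
        // SAFETY: relies on the assumption above; each fd is adopted once.
        .map(|fd| unsafe { TcpListener::from_raw_fd(fd) })
        .collect()
}
```

Because the sockets are already listening before the child starts, clients can connect immediately and their connections wait in the accept queue.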

*What does this bring to the table?*

1. Restarting restate-server without losing the socket listeners (ingress is the biggest winner), so clients will not observe connection errors during restart or upgrades.
2. Test harnesses, wrappers, or even our own tools can pre-allocate the tcp ports and listen on them before starting restate. Those external wrappers no longer need to wait for restate-server to start before they try to connect to it (no connection retries needed). **This unlocks embedding restate in tests or shipping a restate-lite version that only listens on anonymous unix sockets**. In fact, we are a couple of steps away from making it possible to fully embed restate for use-cases that don’t need a server. The only small thing that’s missing is the invoker using a pre-supplied file-descriptor and/or supporting unix-sockets to connect to deployments.
3. Restate server will now attempt to bind on all required ports and unix-sockets very early in its startup, before starting any roles or opening the database. This reduces the downtime window, and allows us to centralize port assignment (for random) and gives us a nice place to print all addresses.

## Advertised Addresses

The PR unifies how we manage and configure advertised addresses for all services, and it deprecates some of the old, inconsistently named configuration keys (admin and ingress advertised addresses). Most importantly, restate-server will now attempt to detect a reasonable value for the advertised address. If the listen mode is `unix` (only), it’ll print `unix:/` advertised addresses, and if it’s tcp, it’ll try to detect the publicly routable IP address of the node instead of using `127.0.0.1`. This makes docker deployments much nicer while maintaining the ability to override all of them as needed. There is also a new option to override only the hostname part of this address without interfering with random ports: `RESTATE_ADVERTISED_HOST=my_host.com` (or via cli, config) for a global override, and it can be applied per service (`RESTATE_ADMIN__ADVERTISED_HOST=`). In fact, all new options can be overridden per service.
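The derivation rules above can be sketched as a small pure function. Everything here is an assumption made for illustration: the `ListenMode` enum, the `advertised_address` function, the `http://` scheme, and the choice to advertise the tcp address in `all` mode are not taken from the actual code.

```rust
use std::net::Ipv4Addr;
use std::path::Path;

/// Hypothetical mirror of the `listen-mode` option values.
#[derive(Clone, Copy)]
enum ListenMode {
    Tcp,
    Unix,
    All,
}

/// Derives the address to advertise for one service.
/// IPv4 only, to keep the sketch short (IPv6 would need brackets).
fn advertised_address(
    mode: ListenMode,
    socket_path: &Path,
    detected_ip: Ipv4Addr,
    host_override: Option<&str>,
    bound_port: u16,
) -> String {
    match mode {
        // Unix-only: there is no tcp listener, so advertise the socket.
        ListenMode::Unix => format!("unix:{}", socket_path.display()),
        // Assumption: `all` advertises the tcp address, like `tcp`.
        ListenMode::Tcp | ListenMode::All => {
            // The override replaces only the host; the bound port is kept,
            // which is why it composes with random ports.
            let host = host_override
                .map(str::to_owned)
                .unwrap_or_else(|| detected_ip.to_string());
            format!("http://{host}:{bound_port}")
        }
    }
}
```

With a host override of `my_host.com` and a randomly bound port, the result keeps the random port and swaps only the host part.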

Additionally, all addresses and ports are now managed by a new component `AddressBook` that’s available via task-center. The address book is what powers handing off listeners down to services and it provides an interface to query all bound addresses such that we can return them in future `GetIdent` responses (not implemented yet).

A related improvement is how we configure `metadata-client`'s `addresses`. Nodes no longer need to supply their own advertised address in `addresses`. They only need to know about one or more of their peers; we'll now automatically include our own node if it's running a metadata server, thanks to early port binding and the `AddressBook`. This makes the `addresses` field in the config completely optional, and for single-node setups it's now empty by default.

This opens the door (not implemented) to adding support for `restate-server --peer=<address>`, which would allow restate nodes to join a cluster and bootstrap completely by connecting to a known peer address. This would let a node figure out its own correct advertised address and its metadata configuration without being passed a configuration file.

## Misc

- New config options (global and with per-service override) `bind-port`, `bind-ip`, `advertised-host`, `use-random-ports` and `listen-mode`
- `restate-hyper-uds` crate to support using unix-sockets with hyper clients. We should have a **config-gated** option to allow invoking deployments via unix-sockets too (any takers?)
- Port numbers, unix socket names, and service names are defined in a set of zero-cost types in `restate-types::net::address`
- Type-checked usage of addresses across the codebase to denote which service they refer to. This avoids the confusion where a type like `AdvertisedAddress` didn't make it clear which service it belonged to. You’ll see types like `AdvertisedAddress<AdminPort>` everywhere now.
- Documentation and configuration json schema express the service name and defaults according to the `ListenerPort` type parameter.
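To illustrate the zero-cost, type-checked address idea from the list above, here is a simplified sketch. `AdvertisedAddress`, `AdminPort`, and `ListenerPort` are names mentioned in this description, but the definitions below are guesses for illustration only; `IngressPort`, the fields, the methods, and the port values are assumptions and not the contents of `restate-types::net::address`.

```rust
use std::fmt;
use std::marker::PhantomData;

/// Marker trait implemented by zero-sized types, one per service.
trait ListenerPort {
    const NAME: &'static str;
    /// Placeholder value for this sketch only.
    const DEFAULT_PORT: u16;
}

struct AdminPort;
struct IngressPort; // assumed name

impl ListenerPort for AdminPort {
    const NAME: &'static str = "admin";
    const DEFAULT_PORT: u16 = 9070;
}

impl ListenerPort for IngressPort {
    const NAME: &'static str = "ingress";
    const DEFAULT_PORT: u16 = 8080;
}

/// An address tagged at the type level with the service it belongs to.
/// The tag occupies no memory at runtime.
struct AdvertisedAddress<P: ListenerPort> {
    host: String,
    port: u16,
    _service: PhantomData<P>,
}

impl<P: ListenerPort> AdvertisedAddress<P> {
    fn with_default_port(host: &str) -> Self {
        Self { host: host.to_owned(), port: P::DEFAULT_PORT, _service: PhantomData }
    }
}

impl<P: ListenerPort> fmt::Display for AdvertisedAddress<P> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}=http://{}:{}", P::NAME, self.host, self.port)
    }
}

/// Accepts only admin addresses; passing an ingress address is a compile error.
fn admin_url(addr: &AdvertisedAddress<AdminPort>) -> String {
    format!("http://{}:{}", addr.host, addr.port)
}
```

Calling `admin_url` with an `AdvertisedAddress<IngressPort>` fails at compile time, which is the kind of mix-up the type parameter is meant to rule out.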

# What did we lose?

- For simplicity and to reduce confusion, unix-socket paths are no longer configurable through `bind-address`. They are now always created under the `restate-data` directory (`fabric.sock`, `ingress.sock`, `admin.sock`, and `tokio.sock` if enabled). The socket files are deleted on process shutdown. The benefit is that their locations are predictable for tools, users, and system operators, e.g. `restatectl -s unix:restate-data/admin.sock status`.
- Unix socket names have a limit of ~108 bytes on most unix systems, which limits the path length of `restate-data`. I've included a small optimization that converts the path into a relative one if CWD is a prefix of the data-dir, but this is not a guaranteed solution. I'd say we evaluate how much of a problem this is in practice; we can provide a configurable base directory for unix-sockets via config and env variable as needed. It's literally a single variable in `AddressBook`.
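A sketch of the relative-path optimization described in the last bullet, under stated assumptions: the helper name `socket_path`, the error type, and the exact limit constant are made up for this example. The constant follows Linux, where `sun_path` is 108 bytes including the trailing NUL; other systems use smaller values.

```rust
use std::path::{Path, PathBuf};

/// Assumed limit for this sketch (Linux `sun_path` size, NUL included).
const MAX_SOCKET_PATH_BYTES: usize = 108;

/// Builds the path for a socket file, preferring a path relative to the
/// current working directory when the data dir lives underneath it.
fn socket_path(cwd: &Path, data_dir: &Path, name: &str) -> Result<PathBuf, String> {
    // If cwd is a prefix of data_dir, drop it to shorten the path.
    // Otherwise fall back to the data dir as given.
    let base = data_dir.strip_prefix(cwd).unwrap_or(data_dir);
    let path = base.join(name);

    // Leave one byte for the NUL terminator.
    let len = path.as_os_str().len();
    if len >= MAX_SOCKET_PATH_BYTES {
        return Err(format!(
            "socket path is {len} bytes, limit is {} bytes",
            MAX_SOCKET_PATH_BYTES - 1
        ));
    }
    Ok(path)
}
```

The relative form only helps when the process is started from a directory above the data dir, which is why it is not a guaranteed fix.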
Adds bound_addresses and advertised_addresses to GetIdent responses to enable tools and automations to extract that information for future use.
@AhmedSoliman
Contributor Author

Note to reviewers. This is a PR stack. Select the latest commit only to see the changes relevant to this particular PR.

github-actions bot commented Oct 16, 2025

Test Results

  7 files  ±0    7 suites  ±0   2m 43s ⏱️ -12s
 49 tests ±0   49 ✅ ±0  0 💤 ±0  0 ❌ ±0 
210 runs  ±0  210 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 6b968f3. ± Comparison against base commit 0dab81b.

♻️ This comment has been updated with latest results.

Contributor

@slinkydeveloper slinkydeveloper left a comment

I love the idea, I had something like that in my list for some time already.

I would love to fit this into the bigger picture of the quickstart. For that, I would do the following (stuff that I can take over after this PR):

  • Ship this in the templates: I guess `npm run restate-dev` should run this `restate dev` to let you play around.
  • Remove the server download from the quickstart, no need for that anymore!
  • Integrate some form of auto-registration (needs some changes in the SDKs too). The starting point is the template: you run two commands in the terminal, one `restate up` and one `npm run dev`, and poof, you're ready to send requests. In that case, I guess the auto-registered Counter service may not be needed?
  • Change some of the conf defaults to be more suitable for debugging. For example, one thing I've found very useful is increasing both the abort and inactivity timeouts to 1 day, to avoid triggering disconnections/suspensions during debugging sessions.
  • Also logging: I think it does make sense to show logs, but only the ones related to invocations. This is important because it gives users a sense that something is going on. What I think we can do there is simply tune the default RUST_LOG filters to show only the things we care about.
  • (moving forward) Some basic shortcuts while the dev command is running (like worker dev does), to do common things in development (kill all invocations, for example, or clear all states).

pub async fn run(State(_env): State<CliEnv>, opts: &Dev) -> Result<()> {
let cancellation = CancellationToken::new();
let temp_dir = tempfile::tempdir()?;
let data_dir = temp_dir.path().to_path_buf();
Contributor

in the IDE plugins i'm always using {project root dir}/.restate/dev-cluster.

Maybe here it makes sense to use {cwd}/.restate/dev-cluster.

Contributor Author

maybe this can be an option if we do --retain? I'm not sure if I'd expect us to delete the data on stop from the dot directory.

Contributor

@slinkydeveloper slinkydeveloper Oct 16, 2025

i think the opposite. By default retain, and if the user wants, clean it up on start using wipe or smth like that. This is what i personally found more useful.

Also temp_dirs might be problematic on some locked down machines (we had this problem with some of our bank customers), so $cwd/.restate/dev-cluster might work better.

Comment on lines 103 to 104
// register mock service
discover_deployment(&admin_uds, format!("http://{mock_svc_addr}/")).await?;
Contributor

Where is this deployment?

Contributor Author

magic! :)

it's also running in-process on a random port.

Contributor

yes what i meant is where is the code for that?

Contributor

Which SDK is that using?

Contributor Author

This introduces `restate-lite`, a crate that provides restate core functionality as a library. The library is intended for development and testing use cases, so it uses defaults tuned for that purpose.
@AhmedSoliman AhmedSoliman changed the title from "[experimental] restate up command" to "restate up command" Oct 16, 2025