rt: provide options to configure unhandled panic behavior

Currently, all panics on tasks are caught and exposed to the user via
`JoinHandle`. However, it is somewhat uncommon to use the `JoinHandle`.
Background tasks are spawned and may silently fail, causing the rest of the
application to hang. Also, in tests, a background task that panics can result in
the test hanging indefinitely, making debugging annoying.

That said, the current behavior is the correct default. Even if it weren't,
changing it now would be too late. A task boundary is a logical boundary to
separate failure. When implementing a server, it is not desirable for an
uncommon bug in one request handler to take down the entire process.

So, because different scenarios merit different behaviors, a runtime
configuration option could provide the user with the ability to pick the
behavior best suited for their case.

There are a few ways panics could be handled:

* Forward to the `JoinHandle` and ignore otherwise (what happens today).
* Forward to the `JoinHandle`, but if the `JoinHandle` is dropped (the result is
  ignored), shut down the runtime.
* Always shut down the runtime on panic.
* Pass the panic to a user provided callback to pick which of the above
  strategies to take.

So, to expose the different options to the user:

```rust
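use std::any::Any;

// TODO: naming?
#[non_exhaustive]
enum UnhandledPanic {
    /// Forward to the `JoinHandle` and ignore otherwise (what happens today).
    Ignore,
    /// Always shut down the runtime on panic.
    ShutdownRuntime,
    /// Shut down the runtime only if the `JoinHandle` was dropped.
    ShutdownRuntimeIfIgnored,
}

type PanicError = Box<dyn Any + Send + 'static>;

impl runtime::Builder {
    fn unhandled_panic_behavior(&mut self, behavior: UnhandledPanic) { ... }

    fn on_unhandled_panic(&mut self, f: impl Fn(PanicError) -> UnhandledPanic) { ... }
}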
```
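
To make the intended semantics of the three variants concrete, here is a minimal std-only sketch. It is not how Tokio would implement this: `ToyRuntime`, `run_task`, `handle_kept`, and `second_task_runs` are hypothetical names, tasks are plain closures run synchronously, and `handle_kept` stands in for "the caller still holds the `JoinHandle`".

```rust
use std::any::Any;
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
enum UnhandledPanic {
    Ignore,
    ShutdownRuntime,
    ShutdownRuntimeIfIgnored,
}

type PanicError = Box<dyn Any + Send + 'static>;

/// Hypothetical stand-in for a runtime; it only tracks the policy and
/// whether a panic has already triggered a shutdown.
struct ToyRuntime {
    policy: UnhandledPanic,
    shutdown: bool,
}

impl ToyRuntime {
    fn new(policy: UnhandledPanic) -> Self {
        ToyRuntime { policy, shutdown: false }
    }

    /// Runs one task to completion. Returns `None` if the runtime has already
    /// shut down (the task is dropped without running). Otherwise returns the
    /// task outcome, which plays the role of the `JoinHandle` result.
    fn run_task(
        &mut self,
        handle_kept: bool,
        task: impl FnOnce(),
    ) -> Option<Result<(), PanicError>> {
        if self.shutdown {
            return None;
        }
        // Catch the panic at the task boundary, as the runtime does today.
        let result = panic::catch_unwind(AssertUnwindSafe(task));
        if result.is_err() {
            // Apply the configured policy to decide whether to shut down.
            let shut_down = match self.policy {
                UnhandledPanic::Ignore => false,
                UnhandledPanic::ShutdownRuntime => true,
                UnhandledPanic::ShutdownRuntimeIfIgnored => !handle_kept,
            };
            if shut_down {
                self.shutdown = true;
            }
        }
        Some(result)
    }
}

/// Runs a panicking task followed by a healthy one and reports whether the
/// second task was still allowed to run.
fn second_task_runs(policy: UnhandledPanic, handle_kept: bool) -> bool {
    let mut rt = ToyRuntime::new(policy);
    let first = rt.run_task(handle_kept, || panic!("bug in task"));
    assert!(matches!(first, Some(Err(_))));
    rt.run_task(true, || {}).is_some()
}
```

In this model, `second_task_runs(UnhandledPanic::ShutdownRuntimeIfIgnored, true)` keeps the runtime alive because the panic was observable through the handle, while the same policy with `handle_kept == false` shuts it down.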

## Runtime shutdown

What does it mean to "shut down the runtime" on an unhandled panic? First, the
current shutdown behavior is executed. All in-flight tasks are forcibly aborted
and runtime resources are disabled. The next question is how to expose the
unhandled panic.

If the user enables "shutdown runtime on unhandled panic" and a panic does get
through, it seems likely that this is a bug. The `Runtime` methods in question
are:

* `spawn`
* `block_on`

`spawn` could maintain the current behavior when called after a runtime has
shut down: immediately drop the task and complete the `JoinHandle` with an error.
The `block_on` method does not return a `Result`. The only option I see is for it
to panic when the runtime has seen an unhandled panic.

To compensate, we could add methods on `Runtime` to query the runtime state,
e.g. `Runtime::status() -> Running | Shutdown | UnhandledPanic | ...`
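
As a rough illustration of how a status query would let callers avoid the `block_on` panic, here is a small std-only model. Everything in it is an assumption for the sake of the sketch: `RuntimeStatus`, `StatusRuntime`, and `SpawnError` are hypothetical names, tasks are synchronous closures, and only the unhandled-panic path of `block_on` is modeled.

```rust
/// Hypothetical status values, following the `Running | Shutdown |
/// UnhandledPanic | ...` idea above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
enum RuntimeStatus {
    Running,
    Shutdown,
    UnhandledPanic,
}

/// Hypothetical error standing in for the error a `JoinHandle` completes with.
#[derive(Debug, PartialEq, Eq)]
struct SpawnError;

/// Toy runtime that only carries a status.
struct StatusRuntime {
    status: RuntimeStatus,
}

impl StatusRuntime {
    fn status(&self) -> RuntimeStatus {
        self.status
    }

    /// After shutdown, the task is dropped and the handle completes with an error.
    fn spawn<T>(&self, task: impl FnOnce() -> T) -> Result<T, SpawnError> {
        match self.status {
            RuntimeStatus::Running => Ok(task()),
            _ => Err(SpawnError),
        }
    }

    /// `block_on` has no `Result` to report through, so it panics if the
    /// runtime has seen an unhandled panic.
    fn block_on<T>(&self, task: impl FnOnce() -> T) -> T {
        if self.status == RuntimeStatus::UnhandledPanic {
            panic!("runtime shut down after an unhandled panic");
        }
        task()
    }
}
```

A caller that wants to avoid the panic would check `status()` first and only call `block_on` when the result is `RuntimeStatus::Running`.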

## Initial implementation

As an initial step to get the feature going, I suggest implementing an MVP
version of the feature as an unstable API and only for the `current_thread`
runtime. This would let us explore the space more and try things out. The
initial implementation could also start by only letting the user pick between
the current behavior and `ShutdownRuntime`. So:

```rust
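use std::any::Any;

#[non_exhaustive]
enum UnhandledPanic {
    /// Forward to the `JoinHandle` and ignore otherwise (the current behavior).
    Ignore,
    /// Always shut down the runtime on panic.
    ShutdownRuntime,
}

type PanicError = Box<dyn Any + Send + 'static>;

impl runtime::Builder {
    fn unhandled_panic_behavior(&mut self, behavior: UnhandledPanic) { ... }
}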
```

When the multi-threaded runtime is selected, this option would have no effect.
Implementing it for the multi-threaded runtime would be required before
stabilizing the API, but because that implementation is much harder, we should
first gather data.

## Open questions

* [ ] How should unhandled panics be propagated? Should they be sent to `block_on` or the `JoinHandle`? (ref: https://github.com/tokio-rs/tokio/issues/4516)
* [ ] How should `LocalSet` and `JoinSet` work? Should they track their own settings or inherit from the runtime? Should there be a `LocalSet::builder()`?

## Known issues

* [ ] Switching the "current" scheduler context then panicking (https://github.com/tokio-rs/tokio/pull/4765#discussion_r910736494). In this case, "current" does not reference the runtime that should intercept the panic.
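
One way to picture this issue is with a thread-local "current runtime" slot. The sketch below is an assumption about the shape of the problem, not Tokio's actual context handling: `CURRENT`, `enter`, `current`, and `runtime_seen_at_panic` are hypothetical, and runtimes are identified by plain integers.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical "current runtime" slot, holding a runtime id.
    static CURRENT: RefCell<Option<u32>> = RefCell::new(None);
}

/// Returns the id of the runtime currently installed on this thread, if any.
fn current() -> Option<u32> {
    CURRENT.with(|c| *c.borrow())
}

/// Runs `f` with `id` installed as the current runtime, then restores the
/// previous value. Not panic-safe, which is fine for this sketch because the
/// panic below is caught before `f` returns.
fn enter<T>(id: u32, f: impl FnOnce() -> T) -> T {
    let prev = CURRENT.with(|c| c.replace(Some(id)));
    let out = f();
    CURRENT.with(|c| *c.borrow_mut() = prev);
    out
}

/// Models a task owned by runtime `owner` that switches the context to
/// `other` and then panics. Returns the runtime id that a panic handler
/// consulting `CURRENT` at that point would pick.
fn runtime_seen_at_panic(owner: u32, other: u32) -> Option<u32> {
    enter(owner, || {
        enter(other, || {
            let result = std::panic::catch_unwind(|| -> () { panic!("task bug") });
            assert!(result.is_err());
            // The handler sees `other`, although `owner` is the runtime
            // that is actually running the task.
            current()
        })
    })
}
```

In this model, `runtime_seen_at_panic(1, 2)` reports runtime 2, even though runtime 1 owns the panicking task and is the one whose unhandled-panic setting should apply.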
