[FEATURE] - Improve Actor Supervision #275

@Ruhrpottpatriot

Feature Description

Currently we can link actors either on spawn or after creation. However, this is only a sibling mechanism in which actors notify their peers when they die. Proper supervision also needs a parent/child relation. This would allow users to build systems where a single supervisor creates and manages multiple children, each of which must be alive for the whole application to work properly. If such a child actor dies, the supervisor could simply restart/reset it to a known good state and resume operation.
This would also let the user easily get the siblings under the same supervisor and pass messages between them.

Proposed Solution

Currently the link function simply stores references to the sibling on each actor, so that each gets notified when the other dies. I propose extending that API slightly by introducing a "RelationshipType" enum with the variants "Parent", "Sibling" and "Child". In addition, the ActorRef type should gain the following methods:

/// Gets all siblings of the current actor.
///
/// The siblings of an actor are every other actor that was linked against the current actor
/// and every actor that has the same parent as this actor. Siblings might not exist anymore,
/// therefore this method only returns `WeakActorRef`s, which the caller must upgrade first.
fn get_siblings(&self) -> Option<Vec<WeakActorRef<Actor>>>;

/// Gets the parent of the current actor, if it exists.
///
/// Since parents and children are in a hierarchical relationship, the parent actor
/// **must** exist for the child to exist, therefore it's safe to return a normal `ActorRef` here.
fn get_parent(&self) -> Option<ActorRef<Actor>>;

/// Gets all the children that are registered with this actor.
///
/// Usually a parent wants their children to be alive, but depending on the use-case a parent 
/// can work with one or more children being dead, therefore we only return a list of `WeakActorRef`.
fn get_children(&self) -> Option<Vec<WeakActorRef<Actor>>>;

In addition to this there would be two new methods to register links:

/// Registers the actor as a child of the current actor, also registers the current actor as parent on the child.
fn link_child<B: Actor>(&self, child_ref: &ActorRef<B>);

/// Registers the actor as a parent of the current actor, also registers the current actor as a child on the parent.
fn link_parent<B: Actor>(&self, parent_ref: &ActorRef<B>);
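
To make the proposed bookkeeping concrete, here is a minimal sketch of how the three relationship types could be stored and queried. It is only an illustration under simplifying assumptions: actors are plain `u64` ids instead of `ActorRef`/`WeakActorRef`, and `ActorId`, `Links`, `link_sibling`, `parent`, `children` and `siblings` are hypothetical names, not part of the existing API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an actor reference (assumption for this sketch).
pub type ActorId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RelationshipType {
    Parent,
    Sibling,
    Child,
}

/// Toy registry: for each actor, the actors it is linked to and how.
#[derive(Debug, Default)]
pub struct Links {
    links: HashMap<ActorId, Vec<(ActorId, RelationshipType)>>,
}

impl Links {
    /// Records a one-directional link, ignoring exact duplicates.
    fn add(&mut self, from: ActorId, to: ActorId, rel: RelationshipType) {
        let entry = self.links.entry(from).or_default();
        if !entry.contains(&(to, rel)) {
            entry.push((to, rel));
        }
    }

    /// Mirrors `link_child`: registers both directions of the relationship.
    pub fn link_child(&mut self, parent: ActorId, child: ActorId) {
        self.add(parent, child, RelationshipType::Child);
        self.add(child, parent, RelationshipType::Parent);
    }

    /// Mirrors `link_parent`: the same relationship, called from the child's side.
    pub fn link_parent(&mut self, child: ActorId, parent: ActorId) {
        self.link_child(parent, child);
    }

    /// Symmetric sibling link, comparable to today's `link`.
    pub fn link_sibling(&mut self, a: ActorId, b: ActorId) {
        self.add(a, b, RelationshipType::Sibling);
        self.add(b, a, RelationshipType::Sibling);
    }

    /// All actors linked to `actor` with the given relationship, in link order.
    fn related(&self, actor: ActorId, rel: RelationshipType) -> Vec<ActorId> {
        self.links
            .get(&actor)
            .into_iter()
            .flatten()
            .filter(|(_, r)| *r == rel)
            .map(|(id, _)| *id)
            .collect()
    }

    pub fn parent(&self, actor: ActorId) -> Option<ActorId> {
        self.related(actor, RelationshipType::Parent).first().copied()
    }

    pub fn children(&self, actor: ActorId) -> Vec<ActorId> {
        self.related(actor, RelationshipType::Child)
    }

    /// Explicitly linked siblings plus every other child of the same parent.
    pub fn siblings(&self, actor: ActorId) -> Vec<ActorId> {
        let mut out = self.related(actor, RelationshipType::Sibling);
        if let Some(parent) = self.parent(actor) {
            for child in self.children(parent) {
                if child != actor && !out.contains(&child) {
                    out.push(child);
                }
            }
        }
        out
    }
}
```

In this sketch `link_child` and `link_parent` end up in the same state, so it does not matter from which side the relationship is registered.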

This approach would also necessitate slight changes to the on_link_died logic. Currently the default behaviour is to kill all links, unless a link died a "normal" death. This is sensible for sibling relations. However, it's not the correct approach for a hierarchical relationship, where the parent/supervisor could restart the child and only die if restarting is impossible. I propose keeping the current approach for sibling relations and adopting Erlang's supervisor approach for parent/child relations. This would give us:

  • one_for_one: When a child process dies then the parent tries to restart only the child that died.
  • one_for_all: When any child process dies then the parent restarts all children.
  • rest_for_one: When a child dies, then the parent tries to restart the child that died and all children that were created after it.
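
The three strategies above differ only in which children get restarted. A minimal sketch, assuming children are tracked in start order and identified by their index; `RestartStrategy` and `children_to_restart` are hypothetical names. For `RestForOne` the sketch follows Erlang's behaviour, where the failed child itself is restarted together with every child started after it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RestartStrategy {
    OneForOne,
    OneForAll,
    RestForOne,
}

/// Returns the indices of the children to restart.
///
/// `child_count` is the number of supervised children in start order and
/// `died` is the index of the child that just died. An out-of-range `died`
/// yields an empty list.
pub fn children_to_restart(
    strategy: RestartStrategy,
    child_count: usize,
    died: usize,
) -> Vec<usize> {
    if died >= child_count {
        return Vec::new();
    }
    match strategy {
        // Only the child that died.
        RestartStrategy::OneForOne => vec![died],
        // Every child, regardless of which one died.
        RestartStrategy::OneForAll => (0..child_count).collect(),
        // The child that died plus everything started after it.
        RestartStrategy::RestForOne => (died..child_count).collect(),
    }
}
```

With four children and the second one (index 1) dying, this selects child 1 for `OneForOne`, children 0 to 3 for `OneForAll`, and children 1 to 3 for `RestForOne`.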

Naturally this approach needs logic to handle cases where the parent tries to restart one or multiple children and restarting fails, or where the children exit normally. Again, Erlang has a neat shutdown solution:

  • never: The parent never dies, even when all children are dead.
  • any_significant: The parent dies when any child that was marked as significant during linking died.
  • all_significant: The parent dies when all children that were marked as significant during linking died.
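
The shutdown policies above reduce to a small predicate over the children's state. The following is a sketch under assumptions: `AutoShutdown`, `ChildState` and `parent_should_die` are hypothetical names, and "significant" is modelled as a flag set at link time. It also assumes that `AllSignificant` never triggers when no child is marked as significant.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AutoShutdown {
    Never,
    AnySignificant,
    AllSignificant,
}

/// Minimal per-child state a supervisor would need for this decision.
#[derive(Debug, Clone, Copy)]
pub struct ChildState {
    /// Set when the child was linked as significant.
    pub significant: bool,
    /// False once the child has died and was not restarted.
    pub alive: bool,
}

/// Decides whether the parent should shut down given its children's state.
pub fn parent_should_die(policy: AutoShutdown, children: &[ChildState]) -> bool {
    let significant: Vec<&ChildState> =
        children.iter().filter(|c| c.significant).collect();
    match policy {
        // The parent keeps running even when every child is dead.
        AutoShutdown::Never => false,
        // One dead significant child is enough.
        AutoShutdown::AnySignificant => significant.iter().any(|c| !c.alive),
        // Every significant child must be dead; an empty set never triggers.
        AutoShutdown::AllSignificant => {
            !significant.is_empty() && significant.iter().all(|c| !c.alive)
        }
    }
}
```

A supervisor could evaluate this predicate each time a child dies and cannot, or should not, be restarted.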

Children always die when the parent dies.

The above strategy has the benefit of handling restart faults gracefully, and at no extra cost it also gives us a simple way to clean up an entire supervision tree by shutting down the root parent.
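
Cleaning up a whole tree from the root can be sketched as a depth-first walk. This is an illustration only: actors are plain `u64` ids, `shutdown_order` is a hypothetical helper, the hierarchy is assumed to be acyclic, and stopping descendants before their parent in reverse start order is an assumption borrowed from how Erlang supervisors terminate their children.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an actor reference (assumption for this sketch).
pub type ActorId = u64;

/// Returns the order in which actors are stopped when `root` shuts down.
///
/// `children` maps each parent to its children in start order. Descendants
/// are stopped before their parent, later-started children first.
pub fn shutdown_order(
    children: &HashMap<ActorId, Vec<ActorId>>,
    root: ActorId,
) -> Vec<ActorId> {
    let mut order = Vec::new();
    visit(children, root, &mut order);
    order
}

fn visit(
    children: &HashMap<ActorId, Vec<ActorId>>,
    node: ActorId,
    order: &mut Vec<ActorId>,
) {
    if let Some(kids) = children.get(&node) {
        // Reverse start order: the most recently started child stops first.
        for &kid in kids.iter().rev() {
            visit(children, kid, order);
        }
    }
    // A parent is only stopped once its whole subtree is down.
    order.push(node);
}
```

Because every child has exactly one parent, shutting down the root visits each actor in the tree exactly once.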

Erlang's supervision strategy has more to offer, e.g. children that are temporary (never restarted) or transient (only restarted after an abnormal death), but at the moment I think this would be a good starting point.

Alternatives Considered

Currently no alternatives, except manual implementation of the above.

Labels: enhancement