Description
From @hyperthunk on December 18, 2012 13:48
With any external resource, chances are things are going to go wrong sometimes. Following on from issue #98, there are various situations that might require manual intervention. Shutting down a bundle of connections/channels between two specific endpoints, forcibly terminating an endpoint, or even shutting down a whole transport might be required if connectivity drops in such a way that things get stuck.
Having an API for this would enable us to write tools (command line, web based, etc.) to assist with manual administration of a distributed system. The API needs to support the primary use case where an administrator connects to a running node from an external location and can take action from there.
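To make those actions concrete, here is a minimal sketch of the kind of command vocabulary such an API might expose. The type and constructor names are assumptions, not an existing Cloud Haskell interface:

```haskell
{-# LANGUAGE DeriveDataTypeable, DeriveGeneric #-}

import Data.Binary (Binary)
import Data.Typeable (Typeable)
import GHC.Generics (Generic)
import Network.Transport (EndPointAddress)

-- Hypothetical command vocabulary for the proposed management API.
data MgmtCommand
  = CloseConnectionsBetween EndPointAddress EndPointAddress
    -- ^ shut down the bundle of connections/channels between two endpoints
  | CloseEndPoint EndPointAddress
    -- ^ forcibly terminate an endpoint that appears stuck
  | CloseTransport
    -- ^ shut down the whole transport for the node
  deriving (Typeable, Generic)

instance Binary MgmtCommand
```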
There needs to be some entry point to which the external client connects, and of course there is no such concept in `Network.Transport`. I can see two ways to go about this, though there may be other possibilities too:
- couple the functionality with the node controller
- provide a service registry as part of `Network.Transport` itself
The point here is that in any running executable, we need some means by which we can connect in order to query for this information. As the node controller already provides this, it seems a sensible choice at first glance. The node controller is initialised with a `Transport`, so it can use the `Network.Transport` APIs to handle requested interactions.
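For illustration, a handler coupled to the node controller might interpret the commands sketched above using only the `Transport` (and its local `EndPoint`) that it already holds. `handleMgmt` is hypothetical, and note that plain `Network.Transport` only exposes `closeTransport` and `closeEndPoint` generically:

```haskell
import Network.Transport (Transport(..), EndPoint(..))

-- Sketch of a node-controller-coupled interpreter for MgmtCommand
-- (defined in the earlier sketch).
handleMgmt :: Transport -> EndPoint -> MgmtCommand -> IO ()
handleMgmt transport localEndPoint cmd = case cmd of
  CloseTransport ->
    closeTransport transport
  CloseEndPoint _addr ->
    -- generically we can only close our own endpoint; closing a remote one
    -- would go through that node's own management entry point
    closeEndPoint localEndPoint
  CloseConnectionsBetween _a _b ->
    -- no generic Network.Transport operation covers this; it would need
    -- backend-specific support
    putStrLn "connection-bundle teardown requires backend-specific support"
```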
So does it make sense to force all the interactions to go through the node controller? Another alternative would be to have a registered service process that gets booted with each node controller, and use `nsend` to talk to that process instead. Either way, the management API's data types need to reside in core CH so that nodes can communicate effectively without sharing the same image.
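A rough sketch of that registered-service-process alternative, assuming the `MgmtCommand` type above is `Serializable` and using a made-up registered name:

```haskell
import Control.Distributed.Process
import Control.Monad (forever)

-- A management agent booted alongside each node controller and addressed
-- by name. The name "management.agent" is an assumption, as is the
-- interpret callback (which could delegate to something like handleMgmt).
managementAgent :: (MgmtCommand -> Process ()) -> Process ()
managementAgent interpret = do
  getSelfPid >>= register "management.agent"
  forever $ receiveWait [ match interpret ]

-- An admin tool on another node could then issue, for example:
--   nsendRemote nodeId "management.agent" (CloseEndPoint someAddress)
```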
Providing some kind of service registry for `Network.Transport` itself is probably wrong. We'd need to provide an access point to the outside world and it seems crazy not to use the node controller(s) for this. I suppose one way of doing this would be to have the backends open up an additional management port and use a separate control channel for management messages - not sure what I think about forcing that on all backends though, and as @edsko mentioned elsewhere we're trying to keep actual functionality out of the `Network.Transport` layer and push it to the implementations. Forcing each implementation to write code to handle management requests seems wrong.
One problem with using the node controller as the entry point for a management (and/or stats gathering) API is that you need to know which backend is in use. As an administrator I guess you should know that anyway, so maybe it's not a problem.
I also think that we should put a secure HTTP-based API around this, so that you can open up the management capabilities without having to expose general node connectivity. For example, you might not want to expose the node outside your LAN, but still allow administration to take place over the internet provided TLS is in play. That probably belongs either in a separate top-level project, or in -platform, possibly bundled with other functionality into a single management web interface.
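As a rough illustration of that HTTPS wrapper, assuming warp/warp-tls and with hypothetical certificate paths, port, and `sendMgmt` callback:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp (defaultSettings, setPort)
import Network.Wai.Handler.WarpTLS (runTLS, tlsSettings)

-- TLS-terminated HTTP front end that forwards requests to the node's
-- management entry point. Real code would route the request path/body to a
-- specific MgmtCommand and require authentication.
serveManagement :: (MgmtCommand -> IO ()) -> IO ()
serveManagement sendMgmt =
    runTLS (tlsSettings "server.crt" "server.key")
           (setPort 8443 defaultSettings)
           app
  where
    app :: Application
    app _request respond = do
      sendMgmt CloseTransport  -- placeholder: parse the request instead
      respond (responseLBS status200 [("Content-Type", "text/plain")] "ok")
```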
Copied from original issue: #99