Description
From @hyperthunk on December 18, 2012 13:48
With any external resource, chances are things are going to go wrong sometimes. Following on from issue #98, there are various situations that might require manual intervention. Shutting down a bundle of connections/channels between two specific endpoints, forcibly terminating an endpoint, or even shutting down a whole transport might be required if connectivity drops in such a way that things get stuck.
Having an API for this would enable us to write tools (command line, web based, etc.) to assist with manual administration of a distributed system. The API needs to support the primary use case where an administrator connects to a running node from an external location and can take action from there.
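To make those actions concrete, here is a minimal sketch of the kind of command vocabulary such an API might expose. The type and constructor names are assumptions, not an existing Cloud Haskell interface:

```haskell
{-# LANGUAGE DeriveDataTypeable, DeriveGeneric #-}

import Data.Binary (Binary)
import Data.Typeable (Typeable)
import GHC.Generics (Generic)
import Network.Transport (EndPointAddress)

-- Hypothetical command vocabulary for the proposed management API.
data MgmtCommand
  = CloseConnectionsBetween EndPointAddress EndPointAddress
    -- ^ shut down the bundle of connections/channels between two endpoints
  | CloseEndPoint EndPointAddress
    -- ^ forcibly terminate an endpoint that appears stuck
  | CloseTransport
    -- ^ shut down the whole transport for the node
  deriving (Typeable, Generic)

instance Binary MgmtCommand
```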
There needs to be some entry point to which the external client connects, and of course there is no such concept in `Network.Transport`. I can see two ways to go about this, though there may be other possibilities too:
- couple the functionality with the node controller
- provide a service registry as part of `Network.Transport` itself
The point here is that in any running executable, we need some means by which we can connect in order to query for this information. As the node controller already provides this, it seems a sensible choice at first glance. The node controller is initialised with a `Transport`, so it can use the `Network.Transport` APIs to handle requested interactions.
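For illustration, a handler coupled to the node controller might interpret the commands sketched above using only the `Transport` (and its local `EndPoint`) that it already holds. `handleMgmt` is hypothetical, and note that plain `Network.Transport` only exposes `closeTransport` and `closeEndPoint` generically:

```haskell
import Network.Transport (Transport(..), EndPoint(..))

-- Sketch of a node-controller-coupled interpreter for MgmtCommand
-- (defined in the earlier sketch).
handleMgmt :: Transport -> EndPoint -> MgmtCommand -> IO ()
handleMgmt transport localEndPoint cmd = case cmd of
  CloseTransport ->
    closeTransport transport
  CloseEndPoint _addr ->
    -- generically we can only close our own endpoint; closing a remote one
    -- would go through that node's own management entry point
    closeEndPoint localEndPoint
  CloseConnectionsBetween _a _b ->
    -- no generic Network.Transport operation covers this; it would need
    -- backend-specific support
    putStrLn "connection-bundle teardown requires backend-specific support"
```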
So does it make sense to force all the interactions to go through the node controller? Another alternative would be to have a registered service process that gets booted with each node controller, and use `nsend` to talk to that process instead. Either way, the management API's data types need to reside in core CH so that nodes can communicate effectively without sharing the same image.
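A rough sketch of that registered-service-process alternative, assuming the `MgmtCommand` type above is `Serializable` and using a made-up registered name:

```haskell
import Control.Distributed.Process
import Control.Monad (forever)

-- A management agent booted alongside each node controller and addressed
-- by name. The name "management.agent" is an assumption, as is the
-- interpret callback (which could delegate to something like handleMgmt).
managementAgent :: (MgmtCommand -> Process ()) -> Process ()
managementAgent interpret = do
  getSelfPid >>= register "management.agent"
  forever $ receiveWait [ match interpret ]

-- An admin tool on another node could then issue, for example:
--   nsendRemote nodeId "management.agent" (CloseEndPoint someAddress)
```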
Providing some kind of service registry for `Network.Transport` itself is probably wrong. We'd need to provide an access point to the outside world and it seems crazy not to use the node controller(s) for this. I suppose one way of doing this would be to have the backends open up an additional management port and use a separate control channel for management messages - not sure what I think about forcing that on all backends though, and as @edsko mentioned elsewhere we're trying to keep actual functionality out of the `Network.Transport` layer and push it to the implementations. Forcing each implementation to write code to handle management requests seems wrong.
One problem with using the node controller as the entry point for a management (and/or stats gathering) API is that you need to know which backend is in use. As an administrator I guess you should know that anyway, so maybe it's not a problem.
I also think that we should put a secure HTTP-based API around this, so that you can open up the management capabilities without having to expose general node connectivity. For example, you might not want to expose the node outside your LAN, but still allow administration to take place over the internet provided TLS is in play. That probably belongs either in a separate top-level project, or in -platform, possibly bundled with other functionality into a single management web interface.
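As a rough illustration of that HTTPS wrapper, assuming warp/warp-tls and with hypothetical certificate paths, port, and `sendMgmt` callback:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp (defaultSettings, setPort)
import Network.Wai.Handler.WarpTLS (runTLS, tlsSettings)

-- TLS-terminated HTTP front end that forwards requests to the node's
-- management entry point. Real code would route the request path/body to a
-- specific MgmtCommand and require authentication.
serveManagement :: (MgmtCommand -> IO ()) -> IO ()
serveManagement sendMgmt =
    runTLS (tlsSettings "server.crt" "server.key")
           (setPort 8443 defaultSettings)
           app
  where
    app :: Application
    app _request respond = do
      sendMgmt CloseTransport  -- placeholder: parse the request instead
      respond (responseLBS status200 [("Content-Type", "text/plain")] "ok")
```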
Copied from original issue: #99