Description
Describe the enhancement:
When an agent bootstraps fleet-server, it connects to fleet-server's "internal server", a dedicated HTTP server running on localhost:8221. This server shares the same configuration as the server that fleet-server exposes for other agents to connect to, including TLS certificates (for one-way TLS and mTLS).
This is problematic because server TLS certificates should include the server names or IPs that the client is expected to validate. Currently the agent skips server-name verification so that it can connect to fleet-server's internal host without any TLS error caused by a server-name mismatch.
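To illustrate the trade-off described above, here is a minimal sketch of one way a Go client can verify the certificate chain while skipping only the server-name check. It is not taken from the agent's code, and the agent's actual implementation may differ. The helper names (`chainOnlyConfig`, `compareClients`) and the server name `fleet.example.internal` are made up for the example, and the server is a throwaway `httptest` server rather than fleet-server.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// chainOnlyConfig verifies the server's certificate chain against roots but
// deliberately does not check the server name (hypothetical helper).
func chainOnlyConfig(roots *x509.CertPool) *tls.Config {
	return &tls.Config{
		// InsecureSkipVerify turns off the built-in chain and name checks;
		// VerifyConnection below puts the chain check back.
		InsecureSkipVerify: true,
		VerifyConnection: func(cs tls.ConnectionState) error {
			opts := x509.VerifyOptions{
				Roots:         roots,
				Intermediates: x509.NewCertPool(),
				// DNSName is left empty, so no server-name check happens.
			}
			for _, c := range cs.PeerCertificates[1:] {
				opts.Intermediates.AddCert(c)
			}
			_, err := cs.PeerCertificates[0].Verify(opts)
			return err
		},
	}
}

// get performs one HTTPS request with the given TLS settings.
func get(url string, cfg *tls.Config) error {
	client := &http.Client{Transport: &http.Transport{TLSClientConfig: cfg}}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// compareClients contacts a throwaway local TLS server twice: once with
// strict name checking against a name the certificate does not cover, and
// once with the chain-only configuration.
func compareClients() (strictErr, chainOnlyErr error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	roots := x509.NewCertPool()
	roots.AddCert(srv.Certificate())

	// "fleet.example.internal" is a made-up name for illustration.
	strict := &tls.Config{RootCAs: roots, ServerName: "fleet.example.internal"}
	return get(srv.URL, strict), get(srv.URL, chainOnlyConfig(roots))
}

func main() {
	strictErr, chainOnlyErr := compareClients()
	fmt.Println("strict client error:", strictErr)
	fmt.Println("chain-only client error:", chainOnlyErr)
}
```

The strict client fails because the name does not match, while the chain-only client connects. That is the gap this issue wants to close: a dedicated certificate for the internal endpoint would let the client keep full verification enabled.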
Furthermore, fleet-server's internal server is an implementation detail. Ideally, users should not need to be aware of its existence or configuration. That is already largely true because both servers share one configuration, but as stated above, sharing is not ideal when using TLS. In addition, TLS connection errors during the fleet-server bootstrap process are often confusing because they relate to the internal host, which most users aren't aware of.
The agent already generates TLS certificates for communicating with the components (beats) it manages. This same mechanism can be used to generate certificates for the agent-to-fleet-server internal communication.
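As a rough illustration of what generating certificates for the internal connection could involve, the sketch below creates an in-memory CA and a server certificate for localhost using only the Go standard library. It is not the agent's existing mechanism. The function names, common names, and validity periods are assumptions chosen for the example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// newCA creates a short-lived, self-signed CA that exists only in memory.
func newCA() (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "hypothetical-internal-ca"},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newServerCert issues a server certificate for localhost, signed by the CA.
func newServerCert(ca *x509.Certificate, caKey *ecdsa.PrivateKey) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "hypothetical-internal-server"},
		DNSNames:     []string{"localhost"},
		IPAddresses:  []net.IP{net.ParseIP("127.0.0.1")},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// issueAndVerify generates both certificates and checks that the server
// certificate chains to the CA and is valid for the name "localhost".
func issueAndVerify() error {
	ca, caKey, err := newCA()
	if err != nil {
		return err
	}
	leaf, err := newServerCert(ca, caKey)
	if err != nil {
		return err
	}
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err = leaf.Verify(x509.VerifyOptions{Roots: roots, DNSName: "localhost"})
	return err
}

func main() {
	fmt.Println("verification error:", issueAndVerify())
}
```

Because the certificate lists localhost and 127.0.0.1 as subject alternative names, a client that trusts only this CA could keep server-name verification enabled for the internal connection.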
Additionally, we can enhance security and performance by moving away from localhost and using Unix sockets or named pipes instead. Keep in mind that this will require OS-specific configuration.
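For reference, serving HTTP over a Unix domain socket needs very little code in Go. The sketch below is only an assumption about how the transport could look: the socket path, the placeholder host in the URL, and the helper names are invented for the example. Windows named pipes would need a different listener and dialer, which is not shown here.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// serveOnUnixSocket starts an HTTP server that listens on a Unix domain
// socket instead of a TCP port (hypothetical helper).
func serveOnUnixSocket(path string, h http.Handler) (*http.Server, error) {
	ln, err := net.Listen("unix", path)
	if err != nil {
		return nil, err
	}
	// Restrict the socket file to the owning user.
	if err := os.Chmod(path, 0o600); err != nil {
		ln.Close()
		return nil, err
	}
	srv := &http.Server{Handler: h}
	go srv.Serve(ln)
	return srv, nil
}

// unixClient returns an HTTP client that always dials the given socket,
// whatever host appears in the request URL.
func unixClient(path string) *http.Client {
	return &http.Client{Transport: &http.Transport{
		DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "unix", path)
		},
	}}
}

// statusOverSocket starts a throwaway server in a temporary directory and
// fetches one response from it over the socket.
func statusOverSocket() (string, error) {
	dir, err := os.MkdirTemp("", "fs")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "fs.sock")

	srv, err := serveOnUnixSocket(path, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		io.WriteString(w, "ok")
	}))
	if err != nil {
		return "", err
	}
	defer srv.Close()

	// The host in the URL is a placeholder; DialContext ignores it.
	resp, err := unixClient(path).Get("http://fleet-server.internal/status")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := statusOverSocket()
	fmt.Println(body, err)
}
```

With this transport, access control comes from file permissions on the socket rather than from a listening TCP port, which is one reason the OS-specific setup matters.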
Open questions:
- Certificate Management: should users who need to manage their own PKI, for compliance or other reasons, also be required to provide the certificates for the internal agent-to-fleet-server communication?
- Currently the agent already generates the certificates for the agent-to-beats (components) communication. However, it's possible that the agent and its beats are treated as a single entity while the agent and fleet-server are not, which would require users in highly regulated environments to provide their own set of certificates.
Describe a specific use case for the enhancement or feature:
Users using mTLS are often not aware of the internal server, and in the past, due to issues on our side, the mTLS configuration was not valid for the internal server. With this enhancement the internal server would be an implementation detail that does not interfere with how users configure fleet-server's exposed HTTPS server.
What is the definition of done?
- Successful mTLS Connection: The agent successfully establishes an mTLS connection to the fleet-server's internal endpoint.
- Independent Certificate Generation:
- The agent generates a unique set of TLS certificates specifically for the internal connection, separate from those used for external communication.
- There is an integration test to validate that.
- Comprehensive Test Coverage:
- Existing Functionality: All existing integration tests continue to pass, ensuring no regression in functionality.
- Certificate Validation: New integration tests verify that the internal endpoint uses the agent-generated certificates and not the external server's TLS configuration.
- Documentation Update: The documentation is updated to reflect the changes in the agent's internal communication mechanism, including the new certificate handling and any user-configurable options that are added.
- Internal Server Removal (Conditional): If the enhancement includes moving away from localhost (as suggested above with Unix sockets and named pipes), the internal HTTP server on localhost:8221 is removed.
- User-Configured TLS (Conditional): If the enhancement allows users to configure the internal TLS connection, additional integration tests are implemented to cover user-provided certificate scenarios.
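The mTLS and certificate validation criteria above could be checked along the following lines. This is a sketch under stated assumptions, not an actual fleet-server integration test: it uses a throwaway `httptest` server in place of the internal endpoint, and the `issue` and `checkInternalMTLS` helpers and all common names are hypothetical.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// issue creates a certificate with the given common name. With a nil parent
// the result is a self-signed CA; otherwise it is a leaf signed by parent.
func issue(cn string, serial int64, parent *tls.Certificate) (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(serial),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth},
		IPAddresses:  []net.IP{net.ParseIP("127.0.0.1")},
	}
	signer, signerKey := tmpl, any(key)
	if parent == nil {
		tmpl.IsCA, tmpl.BasicConstraintsValid = true, true
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	} else {
		signer, signerKey = parent.Leaf, parent.PrivateKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return tls.Certificate{}, err
	}
	leaf, err := x509.ParseCertificate(der)
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key, Leaf: leaf}, err
}

// checkInternalMTLS returns the common name of the server certificate seen by
// a client that presents a certificate, plus the error seen by one that does not.
func checkInternalMTLS() (serverCN string, anonErr, err error) {
	ca, err := issue("hypothetical-internal-ca", 1, nil)
	if err != nil {
		return "", nil, err
	}
	serverCert, err := issue("hypothetical-internal-server", 2, &ca)
	if err != nil {
		return "", nil, err
	}
	clientCert, err := issue("hypothetical-agent-client", 3, &ca)
	if err != nil {
		return "", nil, err
	}
	pool := x509.NewCertPool()
	pool.AddCert(ca.Leaf)

	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	srv.TLS = &tls.Config{
		Certificates: []tls.Certificate{serverCert},
		ClientAuth:   tls.RequireAndVerifyClientCert, // this is what makes it mutual TLS
		ClientCAs:    pool,
	}
	srv.StartTLS()
	defer srv.Close()

	get := func(cfg *tls.Config) (*http.Response, error) {
		c := &http.Client{Transport: &http.Transport{TLSClientConfig: cfg}}
		return c.Get(srv.URL)
	}
	// A client that presents no certificate should be rejected.
	if resp, e := get(&tls.Config{RootCAs: pool}); e != nil {
		anonErr = e
	} else {
		resp.Body.Close()
	}
	resp, err := get(&tls.Config{RootCAs: pool, Certificates: []tls.Certificate{clientCert}})
	if err != nil {
		return "", anonErr, err
	}
	defer resp.Body.Close()
	return resp.TLS.PeerCertificates[0].Subject.CommonName, anonErr, nil
}

func main() {
	cn, anonErr, err := checkInternalMTLS()
	fmt.Println("server certificate CN:", cn)
	fmt.Println("client without certificate:", anonErr)
	fmt.Println("client with certificate:", err)
}
```

A real integration test would presumably also compare the certificate presented on the internal endpoint with the one configured for the external server, for example by fingerprint, to confirm that the two differ.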