Commit b918d59

Reply to comments
Signed-off-by: Henrique Spanoudis Matulis <[email protected]>
1 parent 0864090 commit b918d59

1 file changed

+111
-17
lines changed

proposals/2025-02-24_secret_providers.md

+111-17
@@ -24,15 +24,15 @@
2424
2525
## Why
2626

27-
The motivation behind this design document is to enhance the security and flexibility of secret management in Prometheus. Currently, Prometheus only supports reading secrets from the filesystem or directly from the configuration file, which can lead to security vulnerabilities and limitations when working with certain service providers.
27+
The motivation behind this design document is to enhance the flexibility of secret management in Prometheus. Currently, Prometheus only supports reading secrets from the filesystem or directly from the configuration file, which can be cumbersome when running it in certain environments or when frequent secret rotations are needed.
2828

29-
This proposal introduces secret discovery, similar to service discovery, where different secret providers can contribute code to read secrets from their respective APIs. This would allow for more secure and dynamic secret retrieval, eliminating the need to store secrets in the filesystem and reducing the potential for unauthorized access.
29+
This proposal introduces secret discovery, similar to service discovery, where different secret providers can contribute code to read secrets from their respective APIs. This would allow for more dynamic secret retrieval, eliminating the need to store secrets in the filesystem and simplifying the user experience.
3030

3131
### Pitfalls of the current solution
3232

33-
Storing secrets in the filesystem poses risks, especially in environments like Kubernetes, where any pod on a node can access files mounted on that node. This could expose secrets to attackers. Additionally, configuring secrets through the filesystem often requires extra setup steps in some environments, which can be cumbersome for users.
33+
In certain environments, configuring secrets through the filesystem requires extra setup steps, or might not be possible at all. This can be cumbersome for users.
3434

35-
Storing secrets inline can also pose risks, as the configuration file may still be accessible through the filesystem. Additionally it can lead to configuration files becoming cluttered and difficult to manage.
35+
Storing secrets inline is always possible, but it can lead to configuration files becoming cluttered and difficult to manage. Additionally, rotating inline secrets is more troublesome.
3636

3737
## Goals
3838

@@ -69,19 +69,20 @@ Wherever a `<secret>` type is present in the configuration files, we will allow
6969

7070
```
7171
secret_field:
72-
provider: <type of the provider>
73-
<property1>: <value1>
74-
...
75-
<propertyN>: <valueN>
72+
<type of the provider>:
73+
<property1>: <value1>
74+
...
75+
<propertyN>: <valueN>
7676
```
7777

7878
For example, when specifying a password fetched from the Kubernetes provider with an id of `pass2` in namespace `ns1` for the HTTP password field, it would look like this:
7979

8080
```
8181
password:
82-
provider: kubernetes
83-
namespace: ns1
84-
secret_id: pass2
82+
kubernetes:
83+
namespace: <ns>
84+
name: <secret name>
85+
key: <data's key for secret name>
8586
```
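A minimal Python sketch of how the single-key shape above could be validated, assuming the configuration has already been parsed into plain dictionaries. The function name `split_secret_field`, the error messages, and the `key` value `password` are hypothetical and not part of the proposal.

```python
from typing import Any, Dict, Tuple


def split_secret_field(field: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Split a parsed secret field into (provider type, provider properties).

    Assumes the shape shown above: a mapping with exactly one key, the
    provider type, whose value is a mapping of provider-specific properties.
    """
    if not isinstance(field, dict) or len(field) != 1:
        raise ValueError("a secret field must name exactly one provider")
    provider, properties = next(iter(field.items()))
    if not isinstance(properties, dict):
        raise ValueError(f"properties for provider {provider!r} must be a mapping")
    return provider, properties


# Parsed form of the `password` example, with illustrative values filled in.
password = {"kubernetes": {"namespace": "ns1", "name": "pass2", "key": "password"}}
provider, properties = split_secret_field(password)
print(provider, properties["namespace"])
```

Keying the mapping by provider type means a field can never name two providers at once, which the earlier `provider: <type>` layout could not guarantee by structure alone.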
8687

8788
### Inline secrets
@@ -99,6 +100,8 @@ The first case is that we have not gotten any secret value since startup. In thi
99100

100101
The second case is that we already have a secret value, but refreshing it has resulted in an error. In this case we should keep the component that uses this secret running with the potentially stale secret, and schedule a retry.
101102

103+
If we are starting up Prometheus and do not get any secret values, startup will continue and Prometheus will keep running and retry fetching the secrets. Components that are fully specified will run during this time. The idea is that it is better to send partial metrics than no metrics, and the emitted metrics for secrets can alert users if there is a problem with their service providers.
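The two error cases can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `SecretHolder`, the `fetch` callback, and the choice to report a never-fetched secret as `"error"` are hypothetical, and scheduling of retries is left to the caller.

```python
from typing import Callable, Optional


class SecretHolder:
    """Tracks one remotely fetched secret across refresh attempts."""

    def __init__(self) -> None:
        self.value: Optional[str] = None
        self.state = "error"  # assumption: "no value yet" is reported as "error"

    def refresh(self, fetch: Callable[[], str]) -> None:
        """Try to fetch the secret; never raise, so the caller can schedule a retry."""
        try:
            self.value = fetch()
            self.state = "success"
        except Exception:
            # Keep any previously fetched value and mark it as potentially stale.
            self.state = "stale" if self.value is not None else "error"


def failing_fetch() -> str:
    raise ConnectionError("provider unreachable")


holder = SecretHolder()
holder.refresh(failing_fetch)     # first case: nothing fetched since startup
holder.refresh(lambda: "s3cr3t")  # provider becomes reachable
holder.refresh(failing_fetch)     # second case: refresh fails, old value is kept
print(holder.state, holder.value is not None)
```

Because `refresh` swallows the error, a component that depends on the secret keeps running with the last known value while the retry is pending.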
104+
102105

103106
### Secret rotation
104107

@@ -129,7 +132,7 @@ A state enum describing in which error condition the secret is in:
129132
```
130133
# HELP prometheus_remote_secret_state Describes the current state of a remotely fetched secret.
131134
# TYPE prometheus_remote_secret_state gauge
132-
prometheus_remote_secret_state{provider="kubernetes", secret_id="pass1", state="none"} 0
135+
prometheus_remote_secret_state{provider="kubernetes", secret_id="pass1", state="success"} 0
133136
prometheus_remote_secret_state{provider="kubernetes", secret_id="auth_token", state="stale"} 1
134137
prometheus_remote_secret_state{id="myk8secrets", secret_id="pass2", state="error"} 2
135138
```
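As a rough illustration, the sample lines above could be produced by a helper like the one below. The function name `secret_state_sample` is hypothetical, the numeric values are read off the sample output, and label values are not escaped, so this is a sketch rather than a replacement for a real metrics client library.

```python
# Numeric values taken from the sample output above; treating them as a fixed
# mapping is an assumption of this sketch.
STATE_VALUES = {"success": 0, "stale": 1, "error": 2}


def secret_state_sample(provider: str, secret_id: str, state: str) -> str:
    """Format one prometheus_remote_secret_state sample line."""
    value = STATE_VALUES[state]
    return (
        f'prometheus_remote_secret_state{{provider="{provider}", '
        f'secret_id="{secret_id}", state="{state}"}} {value}'
    )


print(secret_state_sample("kubernetes", "auth_token", "stale"))
```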
@@ -141,19 +144,110 @@ Secret providers might require secrets to be configured themselves. We will allo
141144
```
142145
...
143146
password:
144-
provider: bootstrapped
145-
secret_id: pass1
146-
auth_token:
147-
provider: kubernetes
148-
secret_id: auth_token
147+
bootstrapped:
148+
secret_id: pass1
149+
auth_token:
150+
kubernetes:
151+
name: <secret name>
152+
key: <data's key for secret name>
149153
```
150154

155+
Note that there is a 'chicken and egg' problem here: you need credentials to access the secret provider itself. Normally this bootstrapping would be done through inline or filesystem secrets. In cloud environments, there is usually an identity associated with the machine that can be used instead. In both cases, however, this type of 'bootstrapping' doesn't really increase security, as you must already have access to the underlying secrets. Our goal here is just to decrease toil.
156+
151157
However, an initial implementation might only allow inline secrets for secret providers. This might limit the usefulness of certain providers that require sensitive information for their own configuration.
152158

153159
### Where will code live
154160

155161
Both the Alertmanager and Prometheus repos will be able to use secret providers. The code will eventually live in a separate repository specifically created for it.
156162

163+
## Open questions
164+
165+
* What is the process for creating a secret provider implementation?
166+
* How can we prevent too many dependencies from getting pulled in from different providers?
167+
168+
## Secret provider interfaces in the wild
169+
170+
A summary of popular secret providers
171+
### [HashiCorp Vault](https://developer.hashicorp.com/hcp/docs/vault-secrets)
172+
173+
```
174+
// Step 1: Authenticate with the Vault server
175+
// This typically involves providing credentials or using an authentication method (e.g., token, AppRole)
176+
authentication_response = authenticate_with_vault(authentication_method, credentials)
177+
auth_token = authentication_response.token
178+
179+
// Step 2: Specify the path to the secret you want to retrieve
180+
secret_path = "secret/data/my_application/database_credentials"
181+
182+
// Step 3: Read the secret data from the specified path using the auth token
183+
secret_data_response = read_vault_secret(secret_path, auth_token)
184+
```
185+
186+
### [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html)
187+
188+
```
189+
// Step 1: Configure AWS credentials and region
190+
// This is typically done via environment variables, IAM roles, or config files
191+
configure_aws_sdk()
192+
193+
// Step 2: Specify the name or ARN of the secret
194+
secret_name = "my/application/database_secret"
195+
196+
// Step 3: Retrieve the secret value from AWS Secrets Manager
197+
// The secret value is returned as a string, often containing JSON
198+
secret_value_response = get_aws_secret_value(secret_name)
199+
secret_password = secret_value_response.secret_string
200+
```
201+
202+
### [Azure Key Vault](https://learn.microsoft.com/en-us/azure/key-vault/general/overview)
203+
204+
```
205+
// Step 1: Authenticate with Azure Active Directory
206+
// This is often done using Managed Identity or a Service Principal
207+
authentication_client = create_azure_identity_client()
208+
credential = authentication_client.get_credential()
209+
210+
// Step 2: Create a Key Vault client
211+
key_vault_url = "https://my-key-vault-name.vault.azure.net/"
212+
key_vault_client = create_key_vault_secret_client(key_vault_url, credential)
213+
214+
// Step 3: Specify the name of the secret
215+
secret_name = "DatabasePassword"
216+
217+
// Step 4: Retrieve the secret
218+
secret = key_vault_client.get_secret(secret_name)
219+
```
220+
221+
### [Google Secret Manager](https://cloud.google.com/secret-manager/docs)
222+
223+
```
224+
// Step 1: Authenticate with Google Cloud
225+
// This is typically handled by the client library using environment variables or service account keys
226+
secret_manager_client = create_google_secret_manager_client()
227+
228+
// Step 2: Specify the secret name and version
229+
// Format: projects/PROJECT_ID/secrets/SECRET_NAME/versions/VERSION_ID (use 'latest' for the current version)
230+
secret_version_name = "projects/my-gcp-project/secrets/my-database-secret/versions/latest"
231+
232+
// Step 3: Access the specified secret version
233+
response = secret_manager_client.access_secret_version(secret_version_name)
234+
```
235+
236+
### [Kubernetes Secrets](https://kubernetes.io/docs/concepts/configuration/secret/)
237+
238+
```
239+
// Step 1: Configure Kubernetes client
240+
// This typically involves loading the kubeconfig file or using in-cluster configuration
241+
kubernetes_client = configure_kubernetes_client()
242+
243+
// Step 2: Specify the namespace and name of the secret
244+
secret_namespace = "my-application-namespace"
245+
secret_name = "my-database-secret"
246+
247+
// Step 3: Retrieve the secret object from the Kubernetes API
248+
secret_object = get_kubernetes_secret(secret_namespace, secret_name, kubernetes_client)
249+
```
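The providers above share a pattern: set up an authenticated client once, then fetch a secret by some reference. A minimal Python sketch of that shared shape follows. It is not the interface this proposal defines; `SecretProvider`, `InMemoryProvider`, and `resolve` are hypothetical names, and the in-memory provider talks to no real API.

```python
from typing import Dict, Protocol


class SecretProvider(Protocol):
    """Hypothetical common shape: authenticate once, then fetch by reference."""

    def fetch(self, ref: Dict[str, str]) -> str:
        ...


class InMemoryProvider:
    """Stand-in provider for illustration only."""

    def __init__(self, secrets: Dict[str, str]) -> None:
        # A real provider would authenticate here (token, IAM role, kubeconfig, ...).
        self._secrets = secrets

    def fetch(self, ref: Dict[str, str]) -> str:
        # Look the secret up by the "name" property of the reference.
        return self._secrets[ref["name"]]


def resolve(provider: SecretProvider, ref: Dict[str, str]) -> str:
    """Fetch a secret through any object that satisfies SecretProvider."""
    return provider.fetch(ref)


store = InMemoryProvider({"my-database-secret": "example-value"})
print(resolve(store, {"name": "my-database-secret"}))
```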
250+
157251
## Action Plan
158252

159253
* [ ] Create action plan after doc is stable!
