proposals/2025-02-24_secret_providers.md
+111 -17
@@ -24,15 +24,15 @@

 ## Why

-The motivation behind this design document is to enhance the security and flexibility of secret management in Prometheus. Currently, Prometheus only supports reading secrets from the filesystem or directly from the configuration file, which can lead to security vulnerabilities and limitations when working with certain service providers.
+The motivation behind this design document is to enhance the flexibility of secret management in Prometheus. Currently, Prometheus only supports reading secrets from the filesystem or directly from the configuration file, which can be cumbersome when running it in certain environments or when frequent secret rotations are needed.

-This proposal introduces secret discovery, similar to service discovery, where different secret providers can contribute code to read secrets from their respective APIs. This would allow for more secure and dynamic secret retrieval, eliminating the need to store secrets in the filesystem and reducing the potential for unauthorized access.
+This proposal introduces secret discovery, similar to service discovery, where different secret providers can contribute code to read secrets from their respective APIs. This would allow for more dynamic secret retrieval, eliminating the need to store secrets in the filesystem and simplifying the user experience.

 ### Pitfalls of the current solution

-Storing secrets in the filesystem poses risks, especially in environments like Kubernetes, where any pod on a node can access files mounted on that node. This could expose secrets to attackers. Additionally, configuring secrets through the filesystem often requires extra setup steps in some environments, which can be cumbersome for users.
+In certain environments, configuring secrets through the filesystem often requires extra setup steps, or might not be possible at all. This can be cumbersome for users.

-Storing secrets inline can also pose risks, as the configuration file may still be accessible through the filesystem. Additionally it can lead to configuration files becoming cluttered and difficult to manage.
+Storing secrets inline is always possible, but it can lead to configuration files becoming cluttered and difficult to manage. Additionally, rotating inline secrets is more troublesome.

 ## Goals

@@ -69,19 +69,20 @@ Wherever a `<secret>` type is present in the configuration files, we will allow

 ```
 secret_field:
-  provider: <type of the provider>
-  <property1>: <value1>
-  ...
-  <propertyN>: <valueN>
+  <type of the provider>:
+    <property1>: <value1>
+    ...
+    <propertyN>: <valueN>
 ```

 For example, when specifying a password fetched from the kubernetes provider with an id of `pass2` in namespace `ns1` for the HTTP password field, it would look like this:

 ```
 password:
-  provider: kubernetes
-  namespace: ns1
-  secret_id: pass2
+  kubernetes:
+    namespace: <ns>
+    name: <secret name>
+    key: <data's key for secret name>
 ```
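
With the provider type now used as the map key, a decoder has to check that each secret field names exactly one provider. The following is a minimal Go sketch of that check, not the proposal's implementation. The `SecretField` type and the `ProviderOf` function are hypothetical names, the field is assumed to be already decoded from YAML into nested maps, and the values `my-secret` and `password` are placeholders.

```go
package main

import "fmt"

// SecretField is a hypothetical in-memory form of a `<secret>` field after
// YAML decoding: provider type -> provider properties.
type SecretField map[string]map[string]string

// ProviderOf returns the single provider configured for the field, or an
// error if the field names zero providers or more than one.
func ProviderOf(f SecretField) (string, map[string]string, error) {
	if len(f) != 1 {
		return "", nil, fmt.Errorf("secret field must name exactly one provider, got %d", len(f))
	}
	// The map has exactly one entry, so the first iteration returns it.
	for name, props := range f {
		return name, props, nil
	}
	return "", nil, fmt.Errorf("secret field is empty")
}

func main() {
	// Mirrors the kubernetes example above, with placeholder values.
	field := SecretField{
		"kubernetes": {"namespace": "ns1", "name": "my-secret", "key": "password"},
	}
	name, props, err := ProviderOf(field)
	if err != nil {
		fmt.Println("invalid secret field:", err)
		return
	}
	fmt.Println("provider:", name, "namespace:", props["namespace"])
}
```

One consequence of the keyed form is that a field listing two providers is rejected at load time rather than silently picking one.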

 ### Inline secrets
@@ -99,6 +100,8 @@ The first case is that we have not gotten any secret value since startup. In thi

 The second case is that we already have a secret value, but refreshing it has resulted in an error. In this case we should keep the component that uses this secret running with the potentially stale secret, and schedule a retry.

+If we are starting up Prometheus and do not get any secret values, startup will continue and Prometheus will keep running and retry fetching the secrets. Components that are fully specified will run during this time. The idea is that it is better to send partial metrics than no metrics, and the emitted metrics for secrets can alert users if there is a problem with their service providers.
+

 ### Secret rotation

@@ -129,7 +132,7 @@ A state enum describing in which error condition the secret is in:
 ```
 # HELP prometheus_remote_secret_state Describes the current state of a remotely fetched secret.
@@ -141,19 +144,110 @@ Secret providers might require secrets to be configured themselves. We will allo
 ```
 ...
 password:
-  provider: bootstrapped
-  secret_id: pass1
-auth_token:
-  provider: kubernetes
-  secret_id: auth_token
+  bootstrapped:
+    secret_id: pass1
+auth_token:
+  kubernetes:
+    name: <secret name>
+    key: <data's key for secret name>
 ```

+Note that there is a 'chicken and egg' problem here, where you need to have credentials to access the secret provider itself. Normally this bootstrapping would be done through inline or filesystem secrets. For cloud environments, there is usually an identity associated with the machine in the environment that can be used. However, in both cases this type of 'bootstrapping' doesn't really increase security, as you should already have access to the underlying secrets. Our goal here is just to decrease toil.
+
 However, an initial implementation might only allow inline secrets for secret providers. This might limit the usefulness of certain providers that require sensitive information for their own configuration.

 ### Where will code live

 Both the Alertmanager and Prometheus repos will be able to use secret providers. The code will eventually live in a separate repository specifically created for it.

+## Open questions
+
+* What is the process for creating a secret provider implementation?
+* How can we prevent too many dependencies from getting pulled in from different providers?