Skip to content

Conversation

@esposem
Copy link
Contributor

@esposem esposem commented Sep 12, 2025

In operator-based deployments, kbs-client (and in general the admin API) should not be allowed to try and modify the resources, for two reasons:

1. trustee runs in a container so the fs is r/o
2. management of such resources is by design delegated to kubernetes configmaps

Therefore add admin_api_ro flags defaulting to false under the [admin] section of kbs-config.toml.

@esposem esposem requested a review from a team as a code owner September 12, 2025 07:18
@esposem
Copy link
Contributor Author

esposem commented Sep 12, 2025

@lmilleri @fitzthum FYI

@Xynnn007
Copy link
Member

I think this matches what @mkulke mentioned in #694 (comment) and I agree with this in short term.

In a relative mid/long term, I am thinking to gradually deprecate the fs backend and add a db backend. At that time, we would need to remove this admin_api_ro flag again. Thus, personally, I am in favor of this PR.

Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thansk! I think this flag is necessary feature atm. a couple of questions on cosmetics and semantics:

  • should we call the toggle and variable/field in the code admin_api_read_only? ro is not very grepable, and it's unidiomatic in rust to sacrifice characters for clarity.
  • Is it only the admin_api or the whole state of the service that is read_only? because only the admin can change state, we can drop the admin_api_... prefix.
  • it looks like we apply the restriction to POST methods. are we using PUT/DELETE/PATCH to alter the state? maybe we can turn it around and assert GET | HEAD when the read_only toggle is set? (and hope that no one in a plugin corner performs state changes with GET 🤞)
  • AFAIU, auth_public_key=Some(...) would be overridden by read_only=true. Can we change the internal representation (not the user toggles) of AdminConfig to reflect that by using an auth enum:
enum AdminAuth {
   None,
   PubKey(PathBuf),
   ReadOnly,
}

pub struct AdminConfig {
   pub insecure_api: bool,
   pub auth: AdminAuth,
}

@esposem
Copy link
Contributor Author

esposem commented Sep 12, 2025

Thanks for the review!

* should we call the toggle and variable/field in the code `admin_api_read_only`? `ro` is not very grepable, and it's unidiomatic in rust to sacrifice characters for clarity.

Ok makes sense

* Is it only the admin_api or the whole state of the service that is read_only? because only the admin can change state, we can drop the `admin_api_...` prefix.

Well this flags blocks all the admin API except GET or at least the one that call for admin verification first. I am not sure, should I remove the admin_api?

* it looks like we apply the restriction to POST methods. are we using PUT/DELETE/PATCH to alter the state? maybe we can turn it around and assert GET | HEAD when the read_only toggle is set? (and hope that no one in a plugin corner performs state changes with GET 🤞)

I am not seeing anything other than GET and POST methods in api_servers.rs. What I can do is check that the method() is != GET to refuse the call.

* AFAIU, auth_public_key=Some(...) would be overridden by read_only=true. Can we change the internal representation (not the user toggles) of AdminConfig to reflect that by using an auth enum:

Will do

enum AdminAuth {
   None,
   PubKey(PathBuf),
   ReadOnly,
}

pub struct AdminConfig {
   pub insecure_api: bool,
   pub auth: AdminAuth,
}

@esposem
Copy link
Contributor Author

esposem commented Sep 12, 2025

AFAIU, auth_public_key=Some(...) would be overridden by read_only=true. Can we change the internal representation (not the user toggles) of AdminConfig to reflect that by using an auth enum:

Actually, no I don't think this fits in here. I am not blocking all the APIs (I could do that if we decide to do so) but the idea is that the kbs-client is still able to fetch resources, if for whatever reasons it still needs to do that. Therefore the public key and insecure_admin are still valid flags, but if this new flag is enabled, they will be restricted to only do GET methods.
But we can still restrict only the kbs-client with the right key to perform such operations, thus we need to leave these two flags and not get replaced by the read_only one.

@esposem
Copy link
Contributor Author

esposem commented Sep 12, 2025

I pushed a new version applying almost all changes except

  • admin_api prefix is still there
  • no enum as I explained in the previous comment, I don't think it applies

@mkulke
Copy link
Contributor

mkulke commented Sep 12, 2025

AFAIU, auth_public_key=Some(...) would be overridden by read_only=true. Can we change the internal representation (not the user toggles) of AdminConfig to reflect that by using an auth enum:

Actually, no I don't think this fits in here. I am not blocking all the APIs (I could do that if we decide to do so) but the idea is that the kbs-client is still able to fetch resources, if for whatever reasons it still needs to do that. Therefore the public key and insecure_admin are still valid flags, but if this new flag is enabled, they will be restricted to only do GET methods. But we can still restrict only the kbs-client with the right key to perform such operations, thus we need to leave these two flags and not get replaced by the read_only one.

I see. if it is an expected combination to have both admin_pub_key set and read_only=true, then we need both.

Copy link
Contributor

@mkulke mkulke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, maybe run the formatter to satisfy the ci

@esposem
Copy link
Contributor Author

esposem commented Sep 12, 2025

The question is... do we need the kbs to be able to still fetch (GET) stuff from the trustee? What would be the use case? Otherwise I can simply block completely the api admins, meaning tools like kbs-client get completely useless in such case.
Alternatively we can have 2 flags
disable_admin_reads
disable_admin_writes
What do you think?

@mkulke
Copy link
Contributor

mkulke commented Sep 12, 2025

The question is... do we need the kbs to be able to still fetch (GET) stuff from the trustee? What would be the use case? Otherwise I can simply block completely the api admins, meaning tools like kbs-client get completely useless in such case. Alternatively we can have 2 flags disable_admin_reads disable_admin_writes What do you think?

I suppose that's a different question. since the authN/Z framework will be reworked anyway, I would just stick to the read_only flag for now to avoid broken states like a brain-split policy.

@esposem
Copy link
Contributor Author

esposem commented Sep 12, 2025

ok so let's leave it as it is, let me apply the HEAD and formatting fixes and we should be done

@esposem esposem force-pushed the admin_api_ro branch 2 times, most recently from b7884c1 to bcc2ed5 Compare September 12, 2025 11:22
@fitzthum
Copy link
Member

This seems fine, but I am still skeptical about some of the things the operator is doing. The main area of concern is that setting reference values via the operator circumvents the extractors built into the RVPS. Technically this PR doesn't really affect that because when we set reference values via the KBS admin api, we can only use the sample extractor anyway. The other extractors are accessed through the rvps endpoint using the rvps-tool.

Either way, we should think if it really makes sense to have two ways to do everything, and how the operator functionality will be reconciled with upcoming Trustee-in-coco work.

@esposem esposem force-pushed the admin_api_ro branch 3 times, most recently from be159ca to 61c4a3a Compare September 15, 2025 07:39
In operator-based deployments, kbs-client (and in general the admin API)
should not be allowed to try and modify the resources, for two reasons:

1. trustee runs in a container so the fs is r/o
2. management of such resources is by design delegated to kubernetes
   configmaps

Therefore add admin_api_ro flags defaulting to false under the [admin]
section of kbs-config.toml.

Signed-off-by: Emanuele Giuseppe Esposito <[email protected]>
@esposem
Copy link
Contributor Author

esposem commented Sep 15, 2025

The main area of concern is that setting reference values via the operator circumvents the extractors built into the RVPS. Technically this PR doesn't really affect that because when we set reference values via the KBS admin api, we can only use the sample extractor anyway. The other extractors are accessed through the rvps endpoint using the rvps-tool.

So I did my own little research and I think I understand what you are saying: basically the storage way to upload a rsvp via kbs-config.toml bypasses all security checks. IMHO there are few points:

  1. rsvp-tool and alike won't really make sense in kubernetes, because I believe there shouldn't be any permission/restriction difference from an admin being able to load the rvps via configmap or using a tool like rvps-tools running as a container probably with a configmap attached to it. So what do we actually verify if the rsvp-tool pod and the trustee pod are run by the same admin with same privilege? Unless we want to allow only some admins within the cluster to be able to set/update rsvps? Trustee runs in a secure cluster so anyone with admin access is trusted, and I guess having a tool to change the rsvp just adds another attack vector
  2. probably I should extend also the rsvp apis to support r/o, as extending/updating the rsvp in any other way other than configmap doesn't really make sense

in a non-operator deployment rsvp tools make complete sense, but I am not totally convinced it could work in an operator environment

@mkulke
Copy link
Contributor

mkulke commented Sep 15, 2025

basically the storage way to upload a rsvp via kbs-config.toml bypasses all security checks

I don't think this is an argument about security. it's rather about processing-functionality available in the rvps-tool that is absent when using the trustee operator.

(the overarching question, if I understood it correctly, would be whether it's desirable at all to use the trustee operator for deployment and management. see here. my take: afaik the operator is currently the closest to anything resembling a production deployment, since authn/z, state and other non-functional aspects are still WIP in trustee itsself)

@esposem
Copy link
Contributor Author

esposem commented Sep 15, 2025

Yes but so as the other issue mentions, we should clarify clear paths and usages for each use case. I was even thinking to enforce these flags into the operator repo, for example enforce admin_api_read_only=true, meaning it won't even start if the flag is missing.

@fitzthum
Copy link
Member

since authn/z, state and other non-functional aspects are still WIP in trustee itsself

This is a good thing to mention. We have some plans to add a more sophisticated admin framework to Trustee (see #926). This would be bypassed by the operator. This would be much more significant duplication than the RVPS stuff.

@fitzthum
Copy link
Member

I don't think this is an argument about security. it's rather about processing-functionality available in the rvps-tool that is absent when using the trustee operator.

Yes. When using the operator you load RVs via a config map. This bypasses the functionality of the extractors. I guess you could create some workaround where there is an additional container running the rvps tool that reads from the config map and adds stuff to the RVPS, but that is kind of weird.

@esposem
Copy link
Contributor Author

esposem commented Sep 16, 2025

FWIW I added another flag for the rvps standalone server too, but didn't document it in kbs as it doesn't make sense (rvps running in kbs is only accessible via admin apis as far as I understood). This should cover also the rvps-tool case.

@esposem esposem force-pushed the admin_api_ro branch 3 times, most recently from 1262644 to b87a2d3 Compare September 16, 2025 08:45
Introduce rvps read_only flag, that forces the rvps server to only allow
queries and not register new reference values.

This is useful for operator/kubernetes/container deployment where the
rvps is run as standalone and therefore not configured via
kbs-config.toml.

Signed-off-by: Emanuele Giuseppe Esposito <[email protected]>
@fitzthum
Copy link
Member

Btw @esposem let me clarify something. Is it all operator-based deployments that don't support writing to the filesystem or is that just on OCP?

@lmilleri
Copy link
Member

Btw @esposem let me clarify something. Is it all operator-based deployments that don't support writing to the filesystem or is that just on OCP?

@fitzthum I think the rationale of having this PR is to make sure that in operator-based deployments, the trustee instance is enforced to be (almost) stateless. This should help the trustee HA use case (every trustee instance must have the same file system content - policies, ref-values, etc). This works when using k8s configmaps and secrets for modifications.
Btw maybe this is just the first iteration, considering the future evolution of trustee, e.g. fs -> db, etc

@fitzthum
Copy link
Member

fitzthum commented Sep 26, 2025

I think we should try to move to a database backend ASAP. Using a filesystem for HA means that Trustee can't write anything and policies, resources, and RVs have to be provisioned via the control plane. This is really at odds with how the Trustee admin interfaces work and won't work for Trustee in CoCo.

If we use a database, you can still keep your config map logic in place as an option. It will just need to be implemented via a KBS-Client container or a database client.

@fitzthum
Copy link
Member

fitzthum commented Nov 3, 2025

This is sort of superseded by #1014 which can disable all the admin endpoints. That's not quite the same as making admin stuff readonly, but hopefully we are going to move in another direction there anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants