-
Notifications
You must be signed in to change notification settings - Fork 142
Key-Value Storage and Policy Engine changes for Trustee Stateless #1156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Key-Value Storage and Policy Engine changes for Trustee Stateless #1156
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. A couple comments. I guess the main thing is about the instance abstraction and whether we really need it.
| } | ||
|
|
||
| impl ShimConfig { | ||
| pub fn to_instance_config(&self, instance: &str) -> Config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess one of the key ideas here is to have the shim vs the instance. I don't think these terms are very clear, but my fundamental question is why not just set the instance at runtime. As in when we make a request to the kv store, we could provide the "instance." Then we wouldn't need any shim config. We would just use the various config structs directly (but remove the instance field from them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a thread.
In short, my original PR version allowed precise configuration of the "instance" for each storage and each place. Following @mkulke's suggestion, I hardcoded the instance name. This way, users only need to configure one backend storage, and the subsequent instances are automatically generated, eliminating complex maintenance. The benefits of sacrificing this configuration flexibility could outweigh its drawbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this seems like a more complicated way to have something that is less powerful. I guess I am fine with it if you can find a better name for the ShimConfig or figure out way to not need that struct at all (like an optional field or something).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact We can fully get rid of original Config structs and rename the ShimConfigs to Configs, and change the new function like
impl PostgresClient {
pub async fn new(config: Config, namespace: String) -> Result<Self> {
...
}where the namespace is table for postgres, and the new Config is the current ShimConfig which does not have a table field. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me make a separate commit for this for a view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| pub storage_type: KeyValueStorageType, | ||
|
|
||
| /// Backend-specific configuration | ||
| pub backends: KeyValueStorageStructConfig, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we combine these two fields into one? We've often used an enum with a config in each variant for something like this. Either way, this should be backend not backends, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you can see, the configuration of which storage we use is separated from all user-defined storage. This is because I also want to decouple the session state of RCAR in the future. In my opinion, storing session state in an in-memory database like Redis is better than storing it in a traditional database because access and changes are more frequent. I can simply add a field to specify the storage it needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the idea is to use different storage types for different components, why not specify the configuration for those for each component. We can have one global storage backend enum with a config inside it and then each component can have an optional storage backend enum.
Having multiple storage backends centrally is a bit confusing and what if you wanted to have two components use the same type of backend but with different configurations? This seems to assume that there would be at most one of each.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. You are right that in this way we do not support different components to use different configurations of a same backend. I believe both methods have their pros and cons. I'll make examples then ask @bpradipt @lmilleri - who might be the direct consumer to trustee for their comments.
Way 1 - Fine grained configuration
[http_server]
sockets = ["0.0.0.0:8080"]
insecure_http = true
[admin]
type = "InsecureAllowAll"
[attestation_token]
[attestation_service]
type = "coco_as_builtin"
[attestation_service.attestation_token_broker]
duration_min = 5
# point 1: attestation service policy storage
[attestation_service.attestation_token_broker.policy_storage]
storage_type = "LocalJson"
file_path = "/opt/confidential-containers/attestation-service/ear-policies.json"
[attestation_service.rvps_config]
type = "BuiltIn"
# point 2: reference values storage
[attestation_service.rvps_config.reference_values_storage]
storage_type = "LocalFs"
file_path = "/opt/confidential-containers/attestation-service/reference_values"
[[plugins]]
name = "resource"
backend = "kvstorage"
# point 3: resource plugin storage
storage_type = "Memory"
# point 4: policy engine storage
[policy_engine]
storage_type = "Postgres"
db = "test"
table = "test"
username = "user"
port = 5432
host = "1270.0.1"Way 2 - An Overall configuration point
# Unified storage backend configuration
# This single configuration will be used for:
# - KBS policy engine (namespace: "kbs")
# - Resource plugin storage (namespace: "resource")
# - Built-in AS policy storage (namespace: "attestation-service-policy")
# - Built-in AS RVPS storage (namespace: "reference-value")
[storage_backend]
storage_type = "LocalFs"
[storage_backend.backends.local_fs]
dir_path = "/opt/confidential-containers/trustee"
[http_server]
sockets = ["0.0.0.0:8080"]
insecure_http = true
[admin]
type = "InsecureAllowAll"
[attestation_token]
[attestation_service]
type = "coco_as_builtin"
[attestation_service.attestation_token_broker]
duration_min = 5
[attestation_service.rvps_config]
type = "BuiltIn"
[[plugins]]
name = "resource"
backend = "kvstorage"Obviously,
| Ability | Way 1 | Way 2 |
|---|---|---|
| Set different storage point with different backends and different configurations | N | Y |
| When there is a change upon backend storage, we only need to change one position / The configuration is smaller | Y | N |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw after creating this table, I interviewed some users within Alibaba Cloud who use Trustee. They preferred Option 2 – they preferred the simpler configuration and they are not developers thus do not care about the complex cases... :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe I'm misunderstanding, but could we offer a generic option that is overridable in the individual subsections? somewhat like e.g. working_directory field in github actions that you can define on job level and then override on individual steps?
Obviously, this only works if we have no relations between the entities that you would like to define in a schema. We should stop and be sure that we rule this out as a concept (e.g. a user_id on a policy_table). Hand-crafting transactions over relational data that doesn't sit in a common schema is very annoying and bug prone.
If we want to do this in the future, then I'd be leaning towards always prescribing a database and removing flexibility:
app: sqlite for local deployments, postgres otherwise. sessions: sled local development , redis otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, similar to @mkulke it seems like the easiest for users would be to have one global entry for the kvstorage. In most cases this is the only thing that users would adjust. All of our different components would default to using kvstorage backend, which would use the global config. It seems like this is the situation that we would want most people to use.
Then, optionally components like the rvps and resource plugin would accept a storage config. If this is set, a separate kvstorage would be created for this component with whatever config is specified. I guess this is closer to option 2 but not quite the same.
btw is the plan to keep supporting other storage backends besides kvstore, e.g. the existing json storage in the RVPS? In the examples above it's a little unclear when we are talking about the kvstore (and whatever backend it uses) and when we're talking about a different storage backend entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I will think about the override logic. @mkulke up to now we do not need this kind of abstraction, but we can explicitly get rid of non-db storages in future if we need.
btw @fitzthum do you mean the kvstorage under plugins section of way 2? It might not be a good name and it means to use the unified storage backend configuration. The other options for the backend can be pkcs11, etc.
And the original JSON storage is included as a plugin LocalJson in the key-value-storage crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, does it make sense to add the override logic after the current series of stateless refactoring work? I do not think it would require huge change, but prefer to finish the stateless work first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the plugin backend abstraction should stay since we have other implementations. For RVPS and other things I don't think we need that intermediate abstraction. They can simply always use a kv storage of some kind. Either the global one or one specified for that component.
I think it's probably fine to refactor the config later, although that is in some ways a breaking change.
To consolidate all backend storage configurations (postgres, local_json, local_fs) into a single configuration structure. This change prepares for centralizing all backend storage configuration sections in the config file. Changes: - Add StorageBackendConfig struct to support unified configuration across multiple components with instance-based data separation - Replace KeyValueStorageConfig enum with StorageBackendConfig - Add serde aliases for lowercase storage type variants - Make KeyValueStorageStructConfig fields public for better access - Add comprehensive documentation for unified storage backend configuration including instance concept, configuration structure, and backend-specific properties for LocalFs, LocalJson, Postgres, and Memory backends - Adds KeyValueStorageStructConfig with optional backend-specific config fields - Introduce ShimConfig for each backend to hold shared configuration - Add KeyValueStorageType enum to specify backend type - Replace to_key_value_storage() with to_client_with_namespace() to support multiple namespaces per backend - Remove storage field from PolicyEngineConfig, accept storage instance directly - Add InvalidConfiguration error variant This refactoring enables future work to unify all backend storage configuration sections in the configuration file. Signed-off-by: Xynnn007 <[email protected]>
- Remove KeyValueStorageConfig export (no longer needed) - Make PolicyEngine::new synchronous (storage is pre-initialized) Signed-off-by: Xynnn007 <[email protected]>
6eb129e to
583bb63
Compare
Thread #1092
I made some modifications to my original stateless trustee PR, mainly to address @mkulke 's point about keeping configuration files in one place as much as possible. To this end, I made changes to the policy-engine and key-value libraries, as well as the kbs/as/rvps libraries. I split this into two PRs; this first PR primarily involves changes to the underlying libraries. The overall presentation is in the PR
To sum up, the final config looks like (e.g. kbs)