Feat/spiffe add per-tenant identity configuration and token delegation gRPC APIs and a new DB table.#490
Conversation
610d7b7 to
e386435
Compare
…a and gRPC config APIs - initial draft checkin
e386435 to
60ca4cf
Compare
kensimon
left a comment
There was a problem hiding this comment.
I don't think we should be throwing an untyped Struct into the identity config here. It's one thing for the db column to be a JSONB blob, but the RPC call really needs to be strongly typed.
| /// Set identity config for an org. On first create, generates a placeholder key. | ||
| /// Caller must ensure tenant exists and global machine-identity is enabled. | ||
| #[allow(clippy::too_many_arguments)] | ||
| pub async fn set( |
There was a problem hiding this comment.
Could you create a struct for this rather than having 10 different arguments? The call sites look like a huge sea of strings and it's very difficult to know what is what. (Having to #[allow(clippy::too_many_arguments)] is a good sign this needs to be reworked.)
There was a problem hiding this comment.
agree. It should take TenantIdentityConfig or at least the parts of it that are actually settable. And that struct should get moved into api-model.
|
|
||
| // Identity configuration (per-org) | ||
| message GetIdentityConfigRequest { | ||
| string org_id = 1; |
There was a problem hiding this comment.
Please spell out the full organization_id here, and in every similar field in the proto. We shouldn't be changing spellings as we add new messages. (The only place we use "org_id" is in some search filters, which debatably have different naming standards.)
| .unwrap_or_else(|| full_hash.to_string()) | ||
| } | ||
|
|
||
| fn struct_to_json(pb: &Struct) -> serde_json::Value { |
There was a problem hiding this comment.
None of these struct_to_json, value_to_json, etc need to exist. Serde can already serialize protobuf objects.
| string org_id = 1; | ||
| string token_endpoint = 2; | ||
| string auth_method = 3; | ||
| google.protobuf.Struct auth_method_config = 4; // method-specific; never includes secrets in responses |
There was a problem hiding this comment.
Please don't use a google.protobuf.Struct, instead actually strongly-type this. Use a oneof field if you expect the contents to be different depending on the method.
|
|
||
| /// Builds response auth_method_config: omits secrets, passes through *_hash fields (stored in blob). | ||
| /// Truncates *_hash values to 8 hex chars + ".." for display in get_token_delegation. | ||
| fn build_response_auth_config(config: &serde_json::Value) -> serde_json::Value { |
There was a problem hiding this comment.
Don't pass serde_json::Value's around, this should all be strongly typed.
|
|
||
| #[derive(Debug, sqlx::FromRow)] | ||
| pub struct TenantIdentityConfig { | ||
| pub organization_id: String, |
There was a problem hiding this comment.
does this need to be here?
If the config is per tenant, then I think this struct could be without it, and you would have things like HashMap<TenantOrgId, TenantIdentitifyConfig>.
| pub issuer: String, | ||
| pub default_audience: String, | ||
| pub allowed_audiences: serde_json::Value, | ||
| pub token_ttl: i32, |
There was a problem hiding this comment.
consider adding a suffix that indictates the unit (e.g. _s) or making it std::time::Duration
| /// Set identity config for an org. On first create, generates a placeholder key. | ||
| /// Caller must ensure tenant exists and global machine-identity is enabled. | ||
| #[allow(clippy::too_many_arguments)] | ||
| pub async fn set( |
There was a problem hiding this comment.
agree. It should take TenantIdentityConfig or at least the parts of it that are actually settable. And that struct should get moved into api-model.
| /// Caller must ensure tenant exists and global machine-identity is enabled. | ||
| #[allow(clippy::too_many_arguments)] | ||
| pub async fn set( | ||
| org_id: &str, |
There was a problem hiding this comment.
There's a strongly typed TenantOrganizationId that we should probably use everywhere
| /// Set token delegation for an org. Identity config must exist first. | ||
| /// config_json: serialized auth_method_config. Stored as TEXT (future: encrypted). | ||
| pub async fn set_token_delegation( | ||
| org_id: &str, |
There was a problem hiding this comment.
There's a strongly typed TenantOrganizationId that we should probably use everywhere
| Ok(()) | ||
| } | ||
|
|
||
| async fn run_identity_config_tests( |
There was a problem hiding this comment.
I would recommend to add unit tests into /tests instead of doing the integration test thing, or at least in addition to that. The unit tests run much faster and are actually supposed to do things like raw DB reads. The integration tests should just use gRPC APIs and simulate customer interactions.
| .ok_or_else(|| Status::unauthenticated("No authentication context found"))?; | ||
|
|
||
| if !api.runtime_config.machine_identity.enabled { | ||
| return Err(Status::unavailable( |
There was a problem hiding this comment.
Prefer using CarbideError and convert these to tonic::Status using .into. That will lead to more consistent error messages and codes between handlers. E.g.
return Err(CarbideError::InvalidArgument("Machine identity must be enabled in site config".to_string()).into())
| string org_id = 1; | ||
| } | ||
|
|
||
| message IdentityConfigRequest { |
There was a problem hiding this comment.
This could be
message IdentityConfigRequest {
string org_id = 1;
IdentifyConfig config = 2;
}
with
message IdentityConfig {
bool enabled = 2;
string issuer = 3;
string default_audience = 4;
repeated string allowed_audiences = 5;
uint32 token_ttl = 6;
string subject_domain = 7;
bool rotate_key = 8;
}
| let cfg = api | ||
| .database_connection | ||
| .with_txn(|txn| { | ||
| Box::pin(async move { |
There was a problem hiding this comment.
we should increment a version field for the tenant object whenever we make a change
| rpc SignMachineIdentity(MachineIdentityRequest) returns (MachineIdentityResponse); | ||
| // Get, set, or delete per-org identity configuration (issuer, audiences, TTL, signing key) | ||
| rpc GetIdentityConfiguration(GetIdentityConfigRequest) returns (IdentityConfigResponse); | ||
| rpc SetIdentityConfiguration(IdentityConfigRequest) returns (IdentityConfigResponse); |
There was a problem hiding this comment.
If we'd model this similar to other APIs, it would probably a an UpdateTenantConfig request, which stores the new identify configuration as part of the tenant object.
I don't mind the separate API modifying it. But I think it should still increment the tenant object version number so that we are able to track changes.
Description
SPIFFE JWT-SVID machine identity support: add per-tenant identity configuration and token delegation via gRPC APIs and a new DB table.
1. Database Layer (
api-db)New:
crates/api-db/src/tenant_identity_config.rsTenantIdentityConfigstruct for per-org identity configset()– upsert identity config (issuer, audiences, TTL, signing key)find()– fetch config by orgdelete()– remove configset_token_delegation()– set token exchange config (endpoint, auth method, client secret)delete_token_delegation()– clear delegation configNew:
crates/api-db/migrations/20260225120000_tenant_identity_config.sqltenant_identity_configtable with:issuer,default_audience,allowed_audiences,token_ttl,subject_domain_prefix,enabledencrypted_signing_key,signing_key_public,key_id,algorithm,master_key_idcreated_at,updated_attoken_endpoint,auth_method,encrypted_auth_method_config,subject_token_audience,token_delegation_created_attenants(organization_id)withON DELETE CASCADE2. gRPC API (
rpc)New Proto Messages
GetIdentityConfiguration/SetIdentityConfiguration/DeleteIdentityConfigurationGetTokenDelegation/SetTokenDelegation/DeleteTokenDelegationGetIdentityConfigRequest,IdentityConfigRequest,IdentityConfigResponse,TokenDelegationRequest,TokenDelegationResponse,GetTokenDelegationRequest3. API Handlers (
handlers/identity_config.rs)New:
crates/api/src/handlers/identity_config.rs(657 lines)get_identity_configuration– read config by orgset_identity_configuration– upsert config with org validationdelete_identity_configuration– delete configget_token_delegation– read delegation configset_token_delegation– upsert delegation configdelete_token_delegation– clear delegationHelper Functions
compute_secret_hash()– SHA256 hash for secretstruncate_hash_for_display()– truncate hash for displaystruct_to_json/json_to_struct– protobuf ↔ JSONbuild_response_auth_config()– omit secrets from responsesUnit Tests (10)
compute_secret_hash,truncate_hash_for_displaystruct_to_json,json_to_struct,json_to_struct_roundtripbuild_response_auth_config_omits_client_secret,truncates_hash,passes_through_non_secret,non_object_returns_clone4. Configuration (
cfg/file.rs)New:
MachineIdentityConfigenabled,algorithm,token_ttl_min,token_ttl_max,token_endpoint_http_proxy[machine_identity]section inCarbideConfig5. Integration Test Support (
api-test-helper)New:
crates/api-test-helper/src/identity_config.rsset_identity_configuration(),get_identity_configuration(),delete_identity_configuration()set_token_delegation(),get_token_delegation(),delete_token_delegation()6. Integration Tests (
api-integration-tests)run_identity_config_tests()intests/lib.rstest_integration7. Fixes
api_fixtures/mod.rsmachine_identity: MachineIdentityConfig::default()toget_config()inCarbideConfig8. Documentation
book/src/design/machine-identity/spiffe-svid-sdd.mdFiles Changed (identity_config-related)
crates/api-db/src/tenant_identity_config.rscrates/api-db/migrations/20260225120000_tenant_identity_config.sqlcrates/api/src/handlers/identity_config.rscrates/api/src/handlers/mod.rscrates/api/src/handlers/machine_identity.rscrates/api/src/api.rscrates/api/src/cfg/file.rsMachineIdentityConfigcrates/api/src/tests/common/api_fixtures/mod.rsmachine_identitycrates/api-test-helper/src/identity_config.rscrates/api-test-helper/src/lib.rsidentity_configcrates/api-integration-tests/tests/lib.rsrun_identity_config_tests()crates/rpc/proto/forge.protobook/src/design/machine-identity/spiffe-svid-sdd.mdType of Change
Related Issues (Optional)
#447
Breaking Changes
Testing
Additional Notes
This PR is part of larger feature implementation related to #261.