[#10545] improvement(auth): Batch-load securable objects to eliminate N+1 in loadRolePrivilege#10546
[#10545] improvement(auth): Batch-load securable objects to eliminate N+1 in loadRolePrivilege#10546yuqi1129 wants to merge 1 commit intoapache:mainfrom
Conversation
…minate N+1 in loadRolePrivilege - Add listSecurableObjectsByRoleIds(List<Long>) to SecurableObjectMapper and SecurableObjectBaseSQLProvider: fetches securable objects for multiple roles in a single WHERE role_id IN (...) query. - Add RoleMetaService.batchListSecurableObjectsForRoles(List<Long>): issues the batch query and groups results by role ID; extracted buildSecurableObjectsFromPOs helper reused by both single and batch paths. - Rewrite JcasbinAuthorizer.loadRolePrivilege(): collect all unloaded role IDs, call the batch method once, load policies serially. Removes the per-role CompletableFuture + entityStore.get() pattern. Cold-path query count drops from 2+2N+T to 2+1+T (where N = roles, T = distinct object types). - Fix DBCP2 pool: minEvictableIdleTimeMillis 1000ms -> 30000ms, minIdle 0 -> 5, maxIdle 5 -> 10. Prevents connection churn that was adding 5-20ms per request. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR targets authorization-path performance by removing an N+1 pattern when loading role privileges in JcasbinAuthorizer.loadRolePrivilege(), and improves JDBC connection pool behavior by adjusting DBCP2 idle/eviction defaults.
Changes:
- Batch-load securable objects for multiple roles via a new
listSecurableObjectsByRoleIds(...)mapper/provider path andRoleMetaService.batchListSecurableObjectsForRoles(...). - Refactor
JcasbinAuthorizer.loadRolePrivilege()to load securable objects in one batch call and then load policies serially. - Tune DBCP2 pool settings (
minIdle,maxIdle,minEvictableIdleTimeMillis) to reduce reconnect churn.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| server-common/src/test/java/org/apache/gravitino/server/authorization/jcasbin/TestJcasbinAuthorizer.java | Updates tests to mock the new batch-loading path for role securable objects. |
| server-common/src/main/java/org/apache/gravitino/server/authorization/jcasbin/JcasbinAuthorizer.java | Replaces per-role loads with a single batch securable-object fetch and constructs full RoleEntity instances for policy loading. |
| core/src/main/java/org/apache/gravitino/storage/relational/session/SqlSessionFactoryHelper.java | Adjusts DBCP2 idle/eviction parameters to keep warm connections and avoid aggressive eviction. |
| core/src/main/java/org/apache/gravitino/storage/relational/service/RoleMetaService.java | Introduces batch API to load securable objects for multiple role IDs and refactors securable-object building into a shared helper. |
| core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/SecurableObjectBaseSQLProvider.java | Adds SQL for WHERE role_id IN (...) batch lookup. |
| core/src/main/java/org/apache/gravitino/storage/relational/mapper/SecurableObjectSQLProviderFactory.java | Exposes the new batch SQL provider method. |
| core/src/main/java/org/apache/gravitino/storage/relational/mapper/SecurableObjectMapper.java | Adds the MyBatis mapper method for batch securable-object listing. |
Comments suppressed due to low confidence (1)
core/src/main/java/org/apache/gravitino/storage/relational/service/RoleMetaService.java:447
buildSecurableObjectsFromPOs(...)now runs over the combined securable-object set for multiple roles. As written,objectIdscan contain many duplicates when different roles refer to the same metadata object, which can bloat the subsequent name-resolutionIN (...)queries (and may hit parameter limits). Consider de-duplicatingobjectIds(while preserving type grouping), e.g., by collecting into a set/distinct()before invoking the lookup function.
securableObjectPOs.stream()
.collect(Collectors.groupingBy(SecurableObjectPO::getType))
.forEach(
(type, objects) -> {
List<Long> objectIds =
objects.stream()
.map(SecurableObjectPO::getMetadataObjectId)
.collect(Collectors.toList());
| Map<Long, List<SecurableObject>> secObjsByRoleId = | ||
| RoleMetaService.batchListSecurableObjectsForRoles(unloadedRoleIds); |
There was a problem hiding this comment.
The new batch call to RoleMetaService.batchListSecurableObjectsForRoles(...) can now fail as a single operation, but any exception will be wrapped by AuthorizationRequestContext.loadRole() as RuntimeException("Failed to load role: ", e) without indicating which role(s) were being loaded. Previously the per-role entityStore.get(...) path added role-name context in the thrown exception. Consider catching failures around the batch call and rethrowing with actionable context (e.g., include metalake, username, and the list of unloadedRoleIds).
| Map<Long, List<SecurableObject>> secObjsByRoleId = | |
| RoleMetaService.batchListSecurableObjectsForRoles(unloadedRoleIds); | |
| Map<Long, List<SecurableObject>> secObjsByRoleId; | |
| try { | |
| secObjsByRoleId = RoleMetaService.batchListSecurableObjectsForRoles(unloadedRoleIds); | |
| } catch (RuntimeException e) { | |
| throw new RuntimeException( | |
| "Failed to batch-load securable objects for roles " + unloadedRoleIds | |
| + " of userId " + userId, | |
| e); | |
| } |
| /** | ||
| * Batch-loads securable objects for multiple roles in a single SQL query and returns a map from | ||
| * role ID to the resolved {@link SecurableObject} list. This eliminates the N+1 query pattern | ||
| * that occurs when loading securable objects for each role individually. | ||
| * | ||
| * @param roleIds the list of role IDs to load | ||
| * @return a map from role ID to its list of resolved securable objects | ||
| */ | ||
| @Monitored( | ||
| metricsSource = GRAVITINO_RELATIONAL_STORE_METRIC_NAME, | ||
| baseMetricName = "batchListSecurableObjectsForRoles") | ||
| public static Map<Long, List<SecurableObject>> batchListSecurableObjectsForRoles( | ||
| List<Long> roleIds) { | ||
| if (roleIds.isEmpty()) { | ||
| return ImmutableMap.of(); | ||
| } | ||
| List<SecurableObjectPO> allPOs = | ||
| SessionUtils.getWithoutCommit( | ||
| SecurableObjectMapper.class, mapper -> mapper.listSecurableObjectsByRoleIds(roleIds)); | ||
|
|
||
| Map<Long, List<SecurableObjectPO>> byRoleId = | ||
| allPOs.stream().collect(Collectors.groupingBy(SecurableObjectPO::getRoleId)); | ||
|
|
||
| ImmutableMap.Builder<Long, List<SecurableObject>> builder = ImmutableMap.builder(); | ||
| for (Long roleId : roleIds) { | ||
| List<SecurableObjectPO> pos = byRoleId.getOrDefault(roleId, Collections.emptyList()); | ||
| builder.put(roleId, buildSecurableObjectsFromPOs(pos)); | ||
| } | ||
| return builder.build(); | ||
| } |
There was a problem hiding this comment.
New behavior (batchListSecurableObjectsForRoles) + new mapper/provider path (listSecurableObjectsByRoleIds) is not covered by any unit/integration test in this repo (no tests reference the new method). Per project guidelines, please add coverage in TestRoleMetaService to validate correctness (multiple role IDs, missing roles returning empty lists) and that it works across backends (H2/PostgreSQL).
| com.google.common.collect.ImmutableMap.Builder<Long, List<SecurableObject>> result = | ||
| com.google.common.collect.ImmutableMap.builder(); | ||
| for (Long id : ids) { |
There was a problem hiding this comment.
Avoid using a fully-qualified class name inside the method body (com.google.common.collect.ImmutableMap.Builder). Please add an import for ImmutableMap and reference it directly to match the repo's Java import convention.
| private void loadRolePrivilege( | ||
| String metalake, String username, Long userId, AuthorizationRequestContext requestContext) { | ||
| requestContext.loadRole( | ||
| () -> { | ||
| EntityStore entityStore = GravitinoEnv.getInstance().entityStore(); | ||
| NameIdentifier userNameIdentifier = NameIdentifierUtil.ofUser(metalake, username); | ||
| List<RoleEntity> entities; | ||
| List<RoleEntity> roleStubs; | ||
| try { | ||
| entities = | ||
| roleStubs = | ||
| entityStore | ||
| .relationOperations() | ||
| .listEntitiesByRelation( | ||
| SupportsRelationOperations.Type.ROLE_USER_REL, | ||
| userNameIdentifier, | ||
| Entity.EntityType.USER); | ||
| List<CompletableFuture<Void>> loadRoleFutures = new ArrayList<>(); | ||
| for (RoleEntity role : entities) { | ||
| Long roleId = role.id(); | ||
| allowEnforcer.addRoleForUser(String.valueOf(userId), String.valueOf(roleId)); | ||
| denyEnforcer.addRoleForUser(String.valueOf(userId), String.valueOf(roleId)); | ||
| if (loadedRoles.getIfPresent(roleId) != null) { | ||
| continue; | ||
| } | ||
| CompletableFuture<Void> loadRoleFuture = | ||
| CompletableFuture.supplyAsync( | ||
| () -> { | ||
| try { | ||
| return entityStore.get( | ||
| NameIdentifierUtil.ofRole(metalake, role.name()), | ||
| Entity.EntityType.ROLE, | ||
| RoleEntity.class); | ||
| } catch (Exception e) { | ||
| throw new RuntimeException("Failed to load role: " + role.name(), e); | ||
| } | ||
| }, | ||
| executor) | ||
| .thenAcceptAsync( | ||
| roleEntity -> { | ||
| loadPolicyByRoleEntity(roleEntity); | ||
| loadedRoles.put(roleId, true); | ||
| }, | ||
| executor); | ||
| loadRoleFutures.add(loadRoleFuture); | ||
| } | ||
| CompletableFuture.allOf(loadRoleFutures.toArray(new CompletableFuture[0])).join(); | ||
| } catch (IOException e) { | ||
| throw new RuntimeException(e); | ||
| } | ||
|
|
||
| // Register user-role associations in enforcers for all roles. | ||
| for (RoleEntity role : roleStubs) { | ||
| allowEnforcer.addRoleForUser(String.valueOf(userId), String.valueOf(role.id())); | ||
| denyEnforcer.addRoleForUser(String.valueOf(userId), String.valueOf(role.id())); | ||
| } | ||
|
|
||
| // Collect stubs for roles whose policies have not yet been loaded into the enforcer. | ||
| List<RoleEntity> unloadedRoleStubs = | ||
| roleStubs.stream() | ||
| .filter(role -> loadedRoles.getIfPresent(role.id()) == null) | ||
| .collect(Collectors.toList()); | ||
| if (unloadedRoleStubs.isEmpty()) { | ||
| return; | ||
| } | ||
|
|
||
| // Batch-fetch securable objects for all unloaded roles in a single query, | ||
| // eliminating the N+1 pattern of per-role entityStore.get() calls. | ||
| List<Long> unloadedRoleIds = | ||
| unloadedRoleStubs.stream().map(RoleEntity::id).collect(Collectors.toList()); | ||
| Map<Long, List<SecurableObject>> secObjsByRoleId = | ||
| RoleMetaService.batchListSecurableObjectsForRoles(unloadedRoleIds); | ||
|
|
||
| for (RoleEntity stub : unloadedRoleStubs) { | ||
| List<SecurableObject> securableObjects = | ||
| secObjsByRoleId.getOrDefault(stub.id(), Collections.emptyList()); | ||
| RoleEntity fullRole = | ||
| RoleEntity.builder() | ||
| .withId(stub.id()) | ||
| .withName(stub.name()) | ||
| .withNamespace(stub.namespace()) | ||
| .withProperties(stub.properties()) | ||
| .withAuditInfo(stub.auditInfo()) | ||
| .withSecurableObjects(securableObjects) | ||
| .build(); | ||
| loadPolicyByRoleEntity(fullRole); | ||
| loadedRoles.put(stub.id(), true); | ||
| } |
There was a problem hiding this comment.
loadRolePrivilege() no longer uses the executor thread pool (the previous CompletableFuture path was removed), but initialize() still creates a fixed thread pool (default 100 threads) and close() shuts it down. This now allocates idle threads for no benefit and increases memory/CPU footprint. Consider removing the executor field + thread pool initialization entirely (and any related test reflection), or reintroduce an async use of it if still needed.
Code Coverage Report
Files
|
What changes were proposed in this pull request?
This PR eliminates the N+1 query pattern in
JcasbinAuthorizer.loadRolePrivilege()and fixes a DBCP2 connection pool misconfiguration.Auth N+1 fix:
listSecurableObjectsByRoleIds(List<Long>)toSecurableObjectMapperandSecurableObjectBaseSQLProvider, which fetches securable objects for multiple roles in a singleWHERE role_id IN (...)query.RoleMetaService.batchListSecurableObjectsForRoles(List<Long>)that issues this single batch query and groups results by role ID. ExtractedbuildSecurableObjectsFromPOshelper reused by both the single and batch paths.JcasbinAuthorizer.loadRolePrivilege()to collect all unloaded role IDs, call the batch method once, then load policies serially. Removed the per-roleCompletableFuture+entityStore.get()pattern.DBCP2 connection pool fix:
minEvictableIdleTimeMillis:1000ms → 30000ms— prevents connections from being destroyed after 1 second idle, eliminating constant reconnect churn.minIdle:0 → 5— keeps a pool of warm connections ready.maxIdle:5 → 10— allows more idle connections to be retained.Query count improvement (cold path):
2 + 2N + T2 + 1 + TWhy are the changes needed?
Fix: #10545
Each call to
loadRolePrivilegeissued2NDB queries (one per role for role metadata, one per role for securable objects). WithN=5roles andT=3object types, that's 16 queries per authorization check. This is a bottleneck under high concurrency or when the cache is cold (e.g., after a restart or TTL expiry in HA deployments).The DBCP2 misconfiguration caused connection destroy-then-reconnect on nearly every request (1s idle eviction with
minIdle=0), adding ~5–20ms latency per request.Does this PR introduce any user-facing change?
No API changes. The connection pool defaults change slightly (
minIdle,maxIdle,minEvictableIdleTimeMillis) which improves performance transparently.How was this patch tested?
TestRoleMetaService— all 33 tests pass (H2 and PostgreSQL backends).TestJcasbinAuthorizer— all 7 tests pass. Updated test to mockRoleMetaService.batchListSecurableObjectsForRolesinstead of per-roleentityStore.get().