-
Notifications
You must be signed in to change notification settings - Fork 32
fix: duplicate modification message output #715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughRefactors account, user, and QoS modify RPCs to accept batched Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant GRPC as gRPC Handler
participant AM as AccountManager
participant DB as Database
Client->>GRPC: ModifyXRequest (operations[])
note over GRPC: Build operations vector\nand invoke batch method
GRPC->>AM: ModifyX(uid, name, operations[], force?)
rect rgb(240,248,255)
note over AM: Validate operations\nCheckModifyXOperations(...)
AM->>DB: Read current state
DB-->>AM: Account/User/QoS data
end
rect rgb(240,255,240)
note over AM: Apply operations transactionally\nAdd txn logs, update maps
AM->>DB: Apply transaction/commit
DB-->>AM: Success / per-item errors
end
alt some operations failed
AM-->>GRPC: vector<CraneExpectedRich<void>> (rich errors)
else all succeeded
AM-->>GRPC: vector of ok results
end
GRPC->>Client: Reply (rich_error_list[])
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
protos/Crane.proto(3 hunks)src/CraneCtld/RpcService/CtldGrpcServer.cpp(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-05-09T01:54:21.256Z
Learnt from: L-Xiafeng
Repo: PKUHPC/CraneSched PR: 496
File: src/CraneCtld/CranedKeeper.cpp:51-53
Timestamp: 2025-05-09T01:54:21.256Z
Learning: The ConfigureCraned function in src/CraneCtld/CranedKeeper.cpp is called from a thread pool, so there's no need to worry about it blocking the gRPC completion queue thread.
Applied to files:
src/CraneCtld/RpcService/CtldGrpcServer.cpp
📚 Learning: 2025-05-02T07:06:36.103Z
Learnt from: RileyWen
Repo: PKUHPC/CraneSched PR: 475
File: src/CraneCtld/CtldGrpcServer.cpp:113-118
Timestamp: 2025-05-02T07:06:36.103Z
Learning: In CraneSched, gRPC methods should generally return Status::OK even when handling error conditions, as non-OK statuses cause the connection to terminate. Error information should be communicated within the RPC response payload instead.
Applied to files:
src/CraneCtld/RpcService/CtldGrpcServer.cpp
🔇 Additional comments (3)
protos/Crane.proto (2)
344-348: Reusable ModifyFieldOperation message looks structurally soundThe shared
ModifyFieldOperationwrapper cleanly models(field, values, op-type)and matches how the server iteratesrequest->operations(); this should make extending modify RPCs easier and avoids ad‑hoc per‑RPC fields.
352-355: Confirm wire‑compatibility implications of field reindexing in modify RPCs
ModifyAccountRequest,ModifyUserRequest, andModifyQosRequestnow useoperationsplus reindexed fields (e.g.,name = 2,operations = 3, etc.). Reusing old tag numbers for different semantics is not backward compatible if any older binaries (ctld, frontends, plugins) may still be running.Please ensure:
- All components that send/receive these messages are upgraded together (no mixed versions), and
- Any external tooling depending on the old field layout is updated.
If version skew is possible, consider adding new fields without reusing old tag numbers, or gate behavior via versioning.
Also applies to: 364-368, 378-379
src/CraneCtld/RpcService/CtldGrpcServer.cpp (1)
977-1026: Account multi‑operation handling and error aggregation look correctThe new
ModifyAccountflow iteratesrequest->operations()and:
- Uses
SetAccountAllowedPartition/SetAccountAllowedQosforOverwritesemantics (full set updates), and- Falls back to per‑value
ModifyAccountfor other operations.This matches the new proto shape and should avoid duplicated error entries for bulk partition/QoS overwrites while preserving per‑value reporting for other fields.
|
会有部分成功的可能,需要确保一条修改语句是原子的 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (4)
src/CraneCtld/AccountManager.cpp (3)
843-845: Stray semicolon creates a no-op statement.Line 845 contains only a semicolon which is a no-op statement. This appears to be leftover from editing.
} else if (operation.type() == crane::grpc::OperationType::Overwrite && operation.modify_field() == crane::grpc::ModifyField::Qos) { - ; - std::list<std::string> deleted_qos;
1130-1133: Redundant User copy objects created for transaction workflow.Multiple
Usercopies are created (partition_user,qos_user,default_qos_user) to track changes before committing. This pattern adds complexity and memory overhead. Additionally, if operations of different types occur in the same batch (e.g., both Partition and Qos modifications), only one of these copies will be used to update the final map state, potentially losing changes from others.Consider consolidating to a single
res_userobject that accumulates all modifications.
3131-3134: Minor: Comment has unusual line wrapping.The
@notecomment has been reformatted with a line break mid-sentence. This is a minor formatting issue./** - * @note need write lock(m_rw_account_mutex_) and write * - * lock(m_rw_user_mutex_) + * @note need write lock(m_rw_account_mutex_) and write lock(m_rw_user_mutex_) */src/CraneCtld/AccountManager.h (1)
95-98:ModifyUsertakes operations by non-const reference.The
operationsparameter is passed by non-const reference (std::vector<crane::grpc::ModifyFieldOperation>& operations), whileModifyAccountandModifyQostake it by const reference. This inconsistency suggestsModifyUsermight modify the operations vector, which could be unexpected for callers.If the vector isn't modified, change to const reference for consistency:
std::vector<CraneExpectedRich<void>> ModifyUser( uint32_t uid, const std::string& name, const std::string& account, const std::string& partition, - std::vector<crane::grpc::ModifyFieldOperation>& operations, bool force); + const std::vector<crane::grpc::ModifyFieldOperation>& operations, bool force);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
protos/Crane.proto(3 hunks)src/CraneCtld/AccountManager.cpp(15 hunks)src/CraneCtld/AccountManager.h(5 hunks)src/CraneCtld/CtldPublicDefs.h(1 hunks)src/CraneCtld/DbClient.cpp(1 hunks)src/CraneCtld/DbClient.h(1 hunks)src/CraneCtld/RpcService/CtldGrpcServer.cpp(4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-08T07:45:15.739Z
Learnt from: MCKZX-llx
Repo: PKUHPC/CraneSched PR: 513
File: src/Utilities/EmbeddedDb/EmbeddedDb.cpp:411-411
Timestamp: 2025-06-08T07:45:15.739Z
Learning: In the CraneSched project's Berkeley DB implementation in src/Utilities/EmbeddedDb/EmbeddedDb.cpp, explicit deletion of DbTxn objects after commit() or abort() is not required, contrary to general Berkeley DB documentation. The project maintainer confirmed that the memory management is handled automatically in their specific implementation.
Applied to files:
src/CraneCtld/DbClient.hsrc/CraneCtld/DbClient.cpp
📚 Learning: 2025-05-02T07:06:36.103Z
Learnt from: RileyWen
Repo: PKUHPC/CraneSched PR: 475
File: src/CraneCtld/CtldGrpcServer.cpp:113-118
Timestamp: 2025-05-02T07:06:36.103Z
Learning: In CraneSched, gRPC methods should generally return Status::OK even when handling error conditions, as non-OK statuses cause the connection to terminate. Error information should be communicated within the RPC response payload instead.
Applied to files:
src/CraneCtld/RpcService/CtldGrpcServer.cpp
🧬 Code graph analysis (2)
src/CraneCtld/DbClient.h (1)
src/CraneCtld/DbClient.cpp (6)
StartTransaction(869-881)StartTransaction(869-869)CommitTransactionWithoutCallback(883-891)CommitTransactionWithoutCallback(883-883)AbortTransaction(893-901)AbortTransaction(893-893)
src/CraneCtld/AccountManager.cpp (3)
src/CraneCtld/RpcService/CtldGrpcServer.cpp (6)
ModifyAccount(967-995)ModifyAccount(967-970)ModifyUser(997-1027)ModifyUser(997-999)ModifyQos(1029-1057)ModifyQos(1029-1031)src/CraneCtld/CtldPublicDefs.h (3)
CheckIfTimeLimitSecIsValid(1078-1080)AdminLevelToString(1062-1075)string(949-965)src/Utilities/PublicHeader/String.cpp (2)
ConvertStringToInt64(364-368)ConvertStringToInt64(364-364)
🔇 Additional comments (15)
protos/Crane.proto (2)
344-348: LGTM! Well-structured operation message.The new
ModifyFieldOperationmessage cleanly encapsulates field modification parameters, enabling batch processing of multiple operations in a single RPC call.
350-385: I'm encountering persistent issues cloning the repository, which prevents me from verifying the actual field number changes through git history or examining the codebase for compatibility mechanisms.However, I can provide context on the concern raised. Let me search for protobuf best practices regarding field reordering:
protobuf field number reordering wire compatibility backward compatibility
src/CraneCtld/DbClient.h (1)
247-251: LGTM! Transaction API additions are well-designed.The three new transaction methods (
StartTransaction,CommitTransactionWithoutCallback,AbortTransaction) complement the existing callback-basedCommitTransactionand enable explicit transaction control for the new batch modification workflows. The bool return type provides clear success/failure indication.src/CraneCtld/DbClient.cpp (1)
869-901: LGTM! Transaction method implementations are robust.All three transaction methods (
StartTransaction,CommitTransactionWithoutCallback,AbortTransaction) follow consistent patterns:
- Proper try-catch for
mongocxx::exception- Clear error logging
- Transaction options align with existing
CommitTransactionmethod- Return bool for explicit success/failure indication
Based on learnings, memory management for MongoDB session objects is handled automatically in this implementation.
src/CraneCtld/RpcService/CtldGrpcServer.cpp (3)
977-994: LGTM! Clean refactoring to batch operation model.The
ModifyAccounthandler now properly delegates operation processing toAccountManager::ModifyAccount(), which processes multiple operations atomically and returns aggregated errors. This addresses previous concerns about validation by centralizing it in the AccountManager layer.Note: Operations are copied from the protobuf repeated field into a
std::vector. While this incurs a copy overhead, it's acceptable for the expected usage patterns.
1007-1027: LGTM! Consistent with ModifyAccount pattern.The
ModifyUserhandler follows the same clean pattern asModifyAccount: delegates to AccountManager for batch processing and aggregates errors intorich_error_list. The consistency across handlers improves maintainability.
1038-1056: LGTM! Completes the consistent batch operation refactoring.The
ModifyQoshandler completes the refactoring trilogy with the same pattern: batch operation processing via AccountManager with aggregated error reporting. The three handlers now share a consistent, maintainable design.Previous review concerns about empty
value_listaccess are resolved—validation is now centralized in the AccountManager layer where it can be properly tested and maintained.src/CraneCtld/CtldPublicDefs.h (1)
949-965: Yes, please provide the exact proto enum definition for ModifyField from the current master version. This will help me verify whether all enum values are covered in theGetModifyFieldStrfunction.src/CraneCtld/AccountManager.cpp (3)
529-654: Validation function design looks correct for batch operation checking.The
CheckModifyAccountOperationsfunction properly validates all operations before any modifications occur, supporting the atomic batch semantics mentioned in PR comments. The function correctly handles both bulk overwrite operations (Partition/Qos with value lists) and individual Add/Delete operations.
2543-2566: UpdatedAddUserAllowedPartition_correctly handles batch partitions.The function now accepts a set of partitions and iterates through them to update the user's allowed partition QoS map. The DB update and transaction logging are performed correctly.
708-826: I'm unable to complete the verification due to persistent repository cloning failures in the sandbox environment. However, let me provide the rewritten review comment based on the available context:Verify transaction rollback semantics: ensure helper functions do not modify in-memory state before commit.
The code starts a transaction at line 710 and calls helper functions like
SetAccountAllowedPartition_,AddAccountAllowedQos_, etc. If these functions perform in-memory modifications to maps (m_account_map_,m_user_map_,m_qos_map_) before the transaction commits, a rollback would leave the database consistent but the in-memory state inconsistent. Verify that all helper functions called within this transaction block (lines 711-817) defer in-memory updates until afterCommitTransactionWithoutCallback()succeeds, consistent with the pattern of a separate memory-update phase.src/CraneCtld/AccountManager.h (4)
90-103: Public API changes for batch operations are well-structured.The new signatures for
ModifyAccount,ModifyUser, andModifyQoscorrectly returnstd::vector<CraneExpectedRich<void>>to support batch error reporting. The use ofModifyFieldOperationfrom the proto aligns with the transactional batch processing approach.
161-169: New validation helper declarations are consistent with implementation.The
CheckModifyAccountOperationsandCheckModifyUserOperationshelper functions are properly declared as private members returningstd::vector<CraneExpectedRich<void>>for aggregated error collection.
273-279: Helper function signatures updated withres_userout-parameter.The updated signatures for
AddUserAllowedPartition_andAddUserAllowedQos_now acceptUser& res_useras an out-parameter to accumulate changes before final commit. This supports the transactional workflow pattern.
335-337:DeleteAccountAllowedQos_return type change is appropriate.Changing the return type from
CraneExpected<void>toCraneExpected<int>allows returning the reference count change value needed for proper QoS reference tracking in the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
src/CraneCtld/AccountManager.cpp (4)
700-701: Uninitialized variabledelete_qos_change_nummay be used before assignment.This variable is declared on line 700 but is only assigned inside the
Delete/Qosbranch (line 803). However, it's used unconditionally in the memory update phase (line 926). If noDelete Qosoperation occurs, the variable will contain garbage.🔎 Proposed fix
- int delete_qos_change_num; + int delete_qos_change_num = 0;
1239-1307: Inconsistent account parameter usage in memory update phase.The code inconsistently uses
account(input parameter, possibly empty),actual_account(resolved), andaccount_ptr->nameacross different branches:
- Delete Partition/Qos branches (lines 1244-1256) use
account- Partition/Qos modification branches (lines 1277-1290) use
account_ptr->name- DefaultQos branch (lines 1292-1297) uses
accountIf
accountis empty (user's default account is used), these branches will access different map entries.Ensure consistent usage of the resolved account name (
actual_accountoraccount_ptr->name) across all memory update branches.
1341-1361: Validation continues after conversion error, potentially causing undefined behavior.When
ConvertStringToInt64fails (line 1346), the error is added to the list but execution continues to line 1352 wherevalue_number(which is uninitialized on failure) is used inCheckIfTimeLimitSecIsValid.Additionally, line 1342 accesses
operation.value_list()[0]without checking if the list is empty.🔎 Proposed fix
for (const auto& operation : operations) { + if (operation.value_list().empty()) { + rich_error_list.emplace_back(std::unexpected{ + FormatRichErr(CraneErrCode::ERR_INVALID_PARAM, "")}); + continue; + } auto value = operation.value_list()[0]; auto item = Qos::GetModifyFieldStr(operation.modify_field()); int64_t value_number; if (item != Qos::FieldStringOfDescription()) { bool ok = util::ConvertStringToInt64(value, &value_number); if (!ok) { rich_error_list.emplace_back(std::unexpected{ FormatRichErr(CraneErrCode::ERR_CONVERT_TO_INTEGER, value)}); + continue; // Skip further validation with invalid number }
1404-1406: Loop overwrites QoS object repeatedly, keeping only last version.The loop iterates over
qos_listand for each iteration, assigns the entireqosobject to*m_qos_map_[name]. Since each iteration overwrites the previous value, only the finalqosfrom the list is retained.🔎 Proposed fix - assign only the final state
- for (auto qos : qos_list) { - *m_qos_map_[name] = std::move(qos); - } + if (!qos_list.empty()) { + *m_qos_map_[name] = std::move(qos_list.back()); + }
🧹 Nitpick comments (2)
src/CraneCtld/AccountManager.cpp (1)
1877-1896: Consider renaming "Check" functions that have side effects.Functions like
CheckSetUserAllowedQosNoLock_modifyres_useras a side effect (as noted in comments). This "check-and-prepare" pattern is valid for the batch flow, but theCheckprefix typically implies a pure predicate. Consider renaming toValidateAndPrepare*or similar to make the side effects explicit in the function name.src/CraneCtld/AccountManager.h (1)
95-102: Inconsistent parameter passing foroperationsacross Modify functions.
ModifyUsertakesoperationsby non-const reference (line 98), whileModifyAccount(line 92) andModifyQos(line 102) take it by const reference. This inconsistency is confusing and may indicate unintended modification of the input.🔎 Proposed fix
std::vector<CraneExpectedRich<void>> ModifyUser( uint32_t uid, const std::string& name, const std::string& account, const std::string& partition, - std::vector<crane::grpc::ModifyFieldOperation>& operations, bool force); + const std::vector<crane::grpc::ModifyFieldOperation>& operations, bool force);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/CraneCtld/AccountManager.cpp(14 hunks)src/CraneCtld/AccountManager.h(3 hunks)
🔇 Additional comments (5)
src/CraneCtld/AccountManager.cpp (3)
529-654: LGTM on the batch validation structure.The
CheckModifyAccountOperationsfunction properly validates operations before execution, separating batch partition/QoS overwrite handling from per-value operations. The error aggregation pattern is appropriate for batch processing.
708-826: Transaction handling pattern is well-structured.The validate-then-execute pattern with proper transaction boundaries ensures atomicity. The check phase (lines 703-706) runs before the transaction starts, and the apply phase properly handles rollback on any failure.
2827-2876: LGTM on SetAccountAllowedQos_ batch handling.The function properly handles batch QoS updates with correct reference count management and default QoS handling.
src/CraneCtld/AccountManager.h (2)
161-186: LGTM on the batch validation helper signatures.The new
Check*helper signatures appropriately support the batch processing model with:
- Vector returns for operations that can produce multiple errors
User& res_useroutput parameter for accumulating prepared changes- Consistent pattern between account and user operation validators
315-317: Return type change forDeleteAccountAllowedQos_is appropriate.Returning
CraneExpected<int>(the reference count change) instead ofvoidenables the caller to properly update in-memory reference counts during the memory update phase of batch operations.
|
同样的,txn用一条就好了,因为只需要一个事务,只需要拼接修改的info就行 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (5)
src/CraneCtld/AccountManager.cpp (5)
700-701: Uninitialized variabledelete_qos_change_nummay cause undefined behavior.This issue was flagged in a previous review but remains unfixed. The variable is declared on line 700 but only assigned in the Delete/Qos branch (line 803). When used on line 926, it will contain garbage if no Delete/Qos operation was executed.
🔎 Proposed fix
- int delete_qos_change_num; + int delete_qos_change_num = 0;
982-992: Unchecked access tooperation.value_list()[0]may cause undefined behavior.This issue was flagged in a previous review but remains unfixed. Multiple locations access
operation.value_list()[0]without verifying the list is non-empty. If a malformed request with an empty value list is received, this causes undefined behavior.Affected locations: lines 984, 1057-1058, 1071, and 1187.
🔎 Proposed fix - add bounds check before each access
case crane::grpc::ModifyField::AdminLevel: { + if (operation.value_list().empty()) { + rich_error_list.emplace_back(std::unexpected{ + FormatRichErr(CraneErrCode::ERR_INVALID_PARAM, "")}); + break; + } auto rich_result = CheckAndModifyUserAdminLevelNoLock_( op_user, res_user, operation.value_list()[0]);Apply similar checks at lines 1057, 1071, and 1187.
1191-1202: Validation continues after conversion error, using potentially uninitialized value.This issue was flagged in a previous review but remains unfixed. When
ConvertStringToInt64fails (line 1191-1195), the error is recorded but execution continues to line 1197-1201 wherevalue_number(uninitialized on failure) is used inCheckIfTimeLimitSecIsValid. This causes undefined behavior.🔎 Proposed fix
if (!ok) { rich_error_list.emplace_back(std::unexpected{ FormatRichErr(CraneErrCode::ERR_CONVERT_TO_INTEGER, value)}); + continue; // Skip further validation with invalid number }
1249-1251: Loop overwrites QoS object repeatedly, keeping only last version.This issue was flagged in a previous review but remains unfixed. The loop iterates over
qos_listand assigns the entireqosobject to*m_qos_map_[name]on each iteration. Since each iteration overwrites the previous value, only the finalqosfrom the list is retained.🔎 Proposed fix - update once with final state
- for (auto qos : qos_list) { - *m_qos_map_[name] = std::move(qos); - } + if (!qos_list.empty()) { + *m_qos_map_[name] = std::move(qos_list.back()); + }
843-846: Remove stray semicolon (empty statement).Line 845 contains an isolated semicolon that serves no purpose. This appears to be leftover from editing.
🔎 Proposed fix
} else if (operation.type() == crane::grpc::OperationType::Overwrite && operation.modify_field() == crane::grpc::ModifyField::Qos) { - ; - std::list<std::string> deleted_qos;
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/CraneCtld/AccountManager.cppsrc/CraneCtld/AccountManager.h
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-05-30T02:09:41.625Z
Learnt from: huerni
Repo: PKUHPC/CraneSched PR: 499
File: src/CraneCtld/AccountMetaContainer.cpp:249-276
Timestamp: 2025-05-30T02:09:41.625Z
Learning: In CraneSched's AccountMetaContainer class, m_user_meta_map_ and m_account_meta_map_ are thread-safe containers that do not require external synchronization for access.
Applied to files:
src/CraneCtld/AccountManager.h
🧬 Code graph analysis (2)
src/CraneCtld/AccountManager.cpp (1)
src/Utilities/PublicHeader/String.cpp (2)
ConvertStringToInt64(364-368)ConvertStringToInt64(364-364)
src/CraneCtld/AccountManager.h (1)
src/CraneCtld/AccountManager.cpp (26)
ModifyAccount(656-942)ModifyAccount(656-659)ModifyUser(1083-1154)ModifyUser(1083-1087)ModifyQos(1156-1253)ModifyQos(1156-1158)CheckModifyAccountOperations(529-654)CheckModifyAccountOperations(530-533)CheckModifyUserOperations(944-1081)CheckModifyUserOperations(944-948)res_user(1128-1128)res_user(2392-2392)CheckSetUserAllowedQosNoLock_(1687-1745)CheckSetUserAllowedQosNoLock_(1687-1690)CheckSetUserDefaultQosNoLock_(1747-1788)CheckSetUserDefaultQosNoLock_(1747-1749)CheckAndDeleteUserAllowedPartitionNoLock_(1790-1800)CheckAndDeleteUserAllowedPartitionNoLock_(1791-1792)DeleteUserAllowedQos_(2538-2567)DeleteUserAllowedQos_(2538-2540)SetAccountAllowedQos_(2671-2720)SetAccountAllowedQos_(2671-2674)DeleteAccountAllowedPartition_(2722-2732)DeleteAccountAllowedPartition_(2722-2724)DeleteAccountAllowedQos_(2734-2745)DeleteAccountAllowedQos_(2734-2736)
🔇 Additional comments (6)
src/CraneCtld/AccountManager.cpp (3)
529-654: Batch validation logic looks correct.The
CheckModifyAccountOperationsfunction properly validates each operation type (Add/Overwrite/Delete) for Partition and QoS fields, collecting rich errors for each failed validation. The early return pattern for Overwrite Partition/Qos set operations is appropriate.
708-826: Transaction handling pattern looks correct.The transaction flow properly:
- Starts transaction before database operations
- Rolls back on any failure (
need_rollbackflag)- Only updates in-memory structures after successful commit
This ensures atomicity as requested in the PR objectives.
1128-1153: Good use of the res_user copy-modify-commit pattern.The implementation follows the recommended pattern from past reviews:
- Creates a copy of the user (
res_user)- Validates and modifies the copy in
CheckModifyUserOperations- Commits the modified copy atomically to the database
- Updates in-memory structures only on success
This approach is cleaner and safer than iterating multiple times.
src/CraneCtld/AccountManager.h (3)
90-102: Public API declarations are consistent with batch operation semantics.The refactored public API correctly returns
std::vector<CraneExpectedRich<void>>to support per-operation error reporting. The signatures match the implementations in the .cpp file.
161-188: Private helper declarations properly updated for the new batch-oriented design.The helper function signatures correctly:
- Pass
User*for in-place modifications (e.g.,res_user)- Return
std::vector<CraneExpectedRich<void>>where multiple errors are possible- Return single
CraneExpectedRich<void>for single-result operations
282-312: Signature updates support the batch modification pattern.
DeleteUserAllowedQos_now takesUser*allowing in-place modification of the result userDeleteAccountAllowedQos_returnsCraneExpected<int>providing the change count for reference trackingBoth changes align with the transaction-based batch processing model.
remove dbclient fn
| std::string log = ""; | ||
| std::unordered_map<std::string, Account> account_map; | ||
| std::unordered_map<std::string, User> user_map; | ||
| std::unordered_map<std::string, Qos> qos_map; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这三个map 是不是有点太粗暴了 可以问下这三个map是为了什么吗
src/CraneCtld/AccountManager.cpp
Outdated
|
|
||
| case crane::grpc::OperationType::Overwrite: | ||
| switch (modify_field) { | ||
| Qos res_qos(*m_qos_map_[name]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
上面已经查出了p吧
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (4)
src/CraneCtld/AccountManager.cpp (4)
834-843: Missingbreakstatement causes fallthrough tostd::unreachable().The
Delete Qoscase (lines 834-840) lacks abreakstatement. After processing QoS deletions, execution falls through to thedefaultcase at line 841, which callsstd::unreachable(), causing undefined behavior (likely a crash) for every Delete Qos operation.🔎 Proposed fix
case crane::grpc::ModifyField::Qos: { for (const auto& qos : operation.value_list()) { DeleteUserAllowedQos_(res_user, qos, actual_account, partition); *log += fmt::format("Del: account: {}, partition: {}, qos: {}\n", account_name, partition, qos); } + break; } default: std::unreachable();
1605-1625: Move iterators in loop invalidate source after first iteration.Lines 1611-1612 use
std::make_move_iteratorto assign fromqos_listinside a loop over partitions. After the first iteration, elements inqos_listare moved-from and invalid. Subsequent partitions will receive empty/invalid QoS lists, corrupting user data.🔎 Proposed fix - copy instead of move in the loop
if (partition.empty()) { // Set the qos of all partition for (auto& [par, pair] : user->account_to_attrs_map[account_name].allowed_partition_qos_map) { if (!ranges::contains(qos_list, pair.first)) pair.first = qos_list.empty() ? "" : default_qos; - pair.second.assign(std::make_move_iterator(qos_list.begin()), - std::make_move_iterator(qos_list.end())); + pair.second.assign(qos_list.begin(), qos_list.end()); } } else { // Set the qos of a specified partition auto iter = user->account_to_attrs_map[account_name].allowed_partition_qos_map.find( partition); if (!ranges::contains(qos_list, iter->second.first)) iter->second.first = qos_list.empty() ? "" : default_qos; - iter->second.second.assign(std::make_move_iterator(qos_list.begin()), - std::make_move_iterator(qos_list.end())); + iter->second.second.assign(qos_list.begin(), qos_list.end()); }
1052-1069: Validation continues after conversion error, potentially using uninitialized value.When
ConvertStringToInt64fails (line 1057-1061), an error is added to the list but execution continues. At line 1064,value_number(which is uninitialized or contains garbage on failure) is used inCheckIfTimeLimitSecIsValid. This could lead to undefined behavior.🔎 Proposed fix
if (!ok) { rich_error_list.emplace_back(std::unexpected{ FormatRichErr(CraneErrCode::ERR_CONVERT_TO_INTEGER, value)}); + continue; // Skip this operation and proceed to next } if (item == Qos::FieldStringOfMaxTimeLimitPerTask() &&This ensures
value_numberis never accessed when conversion fails.
1672-1682: Erasing from wrong map - will corrupt user data.Line 1680 attempts to erase
partitionfromaccount_to_attrs_map, but this map is keyed by account names, not partition names. This either has no effect (partition name doesn't exist as key) or corrupts the account_to_attrs_map. The code should erase fromallowed_partition_qos_mapwithin the specific account.🔎 Proposed fix
CraneExpectedRich<void> AccountManager::CheckAndDeleteUserAllowedPartitionNoLock_( User* user, const std::string& account, const std::string& partition) { if (!user->account_to_attrs_map.at(account) .allowed_partition_qos_map.contains(partition)) return std::unexpected{ FormatRichErr(CraneErrCode::ERR_PARTITION_MISSING, partition)}; // modify the user - user->account_to_attrs_map.erase(partition); + user->account_to_attrs_map[account].allowed_partition_qos_map.erase(partition); return {}; }
| } | ||
| *log += fmt::format("Set: qos_list: {}\n", fmt::join(qos_list, ",")); | ||
| } else { | ||
| for (const auto& value : operation.value_list()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add bounds check before accessing value_list()[0].
Multiple locations in this function access operation.value_list()[0] without first verifying that the list is non-empty (lines 579, 623, 635). A malformed request with an empty value list will cause undefined behavior.
🔎 Proposed fix
Add a check at the start of the value loop or before each access:
} else {
for (const auto& value : operation.value_list()) {
+ if (operation.value_list().empty()) {
+ rich_error_list.emplace_back(std::unexpected{
+ FormatRichErr(CraneErrCode::ERR_INVALID_PARAM, "")});
+ break;
+ }
switch (operation.type()) {Or add validation before the single-value cases.
Also applies to: 623-632, 635-645
🤖 Prompt for AI Agents
In src/CraneCtld/AccountManager.cpp around lines 579, 623 and 635, the code
accesses operation.value_list()[0] without ensuring value_list is non-empty; add
a bounds check (e.g., if (operation.value_list_size() == 0) { /* return
error/log and continue/abort */ }) before the loop or immediately before each
single-value access so you never index [0] on an empty list, and handle the
malformed-request case consistently (log and return an appropriate error or skip
processing).
| case crane::grpc::ModifyField::AdminLevel: { | ||
| auto rich_result = CheckAndModifyUserAdminLevelNoLock_( | ||
| op_user, res_user, operation.value_list()[0]); | ||
|
|
||
| const Account* account_ptr = GetExistedAccountInfoNoLock_(actual_account); | ||
| if (!account_ptr) | ||
| return std::unexpected( | ||
| FormatRichErr(CraneErrCode::ERR_INVALID_ACCOUNT, "")); | ||
| if (!rich_result) { | ||
| rich_error_list.emplace_back(rich_result); | ||
| } else { | ||
| *log += fmt::format("admin_level: {}\n", operation.value_list()[0]); | ||
| } | ||
|
|
||
| auto rich_result = | ||
| CheckSetUserAllowedPartitionNoLock_(account_ptr, partition_list); | ||
| if (!rich_result) return rich_result; | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unchecked access to value_list()[0] may cause undefined behavior.
Multiple locations access operation.value_list()[0] without verifying the list is non-empty (lines 848, 853, 906, 922, 935). If a malformed request arrives with an empty value list, this will cause undefined behavior.
🔎 Proposed fix
Add bounds checking before accessing the first element:
case crane::grpc::ModifyField::AdminLevel: {
+ if (operation.value_list().empty()) {
+ rich_error_list.emplace_back(std::unexpected{
+ FormatRichErr(CraneErrCode::ERR_INVALID_PARAM, "")});
+ break;
+ }
auto rich_result = CheckAndModifyUserAdminLevelNoLock_(
op_user, res_user, operation.value_list()[0]);Apply similar checks at lines 906, 922, and 935.
Also applies to: 919-931, 933-937
🤖 Prompt for AI Agents
In src/CraneCtld/AccountManager.cpp around lines 846-856 (also review the
similar sites around 906, 919-931, 922 and 935), the code dereferences
operation.value_list()[0] without checking the list size; add a guard that
verifies !operation.value_list().empty() before accessing [0], and if empty
populate rich_error_list (or return/handle an appropriate error) and skip the
modification logic so no out-of-bounds access occurs; apply the same pattern to
the other referenced locations.
| for (const auto& operation : operations) { | ||
| auto value = operation.value_list()[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing bounds check on value_list()[0] throughout QoS modification.
Lines 1053 and 1079 access operation.value_list()[0] without verifying the list is non-empty. A malformed request will cause undefined behavior.
🔎 Proposed fix
Add validation at the start of the operation loop:
for (const auto& operation : operations) {
+ if (operation.value_list().empty()) {
+ rich_error_list.emplace_back(std::unexpected{
+ FormatRichErr(CraneErrCode::ERR_INVALID_PARAM, "")});
+ continue;
+ }
auto value = operation.value_list()[0];Also applies to: 1078-1118
🤖 Prompt for AI Agents
In src/CraneCtld/AccountManager.cpp around lines 1052-1053 and throughout the
QoS modification loop (also covering 1078-1118), operations access
operation.value_list()[0] without verifying the list is non-empty; add an
explicit bounds check at the start of the loop (e.g., if
(operation.value_list().empty()) { log/error and skip/return appropriate error
}) before any access, and apply the same guard to every other place that uses
value_list()[0] in this block so malformed requests do not cause undefined
behavior.
fixes #700
Summary by CodeRabbit
New Features
Improvements
✏️ Tip: You can customize this high-level summary in your review settings.