-
Notifications
You must be signed in to change notification settings - Fork 2.9k
fix(registry): make reward manager registration idempotent #4725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix(registry): make reward manager registration idempotent #4725
Conversation
In distributed environments like Ray, modules may be imported multiple times across workers. This caused `ValueError` when the same reward manager class was registered more than once, even though the classes were functionally identical. The fix compares class identifiers (module + qualname) instead of class objects directly. Re-registering the same class (by qualified name) is now silently accepted, while registering different classes with the same name still raises an error. - Added `_get_class_identifier()` helper function - Updated `register()` decorator with idempotent behavior - Added test for idempotent registration Fixes volcengine#4718 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request makes the reward manager registration idempotent, which is a great improvement for distributed environments like Ray. The use of a class identifier based on module and qualname is a solid approach to handle cases where the same class is imported multiple times. The added test case correctly verifies the idempotent behavior.
I've found one potential issue in the implementation of the idempotent registration logic. When re-registering the same class, the registry is not updated with the new class object, which could lead to using stale objects if a module is reloaded. I've provided a suggestion to simplify the logic and make it more robust by always updating the registry. This change would improve correctness without affecting the passing tests.
| if name in REWARD_MANAGER_REGISTRY: | ||
| existing_cls = REWARD_MANAGER_REGISTRY[name] | ||
| existing_id = _get_class_identifier(existing_cls) | ||
| new_id = _get_class_identifier(cls) | ||
| if existing_id != new_id: | ||
| raise ValueError( | ||
| f"Reward manager '{name}' has already been registered with a different class: " | ||
| f"{existing_id} vs {new_id}" | ||
| ) | ||
| # Same class (by qualname) - idempotent registration, skip silently | ||
| return cls | ||
| REWARD_MANAGER_REGISTRY[name] = cls | ||
| return cls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation for idempotent registration has a potential issue. In the case of re-registering the same class (where existing_id == new_id), the function returns early without updating the REWARD_MANAGER_REGISTRY.
If a module is reloaded (e.g., during development or with importlib.reload), a new class object is created. While it has the same identifier, it's a different object in memory. The registry would continue to hold a reference to the old, stale class object, which could lead to unexpected behavior.
It's safer to always update the registry with the new class object. This simplifies the logic and makes it more robust against module reloads.
I suggest refactoring the decorator function to check for conflicts first, and then perform the registration/update.
if name in REWARD_MANAGER_REGISTRY:
existing_id = _get_class_identifier(REWARD_MANAGER_REGISTRY[name])
new_id = _get_class_identifier(cls)
if existing_id != new_id:
raise ValueError(
f"Reward manager '{name}' has already been registered with a different class: "
f"{existing_id} vs {new_id}"
)
REWARD_MANAGER_REGISTRY[name] = cls
return cls
Summary
In distributed environments like Ray, modules may be imported multiple times across workers. This caused
ValueErrorwhen the same reward manager class was registered more than once, even though the classes were functionally identical.Changes
_get_class_identifier()helper function that creates a unique identifier based onmodule + qualnameregister()decorator to compare class identifiers instead of class objectsRoot Cause
The original code compared class objects directly:
In Ray's distributed environment, the same class imported in different workers has different object ids, causing this comparison to fail even for identical classes.
Test Plan
test_idempotent_registration_same_classtestFixes #4718
🤖 Generated with Claude Code