Skip to content

Commit 3fc92b8

Browse files
yurekamiclaude
andcommitted
fix(registry): make reward manager registration idempotent
In distributed environments like Ray, modules may be imported multiple times across workers. This caused `ValueError` when the same reward manager class was registered more than once, even though the classes were functionally identical. The fix compares class identifiers (module + qualname) instead of class objects directly. Re-registering the same class (by qualified name) is now silently accepted, while registering different classes with the same name still raises an error. - Added `_get_class_identifier()` helper function - Updated `register()` decorator with idempotent behavior - Added test for idempotent registration Fixes #4718 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
1 parent 56d18ec commit 3fc92b8

File tree

2 files changed

+44
-4
lines changed

2 files changed

+44
-4
lines changed

tests/workers/reward_manager/test_registry_on_cpu.py

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,3 +92,21 @@ def method(setup):
9292

9393
assert OriginalClass().method() == 42
9494
assert REWARD_MANAGER_REGISTRY["return_test"] == OriginalClass
95+
96+
97+
def test_idempotent_registration_same_class(setup):
98+
"""Test that registering the same class twice is idempotent (no error).
99+
100+
This is important for distributed environments like Ray where modules
101+
may be imported multiple times across workers.
102+
"""
103+
104+
@register("idempotent_test")
105+
class IdempotentManager:
106+
pass
107+
108+
# Register the same class again - should not raise
109+
decorated = register("idempotent_test")(IdempotentManager)
110+
111+
assert decorated == IdempotentManager
112+
assert REWARD_MANAGER_REGISTRY["idempotent_test"] == IdempotentManager

verl/workers/reward_manager/registry.py

Lines changed: 26 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -21,19 +21,41 @@
2121
REWARD_MANAGER_REGISTRY: dict[str, type[AbstractRewardManager]] = {}
2222

2323

24+
def _get_class_identifier(cls: type) -> str:
25+
"""Get a unique identifier for a class based on its module and qualified name.
26+
27+
This is used instead of direct class comparison because in distributed environments
28+
(like Ray), the same class may be imported multiple times with different object ids.
29+
"""
30+
return f"{cls.__module__}.{cls.__qualname__}"
31+
32+
2433
def register(name: str) -> Callable[[type[AbstractRewardManager]], type[AbstractRewardManager]]:
2534
"""Decorator to register a reward manager class with a given name.
2635
2736
Args:
2837
name: `(str)`
2938
The name of the reward manager.
39+
40+
Note:
41+
Registration is idempotent - registering the same class multiple times
42+
(identified by module and qualname) will not raise an error. This is
43+
necessary for distributed environments like Ray where modules may be
44+
imported multiple times across workers.
3045
"""
3146

3247
def decorator(cls: type[AbstractRewardManager]) -> type[AbstractRewardManager]:
33-
if name in REWARD_MANAGER_REGISTRY and REWARD_MANAGER_REGISTRY[name] != cls:
34-
raise ValueError(
35-
f"Reward manager {name} has already been registered: {REWARD_MANAGER_REGISTRY[name]} vs {cls}"
36-
)
48+
if name in REWARD_MANAGER_REGISTRY:
49+
existing_cls = REWARD_MANAGER_REGISTRY[name]
50+
existing_id = _get_class_identifier(existing_cls)
51+
new_id = _get_class_identifier(cls)
52+
if existing_id != new_id:
53+
raise ValueError(
54+
f"Reward manager '{name}' has already been registered with a different class: "
55+
f"{existing_id} vs {new_id}"
56+
)
57+
# Same class (by qualname) - idempotent registration, skip silently
58+
return cls
3759
REWARD_MANAGER_REGISTRY[name] = cls
3860
return cls
3961

0 commit comments

Comments
 (0)