feat: Initial implementation for BaseVerifier #1597
Conversation
then implementation, then talk to other domain teams to unify the interface. Target DDL:
@Wendong-Fan Looks promising! Left some comments~ (considering future work)
@Wendong-Fan nice work, but I think it could be improved. From my understanding, the verifiers will be used in two main ways: inside RL environments (e.g., RLVR) and in data-filtering pipelines.
In both cases, most of the complexity should probably be encapsulated in the environment class, and a verifier should mostly just be a function: we call it with an extracted part of the LLM response (we may even need an extra class for extractors, since extraction can get quite complicated) and get back "correct", "incorrect", or "timeout" (to retry or discard the data point in RLVR). If we implement it the way it is currently done, calling the verifier will take a lot of code to define everything that we will probably have to define for the environment and data-filtering pipeline anyway, leading to a lot of duplicated code.

Further, I think it makes sense to have one verifier instantiated in the environment that keeps state (as is currently implemented), which is especially crucial when setting up the verifier is non-trivial. Hence a setup function should probably be added that does nothing unless overridden, but can be used for starting the verifier backend and making sure there is a connection if the verifier is an online service. The same is true for a teardown function to free processes. What do you guys think? @Wendong-Fan @lightaime Happy to discuss.
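A minimal sketch of the shape this comment seems to describe; the names `FunctionalVerifier`, `Verdict`, and `verify` are illustrative assumptions, not part of this PR:

```python
from typing import Callable, Literal, Optional

# "timeout" lets the caller retry or discard the data point in RLVR.
Verdict = Literal["correct", "incorrect", "timeout"]


class FunctionalVerifier:
    """Sketch: a verifier that is mostly just a callable, with
    lifecycle hooks for stateful or remote backends."""

    def __init__(self, extractor: Optional[Callable[[str], str]] = None):
        # Extraction can get complicated enough to warrant its own
        # class; a plain callable keeps the sketch short.
        self.extractor = extractor or (lambda response: response)

    def setup(self) -> None:
        """Does nothing unless overridden; override to start a verifier
        backend or check the connection to an online service."""

    def teardown(self) -> None:
        """Does nothing unless overridden; override to free processes."""

    def __call__(self, llm_response: str) -> Verdict:
        extracted = self.extractor(llm_response)
        return self.verify(extracted)

    def verify(self, extracted: str) -> Verdict:
        raise NotImplementedError  # domain-specific check goes here
```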
Hey @hallerite, did you think of it as something different, or am I missing something? Your points about the setup/teardown lifecycle and extraction logic are interesting; how do you see these fitting with domain-specific verification needs?
RFC
📋 Problem
The CAMEL framework needs a robust, extensible verification system to validate and assess the quality of LLM responses across various task types. The verifiers module provides this critical functionality.
✔️ Goals
❌ Non-Goals
💡 Solution
The verifiers module implements a base verification framework with the following key components:
- `BaseVerifier`: abstract base class providing the core verification infrastructure
- `VerificationResult`: structured output format for verification outcomes
- `VerifierConfig`: configuration management for verifier behavior
- `TaskType` enumeration

Performance Impact
- Parallel verification is bounded by the `max_parallel` setting.
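A rough sketch of how these components might fit together, based only on the names in this RFC; all field names and defaults below are assumptions, not the actual implementation:

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class VerificationStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    ERROR = "error"


@dataclass
class VerifierConfig:
    enabled: bool = True    # turn verification off entirely if needed
    max_parallel: int = 4   # upper bound on concurrent verifications
    max_retries: int = 2    # retry budget when verification errors out


@dataclass
class VerificationResult:
    status: VerificationStatus
    error_message: Optional[str] = None


class BaseVerifier:
    def __init__(self, config: Optional[VerifierConfig] = None):
        self.config = config or VerifierConfig()
        # The semaphore enforces the max_parallel bound in batch runs.
        self._semaphore = asyncio.Semaphore(self.config.max_parallel)
```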
🎨 Alternatives
No alternative solutions currently exist for this upgrade.
📃 Work Estimates
🗂️ Future Work
❓ FAQ
Q: How do I implement a custom verifier?
A: Extend the `BaseVerifier` class and implement the `_verify_implementation` method.

Q: How does batch verification work?
A: Use the `verify_batch` method with a list of responses. Concurrency is managed automatically.

Q: What happens if verification fails?
A: The system provides detailed error information and can retry based on configuration.

Q: How can I monitor verification performance?
A: Use the `VerificationMetrics` class, which tracks success rates, durations, and error counts.

Q: Can I disable verification for certain scenarios?
A: Yes, use `VerifierConfig` to enable or disable verification and configure behavior.
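Continuing the illustrative sketch from the Solution section (the method names `_verify_implementation` and `verify_batch` come from this FAQ; everything else below is an assumption):

```python
import asyncio

# Reuses BaseVerifier, VerifierConfig, VerificationResult, and
# VerificationStatus from the sketch in the Solution section.


class ExactMatchVerifier(BaseVerifier):
    """Toy verifier: a response is correct iff it equals the expected answer."""

    def __init__(self, expected: str, config=None):
        super().__init__(config)
        self.expected = expected

    async def _verify_implementation(self, response: str) -> VerificationResult:
        ok = response.strip() == self.expected
        status = VerificationStatus.SUCCESS if ok else VerificationStatus.FAILURE
        return VerificationResult(status=status)

    async def verify_batch(self, responses: list[str]) -> list[VerificationResult]:
        async def bounded(response: str) -> VerificationResult:
            # The semaphore caps concurrency at config.max_parallel.
            async with self._semaphore:
                return await self._verify_implementation(response)

        return list(await asyncio.gather(*(bounded(r) for r in responses)))


# Usage: verify three candidate answers to "2 + 2" in one batch.
results = asyncio.run(ExactMatchVerifier("4").verify_batch(["4", "5", "four"]))
print([r.status for r in results])
```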
Checklist
Go over all the following points, and put an `x` in all the boxes that apply.
- [ ] I have added `Fixes #issue-number` in the PR description (required)
- [ ] I have updated `pyproject.toml` and `poetry.lock` (if dependencies changed)
If you are unsure about any of these, don't hesitate to ask. We are here to help!