Description
We want to automatically run cbioportal-mcp-qa evals as part of our CI pipeline to continuously validate application quality, correctness, and other metrics.
This requires setting up a new GitHub Actions workflow and registering external service accounts needed by the evals.
Implementation Details
- GitHub Actions Workflow
  - Create a new CI workflow file (e.g., `.github/workflows/run-evals.yml`).
  - Trigger on pull requests and main branch merges.
  - Run the `cbioportal-mcp-qa` eval suite using existing tooling/scripts.
  - Include configurable parameters for environment (staging/production), database connection, etc.
- Service Setup
  - Arize Phoenix
    - Register a cBioPortal account with Arize Phoenix.
    - Configure credentials/secrets in GitHub (`ARIZE_API_KEY`, etc.).
    - Ensure eval results are reported to Arize for tracking and monitoring.
  - Claude via Amazon Bedrock
    - Register a Claude API token (Anthropic’s Claude through Amazon Bedrock).
    - Add it to repository secrets (`BEDROCK_CLAUDE_TOKEN`).
    - Configure the evals to optionally leverage Claude for evaluation or summarization tasks.
- Security
  - Store all credentials in GitHub Secrets.
  - Document how to rotate keys and tokens safely.
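
To make the configuration points above concrete, here is a minimal sketch of a pre-flight check the workflow could run before the evals start. It assumes the secrets are exposed to the job as environment variables under the names given above. `EVAL_ENVIRONMENT`, the `staging` default, `EvalConfig`, and `load_eval_config` are hypothetical names chosen for illustration and are not part of the existing `cbioportal-mcp-qa` tooling.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# "staging" and "production" come from the issue text; the variable name
# EVAL_ENVIRONMENT and the "staging" default are assumptions for this sketch.
ALLOWED_ENVIRONMENTS = ("staging", "production")


@dataclass(frozen=True)
class EvalConfig:
    environment: str
    arize_api_key: str
    # None means the optional Claude-based steps should be skipped.
    bedrock_claude_token: Optional[str]


def load_eval_config(env: Mapping[str, str] = os.environ) -> EvalConfig:
    """Read eval settings from environment variables and fail fast on gaps."""
    environment = env.get("EVAL_ENVIRONMENT", "staging")
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(
            f"EVAL_ENVIRONMENT must be one of {ALLOWED_ENVIRONMENTS}, got {environment!r}"
        )

    # Treated as required here because results are meant to be reported to Arize.
    arize_api_key = env.get("ARIZE_API_KEY", "")
    if not arize_api_key:
        raise ValueError("ARIZE_API_KEY is not set; add it to GitHub Secrets")

    # Claude is described as optional, so a missing or empty token is allowed.
    bedrock_claude_token = env.get("BEDROCK_CLAUDE_TOKEN") or None

    return EvalConfig(environment, arize_api_key, bedrock_claude_token)
```

Failing early with a clear message keeps a missing or rotated secret from surfacing as an obscure error halfway through an eval run. The error messages name the variable but never print its value.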
Acceptance Criteria
- New GitHub Actions workflow runs `cbioportal-mcp-qa` evals automatically on CI.
- Arize Phoenix account registered and integrated for result tracking.
- Claude (via Amazon Bedrock) token registered and configured.
- Evals produce visible pass/fail summaries in CI results (not sure how easy that is).
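
On the last point, one possible approach is GitHub Actions job summaries: a step can append Markdown to the file named by the `GITHUB_STEP_SUMMARY` environment variable, and it is rendered on the workflow run page. The sketch below assumes the eval results can be reduced to a mapping of eval name to pass/fail. That shape, and the names `render_summary` and `publish_summary`, are hypothetical and not taken from the existing tooling.

```python
import os
from typing import Mapping


def render_summary(results: Mapping[str, bool]) -> str:
    """Build a Markdown table with one row per eval."""
    passed = sum(results.values())
    lines = [
        "## Eval results",
        "",
        f"{passed}/{len(results)} evals passed",
        "",
        "| Eval | Status |",
        "| --- | --- |",
    ]
    for name, ok in results.items():
        lines.append(f"| {name} | {'pass' if ok else 'FAIL'} |")
    return "\n".join(lines) + "\n"


def publish_summary(results: Mapping[str, bool], env: Mapping[str, str] = os.environ) -> int:
    """Write the summary and return an exit code (0 if every eval passed, else 1)."""
    markdown = render_summary(results)
    summary_path = env.get("GITHUB_STEP_SUMMARY")
    if summary_path:
        # Append so that summaries written by earlier steps are preserved.
        with open(summary_path, "a", encoding="utf-8") as fh:
            fh.write(markdown)
    else:
        # Outside GitHub Actions (e.g. a local run), fall back to stdout.
        print(markdown)
    return 0 if all(results.values()) else 1
```

Passing the return value to `sys.exit` in the script that runs the evals would mark the CI step as failed whenever any eval fails, so the pass/fail state is visible both in the check status and in the rendered summary.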