Build Evaluation Framework for the SDK Release Agent for E2E flow #14826

## Challenges

The SDK release agent is becoming part of the critical path for self-service releases. As the agent evolves, we need a reliable way to validate its behavior.

Today, the Shanghai (SH) team has been asked to test the SDK release agent manually. Manual testing makes it difficult to cover all cases and does not scale. There is interest in using an eval-based approach to cover a set of scenarios more consistently as the agent changes.


## Goal
Build a test and evaluation framework for the SDK release agent that enables:
- Repeatable scenario-based testing
- Regression detection for agent changes
- Step evaluation and end-to-end flow evaluation
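
As a rough illustration of how the three goals could fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Scenario`, `AgentRun`, and `EvalResult` types, the step names, and the scoring rule (positional step match plus an exact outcome match) are hypothetical and not part of the actual agent or any existing framework.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """A repeatable test case (hypothetical shape)."""
    name: str
    user_request: str
    expected_steps: list[str]   # ordered steps we expect the agent to take
    expected_outcome: str       # final state we expect for the whole flow


@dataclass
class AgentRun:
    """What the agent actually did for one scenario (hypothetical shape)."""
    steps: list[str]
    outcome: str


@dataclass
class EvalResult:
    scenario: str
    step_score: float   # fraction of expected steps matched in order
    e2e_pass: bool      # whether the final outcome matched


def evaluate(scenario: Scenario, run: AgentRun) -> EvalResult:
    """Score one run at both the step level and the end-to-end level."""
    matched = sum(
        1 for expected, actual in zip(scenario.expected_steps, run.steps)
        if expected == actual
    )
    total = len(scenario.expected_steps)
    step_score = matched / total if total else 1.0
    return EvalResult(scenario.name, step_score, run.outcome == scenario.expected_outcome)


def find_regressions(baseline: list[EvalResult], current: list[EvalResult]) -> list[str]:
    """Return scenario names whose results got worse compared to the baseline."""
    previous = {result.scenario: result for result in baseline}
    regressed = []
    for result in current:
        old = previous.get(result.scenario)
        if old is None:
            continue  # new scenario, nothing to compare against
        lost_e2e = old.e2e_pass and not result.e2e_pass
        lower_steps = result.step_score < old.step_score
        if lost_e2e or lower_steps:
            regressed.append(result.scenario)
    return regressed


# Hypothetical scenario and two recorded runs (before and after an agent change).
scenario = Scenario(
    name="first-preview-release",
    user_request="Release a preview SDK for my service",
    expected_steps=["validate_spec", "generate_sdk", "create_release_pr"],
    expected_outcome="release_pr_created",
)
before = evaluate(scenario, AgentRun(["validate_spec", "generate_sdk", "create_release_pr"], "release_pr_created"))
after = evaluate(scenario, AgentRun(["validate_spec", "generate_sdk", "ask_user"], "blocked"))
regressions = find_regressions([before], [after])
```

A real framework would likely need a more tolerant step comparison (for example, allowing extra or reordered steps) and possibly a model-based grader for free-form agent output; the exact-match rule here is only the simplest option.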

The Shanghai team can contribute real service-team release scenarios encountered in self-service releases and help convert them into reusable test cases for the scenario catalog.
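
One possible way to make contributed scenarios reusable is to store them as structured catalog entries and validate them on load. The sketch below assumes a JSON catalog; the field names, the `source` field, and the `load_catalog` helper are hypothetical and only illustrate the idea.

```python
import json
from dataclasses import dataclass

# Fields every catalog entry must provide (hypothetical schema).
REQUIRED_FIELDS = ("name", "user_request", "expected_steps", "expected_outcome")


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    user_request: str
    expected_steps: tuple[str, ...]
    expected_outcome: str
    source: str = "unspecified"  # e.g. which team contributed the scenario


def load_catalog(text: str) -> list[CatalogEntry]:
    """Parse a JSON array of scenarios, rejecting incomplete or duplicate entries."""
    entries = []
    seen_names = set()
    for raw in json.loads(text):
        missing = [name for name in REQUIRED_FIELDS if name not in raw]
        if missing:
            raise ValueError(f"scenario is missing fields: {missing}")
        if raw["name"] in seen_names:
            raise ValueError(f"duplicate scenario name: {raw['name']}")
        seen_names.add(raw["name"])
        entries.append(
            CatalogEntry(
                name=raw["name"],
                user_request=raw["user_request"],
                expected_steps=tuple(raw["expected_steps"]),
                expected_outcome=raw["expected_outcome"],
                source=raw.get("source", "unspecified"),
            )
        )
    return entries


# Hypothetical catalog content with a single contributed scenario.
catalog_text = """
[
  {
    "name": "first-preview-release",
    "user_request": "Release a preview SDK for my service",
    "expected_steps": ["validate_spec", "generate_sdk", "create_release_pr"],
    "expected_outcome": "release_pr_created",
    "source": "Shanghai team"
  }
]
"""
catalog = load_catalog(catalog_text)
```

Validating on load keeps malformed contributions out of the catalog early, so evaluation runs stay repeatable as more scenarios are added.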

