Build Evaluation Framework for the SDK Release Agent for E2E flow #14826

@lirenhe

Description

Challenges

The SDK release agent is becoming a critical path for self-service releases. As the agent evolves, we need a reliable way to validate behavior.

Today, the Shanghai (SH) team has been asked to test the SDK release agent manually. Manual testing makes it difficult to cover all cases and does not scale. There is interest in an eval-based approach that covers a set of scenarios more consistently as the agent changes.

Goal

Build a test and evaluation framework for the SDK release agent that enables:

  • Repeatable scenario-based testing
  • Regression detection for agent changes
  • Step evaluation and end-to-end flow evaluation

The Shanghai team can contribute real service-team release scenarios encountered in self-service releases and help convert them into reusable test cases for the scenario catalog.
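
To make the goals concrete, here is a minimal sketch of what a scenario catalog entry and its evaluation could look like. Everything in it is an assumption for discussion, not an existing design: the `Scenario` and `ScenarioResult` types, the `evaluate` and `find_regressions` helpers, and the step names (`generate_sdk`, `update_changelog`, `create_release_pr`) are hypothetical. It assumes the agent run can be captured as an ordered list of step names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical catalog entry: a release request plus the steps we expect."""
    name: str
    request: str
    expected_steps: tuple[str, ...]


@dataclass
class ScenarioResult:
    name: str
    step_scores: dict[str, bool]  # step evaluation: was each expected step observed?
    e2e_pass: bool                # end-to-end evaluation: did they occur in order?


def evaluate(scenario: Scenario, actual_steps: list[str]) -> ScenarioResult:
    """Score one recorded agent trace against a scenario."""
    # Step evaluation: each expected step appears somewhere in the trace.
    step_scores = {s: s in actual_steps for s in scenario.expected_steps}
    # End-to-end evaluation: expected steps form an ordered subsequence of the
    # trace. Extra steps in between are allowed; wrong order is not.
    remaining = iter(actual_steps)
    e2e_pass = all(step in remaining for step in scenario.expected_steps)
    return ScenarioResult(scenario.name, step_scores, e2e_pass)


def find_regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return scenarios that passed end-to-end in the baseline but not now."""
    return sorted(
        name for name, passed in baseline.items()
        if passed and not current.get(name, False)
    )


# Hypothetical scenario and two recorded traces (step names are illustrative).
scenario = Scenario(
    name="first-preview-release",
    request="Release a first preview SDK for a new service",
    expected_steps=("generate_sdk", "update_changelog", "create_release_pr"),
)
good = evaluate(scenario, ["generate_sdk", "run_tests", "update_changelog", "create_release_pr"])
bad = evaluate(scenario, ["update_changelog", "generate_sdk", "create_release_pr"])

regressions = find_regressions(
    baseline={scenario.name: good.e2e_pass},
    current={scenario.name: bad.e2e_pass},
)
```

In this sketch the second trace contains every expected step but in the wrong order, so it passes step evaluation and fails end-to-end evaluation. Keeping the two scores separate is one way to tell "the agent skipped a step" apart from "the agent ran the steps in a bad sequence". A real framework would likely need richer checks than step names, such as validating step outputs.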
