Migrate SDKs to centralized evaluator testkit and remove duplicate evaluation tests #366

@aepfli

Overview

PR #344 introduced the flagd-api-testkit with a dedicated evaluator test suite in evaluator/gherkin/.
This issue tracks migrating all SDKs to this testkit and removing the duplicated evaluation
tests currently scattered across individual SDK repositories.

Current State

Goals

  • Migrate all SDKs to use the centralized evaluator testkit from flagd-testbed
  • Remove duplicate evaluation test definitions from each SDK
  • Keep only SDK-specific integration/provider tests in each repository
  • Establish clear boundaries: evaluation tests (testkit) vs. integration tests (SDK repo)

Required Changes Per SDK

For each SDK (Java, Go, JS, .NET, Python, etc.):

  • Remove duplicate Gherkin test files related to evaluation
  • Remove unnecessary step definitions for evaluation scenarios
  • Update test runner to pull evaluator tests from flagd-testbed via git submodule
  • Update CI/CD to run both the evaluator testkit and the integration-only tests
  • Document the new test structure in SDK README

Example Structure

SDK Repository
├── tests/integration/           # Only provider/connection/config tests
│   └── gherkin/
├── flagd-testbed/              # Git submodule
│   └── evaluator/gherkin/      # Shared evaluator tests (via testkit)
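
Given the layout above, a test runner needs to locate feature files in two places: the shared evaluator suite inside the submodule and the SDK's own integration suite. The following is a minimal sketch in Python, assuming the directory names from the example structure; `collect_features` and the two path constants are hypothetical names for illustration, not part of the testkit, and each SDK would do the equivalent in its own language and test framework.

```python
from pathlib import Path

# Assumed paths, taken from the example structure; adjust per SDK.
EVALUATOR_DIR = Path("flagd-testbed/evaluator/gherkin")
INTEGRATION_DIR = Path("tests/integration/gherkin")


def collect_features(repo_root: Path) -> dict[str, list[Path]]:
    """Return the .feature files of each suite, sorted for a stable order."""
    suites = {"evaluator": EVALUATOR_DIR, "integration": INTEGRATION_DIR}
    found: dict[str, list[Path]] = {}
    for name, rel in suites.items():
        suite_dir = repo_root / rel
        if not suite_dir.is_dir():
            # A submodule that was never initialized leaves its directory
            # empty, so the nested gherkin folder will be missing.
            raise FileNotFoundError(
                f"{suite_dir} not found; if this is the submodule, "
                "try `git submodule update --init`"
            )
        found[name] = sorted(suite_dir.rglob("*.feature"))
    return found
```

Keeping the two suites under separate keys lets CI run them as separate jobs, for example the evaluator suite without Docker and the integration suite with it. Failing when a directory is missing helps avoid a silently green run caused by an uninitialized submodule.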

Benefits

  • ✅ Single source of truth for evaluator tests
  • ✅ Faster CI/CD (evaluator tests run without Docker where possible)
  • ✅ Reduced maintenance burden
  • ✅ Consistent evaluation behavior across all SDKs

Out of Scope (for this issue)

  • Creating new features or test scenarios
  • Changes to the evaluator testkit itself (separate issues/PRs)
