Migrate SDKs to centralized evaluator testkit and remove duplicate evaluation tests #366

## Overview
PR #344 introduced the `flagd-api-testkit` with a dedicated evaluator test suite in `evaluator/gherkin/`.
This issue tracks migrating all SDKs to the new testkit and removing the duplicated evaluation
tests currently scattered across individual SDK repositories.

## Current State
- ✅ Evaluator testkit created (see PR #344)
- ❌ SDKs still running duplicate evaluation tests locally
- ❌ Inconsistent test organization across SDKs

## Goals
- Migrate all SDKs to use the centralized evaluator testkit from flagd-testbed
- Remove duplicate evaluation test definitions from each SDK
- Keep only SDK-specific integration/provider tests in each repository
- Establish clear boundaries: evaluation tests (testkit) vs. integration tests (SDK repo)

## Required Changes Per SDK
For each SDK (Java, Go, JS, .NET, Python, etc.):
- [ ] Remove duplicate Gherkin test files related to evaluation
- [ ] Remove step definitions that are only needed by the removed evaluation scenarios
- [ ] Update test runner to pull evaluator tests from flagd-testbed via git submodule
- [ ] Update CI/CD to run both evaluator testkit and integration-only tests
- [ ] Document the new test structure in SDK README
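
To help with the first checklist item, a small script can list the Gherkin files in an SDK repository that are byte-identical copies of files in the testkit. This is a minimal sketch, not part of the testkit: the function names `feature_digests` and `find_duplicate_features` are hypothetical, and it only detects exact copies, so locally edited variants of a scenario file still need a manual review.

```python
import hashlib
from pathlib import Path


def feature_digests(root: Path) -> dict[str, list[Path]]:
    """Map the SHA-256 of each .feature file under root to the paths sharing it."""
    digests: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*.feature")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        digests.setdefault(digest, []).append(path)
    return digests


def find_duplicate_features(sdk_tests: Path, testkit: Path) -> list[Path]:
    """Return SDK feature files whose exact content also exists in the testkit."""
    shared = feature_digests(testkit)
    duplicates: list[Path] = []
    for digest, paths in feature_digests(sdk_tests).items():
        if digest in shared:
            duplicates.extend(paths)
    return sorted(duplicates)
```

Called with the SDK's own test directory and the checked-out `flagd-testbed/evaluator/gherkin` directory, the result is a candidate list for removal.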

## Example Structure
```
SDK Repository
├── tests/integration/          # Only provider/connection/config tests
│   └── gherkin/
└── flagd-testbed/              # Git submodule
    └── evaluator/gherkin/      # Shared evaluator tests (via testkit)
```
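
A test runner or CI step could use this layout to keep the two suites apart and to fail early when one of them is empty. The sketch below assumes the directory names shown in the example structure; `collect_suites` and `missing_suites` are hypothetical helpers, and individual SDKs may place these directories elsewhere.

```python
from pathlib import Path

# Paths taken from the example layout above; adjust per SDK.
EVALUATOR_FEATURES = Path("flagd-testbed/evaluator/gherkin")
INTEGRATION_FEATURES = Path("tests/integration/gherkin")


def collect_suites(repo_root: Path) -> dict[str, list[Path]]:
    """Return the .feature files of each suite, keyed by suite name."""
    suites = {
        "evaluator": repo_root / EVALUATOR_FEATURES,
        "integration": repo_root / INTEGRATION_FEATURES,
    }
    return {name: sorted(d.rglob("*.feature")) for name, d in suites.items()}


def missing_suites(repo_root: Path) -> list[str]:
    """Return the names of suites that contain no feature files."""
    return [name for name, files in collect_suites(repo_root).items() if not files]
```

An empty `evaluator` suite usually means the submodule has not been initialized in the checkout, so reporting it before the test run starts gives a clearer failure than a run that silently executes zero scenarios.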

## Benefits
- ✅ Single source of truth for evaluator tests
- ✅ Faster CI/CD (evaluator tests run without Docker where possible)
- ✅ Reduced maintenance burden
- ✅ Consistent evaluation behavior across all SDKs

## Out of Scope (for this issue)
- Creating new features or test scenarios
- Changes to the evaluator testkit itself (separate issues/PRs)

