feat: system test framework proposal #111
Introduces layered abstractions (ProxyScenario, FilterSpec, ProxyFixture, ProxyHandle) that separate test intent from deployment mechanism, enabling deployment-agnostic feature tests and test-first development. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
OperatorCapability now models the operator's externally observable state via generation-based reconciliation observation rather than the checksum annotation. Tests that assert on specific operator mechanisms (e.g. checksum change detection) observe resource state directly. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
OperatorCapability handles convergence and deployment agnosticism. Tests that assert on specific resource state (e.g. checksum annotations) use an injected KubernetesClient to observe resources directly, keeping the two concerns separate. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Three distinct test categories become separate modules with different compile-time dependencies: systemtest-feature (no K8s dependency), systemtest-operator (K8s client and CRD types), and systemtest-installer (one test per install method). All three are TCK-consumable. Installer is a public interface — the primary downstream extension point. CrdProxyFixture replaces OperatorProxyFixture to reflect that the fixture uses the CRD API, not the operator's internals. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Fixture/installer composition is extension-internal, driven by system properties. Installer tests use a single smoke test — the test does not vary between installers, CI provides the matrix. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
- Summary now mentions all four test categories and the Installer
- "Three categories/modules" → "Four" throughout
- Add systemtest-webhook to TCK module list with description
- Feature test example: remove namespace parameter, use kafkaClient
- Tags section: clarify module boundaries as primary separation
- Rejected alternatives: @operator tag "ensures present", not "selects fixture"
- Webhook section: rewrite as two-module story (installer + behaviour)
- Affected projects: add systemtest-webhook module
- Fix grammar: "An CrdProxyFixture" → "A CrdProxyFixture"

Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com>
Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
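One of the commit notes above says fixture/installer composition is extension-internal and driven by system properties. A minimal sketch of that idea follows, assuming Java 17. The property name `systemtest.fixture`, the `FixtureKind` enum, and the `FixtureSelection` class are hypothetical names chosen for illustration; the proposal does not specify them.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical set of deployment mechanisms a test run can target. */
enum FixtureKind { CRD, MANIFEST, STANDALONE }

/** Resolves the fixture from a property so test bodies never name one. */
final class FixtureSelection {
    // Hypothetical property name; not taken from the proposal.
    static final String FIXTURE_PROPERTY = "systemtest.fixture";

    private FixtureSelection() {
    }

    static FixtureKind resolve(Properties props) {
        // Default to the CRD-based fixture when nothing is configured (an assumption).
        String raw = props.getProperty(FIXTURE_PROPERTY, "crd");
        try {
            return FixtureKind.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                    "Unknown fixture '" + raw + "' in -D" + FIXTURE_PROPERTY, e);
        }
    }
}
```

A test extension could call `FixtureSelection.resolve(System.getProperties())` once per run, so a CI matrix varies only the `-D` flag while the test classes stay unchanged.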
| 2. **No convergence contract**: the framework does not define when the proxy is ready. Each test class independently polls for readiness, with varying strategies and varying reliability. | ||
| 3. **Operator coupling**: every test implicitly requires the operator. Feature tests — which care only that a correctly-configured proxy is serving traffic — cannot run without the full operator installation. This conflates feature correctness with operator correctness and prevents fast local iteration. |
The purpose of system tests is to have everything installed as you want it in a real environment. We decided to cover only the Kubernetes installation because we didn't have enough resources and time to cover everything. Ideally, we would cover all kinds of installation, such as bare-metal, Kubernetes...
Totally agree with decoupling the system tests from the installation. The test cases should not care how Kroxylicious has been installed; they should just check what is needed. A quicker installation will ease test-first development, as we won't need to depend on the operator.
| - **Test-first development**: a developer writing a new filter can write a failing system test as the first commit of their feature branch, without reading framework documentation or asking QE for help. | ||
| - **Deployment-agnostic feature tests**: the same test runs against an operator-managed proxy, a manifest-managed proxy, or a Helm installation, with no changes to the test body. | ||
| - **Reliable convergence**: `proxyFixture.apply()` is a blocking call with a defined contract — when it returns, the proxy is serving the requested configuration. Manual polling disappears from test classes. | ||
| - **A TCK for downstream distributions**: downstream distributors implement `Installer` for their distribution and run upstream's test modules — feature, operator, installer, and webhook — without forking. |
Sam's borrowing the Java parlance https://en.wikipedia.org/wiki/Technology_Compatibility_Kit
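The "reliable convergence" goal quoted above describes `proxyFixture.apply()` as a blocking call with a defined contract. One way a fixture might implement that internally is a single shared wait loop with a deadline, replacing per-test polling. This is only a sketch: the `Convergence` class, its method name, and the choice of `IllegalStateException` are assumptions, not part of the proposal.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Hypothetical helper a fixture could use to block until the proxy has converged. */
final class Convergence {
    private Convergence() {
    }

    /**
     * Polls the probe until it reports true, or fails once the timeout has elapsed.
     * Returning normally is the contract: the caller may assume convergence.
     */
    static void awaitConverged(BooleanSupplier converged, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!converged.getAsBoolean()) {
            // Overflow-safe comparison, as recommended for System.nanoTime().
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("proxy did not converge within " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for convergence", e);
            }
        }
    }
}
```

With something like this behind `apply()`, a timeout surfaces as one clear failure at the fixture boundary rather than as a flaky assertion later in the test body.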
| A `ProxyScenario` describes the desired proxy configuration in deployment-agnostic terms; a `ProxyFixture` translates that into running infrastructure and blocks until convergence; the resulting `ProxyHandle` is a token of convergence that gates all subsequent interaction. An `Installer` — the primary downstream extension point — handles getting the operator, CRDs, and RBAC into the cluster independently of how proxies are deployed. | ||
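To make the layering above concrete, here is a rough sketch of how the four types could relate, assuming Java 17. Only the type names `ProxyScenario`, `ProxyFixture`, `ProxyHandle`, and `Installer` come from the proposal; every field, method signature, and the `InMemoryProxyFixture` class are illustrative assumptions.

```java
import java.util.List;

/** Deployment-agnostic description of the desired proxy (fields are illustrative). */
record ProxyScenario(String virtualCluster, List<String> filters) {
}

/** Token of convergence: a test can only obtain one from a fixture. */
final class ProxyHandle {
    private final String bootstrapServers;

    ProxyHandle(String bootstrapServers) {
        this.bootstrapServers = bootstrapServers;
    }

    String bootstrapServers() {
        return bootstrapServers;
    }
}

/** Translates a scenario into running infrastructure and blocks until it converges. */
interface ProxyFixture {
    ProxyHandle apply(ProxyScenario scenario);
}

/** Downstream extension point: gets operator, CRDs, and RBAC into the cluster. */
interface Installer {
    void install();

    void uninstall();
}

/** Illustration only: "deploys" nothing and converges immediately. */
final class InMemoryProxyFixture implements ProxyFixture {
    @Override
    public ProxyHandle apply(ProxyScenario scenario) {
        // A real fixture would create resources here and wait for readiness.
        return new ProxyHandle(scenario.virtualCluster() + ".example.test:9092");
    }
}
```

Because a test can reach the proxy address only through a `ProxyHandle`, it cannot interact with the proxy before `apply()` has returned, which is how the handle gates subsequent interaction.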
| Feature tests become portable across deployment mechanisms — the same test runs against a CRD-deployed proxy, a manifest-managed proxy, a standalone process, or a downstream distribution — and cheap enough to write before the production code, as a specification. |
Aside: I'd like to get Kroxylicious into OperatorHub soon, so having tests that are agnostic to their install methodology will be an enabler.
| Every feature test class has a private `deployXxx()` method that reimplements the same builder/template pattern against the operator's CRD types. Adding a new optional parameter (e.g. `ExperimentalKmsConfig`) requires touching every one of them. Timing workarounds are scattered across test classes with comments pointing at unresolved issues. The convergence question — "is the proxy actually serving the configuration I just applied?" — is answered by ad hoc polling in each test class rather than by a framework-level contract. | ||
| This setup cost has a second-order effect: system tests are written after features merge, delegated to QE because they are too expensive for a developer to include in a feature PR. The test framework is the bottleneck, not the assertions. |
Let's describe the problem without the Red Hat terminology.
| These move to a `KubernetesClientCapability`: | ||
| ```java | ||
| interface KafkaClient { |
One thing I dislike about the system tests is that all our client interactions are via a client CLI spawned in a separate process.
CLIs expose pretty basic operations: they boil down to producing to or fetching from a topic. Consumer groups are possible, but testing things like produce transactions + commit offsets is pretty much impossible. This will become a real impediment when we come to test things like Router implementations.
I think we need to be able to program the KafkaClient from the test so we are freer to express the full richness of Kafka.
A secondary issue is that relying on a CLI is slow: you pay the process startup time.
I have had an itch to scratch for a while, which I think I've expressed to you both. Could we expose a common Java interface for librdkafka and Sarama using Java Foreign Functions? We could even partially implement the Kafka Consumer, Producer and Admin client interfaces. The system tests could then be written in terms of the Java Kafka API and exercise Java/Go/librdkafka using the same test.
Now that we've got AI assistance, I think writing a POC for this is within reach.
I like the ideas you are expressing in the proposal. I think it is the right way to go.
Replace "delegated to QE" with community-neutral language and frame setup cost as a perceived barrier rather than a statement of fact. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
The same separation of concerns that motivates the fixture model applies to how tests interact with Kafka. Three independent axes: test intent (what), client driver (which implementation), and execution environment (where). The KafkaClient interface shows a richer target shape including transactions and consumer group management, enabled by in-process drivers (Java client, librdkafka/Sarama via FFI). CLI drivers implement produce/consume only. Produce and consume are the starting point; richer operations are the target. Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com> Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
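The split described in the last commit note (produce/consume as the baseline, richer operations only for in-process drivers) could be modelled with optional capabilities on the interface. The sketch below is one possible shape, not the interface from the proposal: the method names, the `supportsTransactions()` probe, and the `InMemoryKafkaClient` stand-in driver are all assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client shape: a baseline every driver implements, plus optional extras. */
interface KafkaClient {
    void produce(String topic, String value);

    List<String> consume(String topic, int maxRecords);

    /** CLI-style drivers keep the default; in-process drivers override it. */
    default boolean supportsTransactions() {
        return false;
    }

    default void inTransaction(Runnable work) {
        throw new UnsupportedOperationException("driver does not support transactions");
    }
}

/** Stand-in for an in-process driver; it keeps records in memory instead of using Kafka. */
final class InMemoryKafkaClient implements KafkaClient {
    private final Map<String, List<String>> log = new HashMap<>();
    private List<String[]> pending; // non-null only while a transaction is open

    @Override
    public void produce(String topic, String value) {
        if (pending != null) {
            pending.add(new String[] {topic, value}); // buffer until commit
        } else {
            log.computeIfAbsent(topic, t -> new ArrayList<>()).add(value);
        }
    }

    @Override
    public List<String> consume(String topic, int maxRecords) {
        List<String> records = log.getOrDefault(topic, List.of());
        return List.copyOf(records.subList(0, Math.min(maxRecords, records.size())));
    }

    @Override
    public boolean supportsTransactions() {
        return true;
    }

    @Override
    public void inTransaction(Runnable work) {
        pending = new ArrayList<>();
        try {
            work.run();
            List<String[]> committed = pending;
            pending = null; // commit: replay buffered records into the log
            committed.forEach(r -> produce(r[0], r[1]));
        } finally {
            pending = null; // abort path: buffered records are dropped
        }
    }
}
```

A test that needs transactions could check `supportsTransactions()` and skip itself for CLI-backed runs, so the same test body serves both driver tiers.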
Summary
- Separates test intent (ProxyScenario) from deployment mechanism (ProxyFixture) with convergence gating (ProxyHandle)
- systemtest-feature, systemtest-operator, systemtest-webhook, systemtest-installer
- Installer as the primary downstream extension point: downstream varies by installation method, not proxy deployment

Test plan
🤖 Generated with Claude Code