@jshearer
Contributor

Just here to run the tests in CI, don't mind me

@jshearer jshearer force-pushed the dekaf/collection_reset_with_e2e_tests branch 3 times, most recently from 2aa4ced to 6490a0d Compare December 18, 2025 20:17
Dekaf previously required TLS and MSK IAM authentication for all upstream
Kafka connections, making local development and testing difficult. This adds
support for plaintext connections via URL scheme detection:

* `tcp://host:port` connects without TLS, `tls://host:port` uses TLS (default)
* `--upstream-no-auth` flag skips SASL authentication entirely
* `KafkaClientAuth::from_msk_region(None)` creates no-auth mode

Example local usage:
  dekaf --default-broker-urls tcp://localhost:29092 --upstream-no-auth ...
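The scheme detection described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `Transport` and `parse_broker_url` are hypothetical names, and treating a URL with no scheme as TLS is an assumption based on "(default)" in the list above.

```rust
/// Transport selected for an upstream broker connection (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Transport {
    Plaintext,
    Tls,
}

/// Splits a broker URL into its transport and `host:port` part.
/// `tcp://` selects plaintext, `tls://` selects TLS.
/// Assumption: a bare `host:port` with no scheme falls back to TLS.
fn parse_broker_url(url: &str) -> Result<(Transport, &str), String> {
    let (transport, rest) = if let Some(rest) = url.strip_prefix("tcp://") {
        (Transport::Plaintext, rest)
    } else if let Some(rest) = url.strip_prefix("tls://") {
        (Transport::Tls, rest)
    } else if url.contains("://") {
        // Any other scheme is rejected rather than silently guessed at.
        return Err(format!("unsupported scheme in broker URL: {url}"));
    } else {
        (Transport::Tls, url)
    };

    if rest.is_empty() {
        return Err(format!("missing host:port in broker URL: {url}"));
    }
    Ok((transport, rest))
}
```

With this sketch, `parse_broker_url("tcp://localhost:29092")` yields the plaintext transport and `localhost:29092`, while an unknown scheme such as `http://` is reported as an error.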
Run Dekaf e2e tests as separate step because `nexttest-run` messes with local stack state
* Make `local:data-plane` idempotent
* `ci:dekaf-e2e` now assumes `local:stack` etc. are already up rather than explicitly depending on them
* mise: log systemd output if failure
* mise: also log agent logs on failure
…ion reset

When a collection is reset or a materialization binding is backfilled, consumers
need to detect that their committed offsets are invalid. This maps the binding's
backfill counter to Kafka's leader epoch mechanism:

* Emit `leader_epoch` in Metadata and ListOffsets responses
* Validate consumer epoch in Fetch and ListOffsets, returning `FENCED_LEADER_EPOCH`
  for stale epochs and `UNKNOWN_LEADER_EPOCH` for future epochs
* Implement `OffsetForLeaderEpoch` API - returns offset 0 for old epochs (reset to
  beginning) and current high watermark for current epoch
* Append `-e{counter}` suffix to upstream topic names for offset isolation by epoch
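The epoch rules in the list above can be sketched as three small functions. All names here (`EpochCheck`, `validate_epoch`, `offset_for_leader_epoch`, `epoch_topic_name`) are hypothetical and not taken from the PR. Two behaviours are assumptions: a negative consumer epoch (Kafka's convention for "no epoch supplied") skips validation, and a future epoch in `OffsetForLeaderEpoch` yields no offset, since the description does not cover that case.

```rust
use std::cmp::Ordering;

/// Outcome of comparing a consumer's epoch with the binding's current epoch.
#[derive(Debug, PartialEq, Eq)]
enum EpochCheck {
    Valid,
    FencedLeaderEpoch,  // consumer epoch is stale
    UnknownLeaderEpoch, // consumer epoch is ahead of ours
}

/// Validates the epoch sent in a Fetch or ListOffsets request.
/// Assumption: a negative epoch means the client sent none, so it passes.
fn validate_epoch(consumer_epoch: i32, current_epoch: i32) -> EpochCheck {
    if consumer_epoch < 0 {
        return EpochCheck::Valid;
    }
    match consumer_epoch.cmp(&current_epoch) {
        Ordering::Less => EpochCheck::FencedLeaderEpoch,
        Ordering::Greater => EpochCheck::UnknownLeaderEpoch,
        Ordering::Equal => EpochCheck::Valid,
    }
}

/// End offset reported for a requested epoch: 0 for old epochs (reset to
/// the beginning), the high watermark for the current epoch.
/// Assumption: future epochs return `None`.
fn offset_for_leader_epoch(requested: i32, current: i32, high_watermark: i64) -> Option<i64> {
    match requested.cmp(&current) {
        Ordering::Less => Some(0),
        Ordering::Equal => Some(high_watermark),
        Ordering::Greater => None,
    }
}

/// Upstream topic name with the `-e{counter}` suffix for offset isolation.
fn epoch_topic_name(topic: &str, backfill_counter: u32) -> String {
    format!("{topic}-e{backfill_counter}")
}
```

In this sketch, a consumer that last saw epoch 2 against a binding now at epoch 3 is fenced, asks for the end offset of epoch 2, receives 0, and restarts from the beginning.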

Also isolate committed offsets by task name and clean up legacy offsets.
Previously, all topics sharing the same token would commit offsets to the same
upstream Kafka topic name, creating potential conflicts across tasks/tenants.

* Swap encryption nonce from token to task_name when epoch suffix is present
* Clean up legacy offsets only after the epoch-qualified commit succeeds
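The ordering constraint on cleanup can be illustrated with a small helper. This is a sketch under assumptions: `commit_then_cleanup` is a hypothetical name, and the two closures stand in for whatever upstream calls actually commit the epoch-qualified offsets and delete the legacy ones.

```rust
/// Runs the epoch-qualified commit first and deletes legacy offsets only if
/// that commit succeeded, so a failed commit never leaves a consumer with
/// no committed offsets at all.
fn commit_then_cleanup<E>(
    commit_new: impl FnOnce() -> Result<(), E>,
    delete_legacy: impl FnOnce() -> Result<(), E>,
) -> Result<(), E> {
    // `?` returns early on failure, so `delete_legacy` is never invoked.
    commit_new()?;
    delete_legacy()
}
```

If the commit closure returns an error, the helper returns that error and the legacy offsets stay in place for the next attempt.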
This shows up sometimes, I believe due to eventual consistency in
the upstream Kafka brokers. It should be retried rather than crashing
the session.
`--profile dekaf-e2e` instead

Also get rid of unnecessary snapshots
@jshearer jshearer force-pushed the dekaf/collection_reset_with_e2e_tests branch from 6490a0d to 8c53dd4 Compare December 19, 2025 16:06
…sions waiting on snapshot refresh

Is this a kludge? Ideally these sessions would just not disconnect at all... why are they?

2 participants