|
| 1 | +# Upgrading to Karafka 2.6 |
| 2 | + |
| 3 | +Karafka 2.6 is a significant release built around one central theme: preparing the internals for **Kafka Share Groups** (KIP-932). To support a fundamentally new group type alongside the existing consumer group model, a large portion of the processing, routing, connection, and instrumentation layers has been reorganized into consistent namespaces. This was the mandatory first step - the internal consistency work had to land before any Share Group functionality could be layered on top. |
| 4 | + |
| 5 | +Beyond the structural groundwork, 2.6 ships a redesigned Declarative Topics system, a new low-level partition offset query API, and dynamic worker pool scaling. |
| 6 | + |
| 7 | +!!! tip "Pro & Enterprise Upgrade Support" |
| 8 | + |
| 9 | + If you're gearing up to upgrade to the latest Karafka version and are a Pro or Enterprise user, remember you've got a dedicated lifeline! Reach out via the dedicated Slack channel for direct support to ensure everything has been covered. |
| 10 | + |
| 11 | +As always, please make sure you have upgraded to the most recent version of `2.5` before upgrading to `2.6`. |
| 12 | + |
| 13 | +Also, remember to read and apply our standard [upgrade procedures](Upgrades-Upgrading). |
| 14 | + |
| 15 | +## Breaking Changes |
| 16 | + |
| 17 | +### Pause Configuration Flat Methods Removed |
| 18 | + |
| 19 | +The flat pause configuration methods (`config.pause_timeout`, `config.pause_max_timeout`, `config.pause_with_exponential_backoff`) that were deprecated in Karafka 2.5.2 have been removed. Use the nested `config.pause.*` namespace instead: |
| 20 | + |
| 21 | +```ruby |
| 22 | +# Before (no longer works) |
| 23 | +config.pause_timeout = 1_000 |
| 24 | +config.pause_max_timeout = 30_000 |
| 25 | +config.pause_with_exponential_backoff = true |
| 26 | + |
| 27 | +# After |
| 28 | +config.pause.timeout = 1_000 |
| 29 | +config.pause.max_timeout = 30_000 |
| 30 | +config.pause.with_exponential_backoff = true |
| 31 | +``` |
| 32 | + |
| 33 | +## New Declarative Topics API |
| 34 | + |
| 35 | +The Declarative Topics system has been redesigned with a standalone `declaratives.draw` DSL that is independent of routing. This is the primary change that most applications will interact with when upgrading to 2.6. |
| 36 | + |
| 37 | +### Standalone DSL |
| 38 | + |
| 39 | +Topic declarations are now defined using `declaratives.draw` at the application level. This separates infrastructure concerns (topic configuration) from application concerns (consumer routing), and makes it possible to manage any Kafka topic - including topics your application does not consume: |
| 40 | + |
| 41 | +```ruby |
| 42 | +class KarafkaApp < Karafka::App |
| 43 | + declaratives.draw do |
| 44 | + topic :orders do |
| 45 | + partitions 6 |
| 46 | + replication_factor 3 |
| 47 | + config( |
| 48 | + 'retention.ms': 86_400_000, |
| 49 | + 'cleanup.policy': 'delete' |
| 50 | + ) |
| 51 | + end |
| 52 | + |
| 53 | + topic :events do |
| 54 | + partitions 10 |
| 55 | + replication_factor 3 |
| 56 | + end |
| 57 | + end |
| 58 | + |
| 59 | + routes.draw do |
| 60 | + topic :orders do |
| 61 | + consumer OrdersConsumer |
| 62 | + end |
| 63 | + end |
| 64 | +end |
| 65 | +``` |
| 66 | + |
| 67 | +If not specified, the defaults are `partitions: 1` and `replication_factor: 1`. |
| 68 | + |
| 69 | +### Defaults Support |
| 70 | + |
| 71 | +You can define default values that apply to all topics in the block. Topic-specific values override them: |
| 72 | + |
| 73 | +```ruby |
| 74 | +class KarafkaApp < Karafka::App |
| 75 | + declaratives.draw do |
| 76 | + defaults do |
| 77 | + partitions 5 |
| 78 | + replication_factor 3 |
| 79 | + config('retention.ms': 604_800_000) |
| 80 | + end |
| 81 | + |
| 82 | + topic :orders do |
| 83 | + partitions 10 # Overrides the default of 5 |
| 84 | + end |
| 85 | + |
| 86 | + topic :events |
| 87 | + # Inherits: 5 partitions, 3 replication_factor, 7-day retention |
| 88 | + end |
| 89 | +end |
| 90 | +``` |
| 91 | + |
| 92 | +### Multiple Draw Blocks |
| 93 | + |
| 94 | +You can call `declaratives.draw` more than once. Each call is additive and accumulates topic declarations. This is useful for splitting declarations across initializers or plugins: |
| 95 | + |
| 96 | +```ruby |
| 97 | +class KarafkaApp < Karafka::App |
| 98 | + declaratives.draw do |
| 99 | + topic :orders do |
| 100 | + partitions 6 |
| 101 | + end |
| 102 | + end |
| 103 | + |
| 104 | + declaratives.draw do |
| 105 | + topic :events do |
| 106 | + partitions 10 |
| 107 | + end |
| 108 | + end |
| 109 | +end |
| 110 | +``` |
| 111 | + |
| 112 | +### Managing Topics You Do Not Consume |
| 113 | + |
| 114 | +Because declarations are decoupled from routing, you can manage topics owned by other services or topics used only for producing: |
| 115 | + |
| 116 | +```ruby |
| 117 | +class KarafkaApp < Karafka::App |
| 118 | + declaratives.draw do |
| 119 | + topic :orders do |
| 120 | + partitions 6 |
| 121 | + replication_factor 3 |
| 122 | + end |
| 123 | + |
| 124 | + # Owned by a different service - managed here for infrastructure consistency |
| 125 | + topic :external_events do |
| 126 | + partitions 10 |
| 127 | + replication_factor 3 |
| 128 | + config('retention.ms': 604_800_000) |
| 129 | + end |
| 130 | + |
| 131 | + # Produce-only topic, no routing entry needed |
| 132 | + topic :audit_log do |
| 133 | + partitions 3 |
| 134 | + replication_factor 3 |
| 135 | + config('cleanup.policy': 'compact') |
| 136 | + end |
| 137 | + end |
| 138 | +end |
| 139 | +``` |
| 140 | + |
| 141 | +### Legacy Routing-Based Configuration is Deprecated |
| 142 | + |
| 143 | +!!! warning "Deprecation Notice" |
| 144 | + |
| 145 | + Defining topic configuration via the routing `#config` method is deprecated. It continues to work in 2.6 for backwards compatibility, but will be removed in a future major release. All new topic declarations should use the standalone `declaratives.draw` DSL. |
| 146 | + |
| 147 | +The routing-based `config()` approach still functions and populates the same shared repository. When a topic is declared in both places, the standalone `declaratives.draw` declaration takes precedence. |
| 148 | + |
| 149 | +To migrate, move your `config()` calls from routing into a `declaratives.draw` block: |
| 150 | + |
| 151 | +```ruby |
| 152 | +# Before (deprecated, still works) |
| 153 | +routes.draw do |
| 154 | + topic :orders do |
| 155 | + config(partitions: 6, replication_factor: 3, 'retention.ms': 86_400_000) |
| 156 | + consumer OrdersConsumer |
| 157 | + end |
| 158 | +end |
| 159 | + |
| 160 | +# After (recommended) |
| 161 | +declaratives.draw do |
| 162 | + topic :orders do |
| 163 | + partitions 6 |
| 164 | + replication_factor 3 |
| 165 | + config('retention.ms': 86_400_000) |
| 166 | + end |
| 167 | +end |
| 168 | + |
| 169 | +routes.draw do |
| 170 | + topic :orders do |
| 171 | + consumer OrdersConsumer |
| 172 | + end |
| 173 | +end |
| 174 | +``` |
| 175 | + |
| 176 | +## Internal Namespace Reorganization |
| 177 | + |
| 178 | +The largest set of changes in 2.6 is the reorganization of internal classes under `ConsumerGroups` namespaces across the processing, routing, connection, and instrumentation layers. This is **step one** of the work required to bring Kafka Share Groups (KIP-932) into Karafka - Share Groups need their own parallel set of strategies, coordinators, jobs, and callbacks, and that is only cleanly achievable once the existing consumer-group-specific code lives in a dedicated namespace. |
| 179 | + |
| 180 | +!!! info "No Impact on Documented Public APIs" |
| 181 | + |
| 182 | + All documented public APIs - consumer classes, routing DSL, configuration options, instrumentation event names and payloads - are unchanged. If you only use what is covered in the official documentation, you will **not** be affected by any of the moves below. |
| 183 | + |
| 184 | +!!! warning "Advanced Users: Internal Class References" |
| 185 | + |
| 186 | + If your code references internal Karafka classes directly by constant path (for example, to override a strategy, swap a coordinator, or hook into a jobs builder), those constants have moved. The changes are mechanical renames with no behavioral difference. Review the list below and update any direct references. |
| 187 | + |
| 188 | +The moved constants include (OSS): |
| 189 | + |
| 190 | +- `Karafka::Processing::Coordinator` → `Karafka::Processing::ConsumerGroups::Coordinator` |
| 191 | +- `Karafka::Processing::CoordinatorsBuffer` → `Karafka::Processing::ConsumerGroups::CoordinatorsBuffer` |
| 192 | +- `Karafka::Processing::Executor` → `Karafka::Processing::ConsumerGroups::Executor` |
| 193 | +- `Karafka::Processing::ExecutorsBuffer` → `Karafka::Processing::ConsumerGroups::ExecutorsBuffer` |
| 194 | +- `Karafka::Processing::Partitioner` → `Karafka::Processing::ConsumerGroups::Partitioner` |
| 195 | +- `Karafka::Processing::ExpansionsSelector` → `Karafka::Processing::ConsumerGroups::ExpansionsSelector` |
| 196 | +- `Karafka::Processing::Strategies::*` → `Karafka::Processing::ConsumerGroups::Strategies::*` |
| 197 | +- `Karafka::Processing::StrategySelector` → `Karafka::Processing::ConsumerGroups::StrategySelector` |
| 198 | +- `Karafka::Processing::Jobs::Consume` → `Karafka::Processing::ConsumerGroups::Jobs::Consume` |
| 199 | +- `Karafka::Processing::Jobs::Eofed` → `Karafka::Processing::ConsumerGroups::Jobs::Eofed` |
| 200 | +- `Karafka::Processing::Jobs::Revoked` → `Karafka::Processing::ConsumerGroups::Jobs::Revoked` |
| 201 | +- `Karafka::Processing::Jobs::Shutdown` → `Karafka::Processing::ConsumerGroups::Jobs::Shutdown` |
| 202 | +- `Karafka::Processing::Jobs::Idle` → `Karafka::Processing::ConsumerGroups::Jobs::Idle` |
| 203 | +- `Karafka::Processing::JobsBuilder` → `Karafka::Processing::ConsumerGroups::JobsBuilder` |
| 204 | +- `Karafka::Connection::RebalanceManager` → `Karafka::Connection::ConsumerGroups::RebalanceManager` |
| 205 | +- `Karafka::Instrumentation::Callbacks::Rebalance` → `Karafka::Instrumentation::Callbacks::ConsumerGroups::Rebalance` |
| 206 | +- `Karafka::Instrumentation::Callbacks::Error` → `Karafka::Instrumentation::Callbacks::ConsumerGroups::Error` |
| 207 | +- `Karafka::Instrumentation::Callbacks::Statistics` → `Karafka::Instrumentation::Callbacks::ConsumerGroups::Statistics` |
| 208 | +- `Karafka::Routing::Features::ActiveJob` → `Karafka::Routing::Features::ConsumerGroups::ActiveJob` |
| 209 | +- `Karafka::Routing::Features::DeadLetterQueue` → `Karafka::Routing::Features::ConsumerGroups::DeadLetterQueue` |
| 210 | +- `Karafka::Routing::Features::Eofed` → `Karafka::Routing::Features::ConsumerGroups::Eofed` |
| 211 | +- `Karafka::Routing::Features::ManualOffsetManagement` → `Karafka::Routing::Features::ConsumerGroups::ManualOffsetManagement` |
| 212 | + |
| 213 | +The following config internal settings have been nested under `config.internal.processing.consumer_groups`: |
| 214 | + |
| 215 | +- `config.internal.processing.coordinator_class` |
| 216 | +- `config.internal.processing.executor_class` |
| 217 | +- `config.internal.processing.partitioner_class` |
| 218 | +- `config.internal.processing.strategy_selector` |
| 219 | +- `config.internal.processing.expansions_selector` |
| 220 | +- `config.internal.processing.errors_tracker_class` |
| 221 | +- `config.internal.processing.jobs_builder` |
| 222 | + |
| 223 | +Shared settings (`jobs_queue_class`, `scheduler_class`, `worker_job_call_wrapper`) remain at the `config.internal.processing` level and are unaffected. |
| 224 | + |
| 225 | +### Routing Group Accessor Changes |
| 226 | + |
| 227 | +`Routing::Topic#group` and `Routing::SubscriptionGroup#group` have been introduced as polymorphic accessors. `#consumer_group` is kept as a backwards-compatible alias and will be retired in Karafka 3.0 once Share Groups land. Instrumentation payloads now emit parallel `group:` / `group_id:` keys alongside the existing `consumer_group:` / `consumer_group_id:` keys - the legacy keys remain present and unchanged. |
| 228 | + |
| 229 | +## New Admin API: Reading Partition Offsets |
| 230 | + |
| 231 | +`Karafka::Admin.read_partition_offsets` is a new low-level method for querying partition offsets without a consumer group. It supports `:earliest`, `:latest`, `:max_timestamp`, and millisecond timestamp specs across multiple topics and partitions in a single call. Pass `isolation_level: Karafka::Admin::IsolationLevels::READ_COMMITTED` to get the Last Stable Offset (LSO) instead of the high-watermark for accurate lag computation on transactionally-produced topics. |
| 232 | + |
| 233 | +See the [Admin API documentation](Infrastructure-Admin-API) for full usage details. |
| 234 | + |
| 235 | +## Dynamic Worker Pool Scaling |
| 236 | + |
| 237 | +The new `Processing::WorkersPool` adds support for runtime thread pool scaling via `#scale`. This enables Karafka to grow and shrink the pool of workers dynamically in response to load or configuration changes, with `worker.scaling.up` and `worker.scaling.down` instrumentation events emitted on each transition. |
| 238 | + |
| 239 | +## Ractors Deferred from 2.6 |
| 240 | + |
| 241 | +Ractor-based parallel deserialization was implemented and is functional, but has been intentionally excluded from this release. The reasoning is straightforward: 2.6 already contains a large volume of internal structural changes. Shipping Ractors on top of that would have made it significantly harder to reason about the source of any issues that surface after the release. Ractors will be introduced in a subsequent release once the 2.6 internals stabilize in production. |
| 242 | + |
| 243 | +## Performance Improvements |
| 244 | + |
| 245 | +Several internal admin operations that previously issued N sequential per-partition consumer calls have been replaced with batched admin calls: |
| 246 | + |
| 247 | +- `Admin::Topics#read_watermark_offsets` now issues two batch calls (`:earliest` and `:latest`) regardless of how many topics or partitions are queried, instead of N sequential calls. |
| 248 | +- The time-based offset resolution fallback in `Admin::ConsumerGroups#seek` now uses a single batch call. |
| 249 | +- Pro `Iterator::TplBuilder` negative offset resolution is now handled with three total calls regardless of partition count. |
| 250 | + |
| 251 | +## Summary of Actions Required |
| 252 | + |
| 253 | +For most applications, the upgrade from 2.5 to 2.6 requires only one action: |
| 254 | + |
| 255 | +- Update `config.pause_timeout`, `config.pause_max_timeout`, and `config.pause_with_exponential_backoff` to `config.pause.timeout`, `config.pause.max_timeout`, and `config.pause.with_exponential_backoff`. |
| 256 | + |
| 257 | +If your code references internal Karafka classes by constant path, review the namespace moves in the Internal Namespace Reorganization section above. |
| 258 | + |
| 259 | +Migrating from routing-based `config()` to `declaratives.draw` is encouraged but **not** required for 2.6 - the old API continues to work. |
0 commit comments