Skip to content

[TT-15560] added batchbytes configs and unit testing for kafka#889

Merged
andrei-tyk merged 9 commits intomasterfrom
TT-15560-Add-batchbytes-config-for-Kafka-pump
Oct 13, 2025
Merged

[TT-15560] added batchbytes configs and unit testing for kafka#889
andrei-tyk merged 9 commits intomasterfrom
TT-15560-Add-batchbytes-config-for-Kafka-pump

Conversation

@LLe27
Copy link
Copy Markdown
Contributor

@LLe27 LLe27 commented Aug 19, 2025

Description

Added the BatchBytes configuration for the Kafka writerConfig.

Related Issue

TT-15560

Motivation and Context

How This Has Been Tested

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • Make sure you are requesting to pull a topic/feature/bugfix branch (right side). If pulling from your own
    fork, don't request your master!
  • Make sure you are making a pull request against the master branch (left side). Also, you should start
    your branch off our latest master.
  • My change requires a change to the documentation.
    • If you've changed APIs, describe what needs to be updated in the documentation.
  • I have updated the documentation accordingly.
  • Modules and vendor dependencies have been updated; run go mod tidy && go mod vendor
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • Check your code additions will not fail linting checks:
    • go fmt -s
    • go vet

@LLe27 LLe27 changed the title added batchbytes configs and unit testing for kafka [TT-15560] added batchbytes configs and unit testing for kafka Aug 25, 2025
@probelabs
Copy link
Copy Markdown
Contributor

probelabs Bot commented Oct 8, 2025

🔍 Code Analysis Results

This PR introduces a new configuration option, BatchBytes, for the Kafka pump. This allows users to control the maximum size of a message batch sent to Kafka, which can be used for performance tuning. The change includes adding the configuration field, logic to handle it during pump initialization, comprehensive unit tests for the new functionality, and updating the documentation.

Files Changed Analysis

  • README.md: Updated to document the new batch_bytes configuration option for the Kafka pump, including examples for both JSON configuration and environment variables.
  • pumps/kafka.go: The core logic change is here. A BatchBytes field was added to the KafkaConf struct. The Init function was updated to apply this value to the Kafka writer configuration, with a check to prevent negative values.
  • pumps/kafka_test.go: A new test file was added, containing a comprehensive suite of unit tests for the BatchBytes feature. The tests cover various scenarios, including valid values (custom, zero, large), handling of negative values, environment variable overrides, and ensuring no regressions in other configurations. The addition of 227 lines of tests demonstrates a focus on quality and correctness.

Architecture & Impact Assessment

What this PR accomplishes

This PR enhances the Kafka pump by exposing the BatchBytes configuration of the underlying kafka-go library. This gives operators finer control over the producer's batching behavior, allowing them to tune for higher throughput or lower memory usage depending on their workload.

Key technical changes introduced

  1. Configuration: A new integer field BatchBytes is added to the KafkaConf struct in pumps/kafka.go.
  2. Initialization Logic: The KafkaPump.Init method now reads the BatchBytes value. It includes validation to log an error if the value is negative and proceeds with the default (0), which lets the kafka-go library use its default (1MB).
  3. Testing: A new test file pumps/kafka_test.go provides thorough unit test coverage for the new configuration.
  4. Documentation: The README.md is updated to reflect the new configuration option.

Affected system components

  • Kafka Pump: The primary component affected is the Kafka pump, specifically its configuration and initialization logic.
  • User Configuration: Users of the Tyk Pump who utilize the Kafka pump will need to be aware of this new optional setting.

Configuration Flow Diagram

graph TD
    subgraph "User Configuration"
        A["pump.conf (JSON) <br> \"batch_bytes\": 1048576"]
        B["Environment Variable <br> TYK_PMP_PUMPS_KAFKA_META_BATCHBYTES"]
    end

    subgraph "Tyk Pump Initialization"
        C["KafkaPump.Init()"]
        D["kafkaConf.BatchBytes"]
        E{Validation <br> if BatchBytes < 0}
        F["writerConfig.BatchBytes = 0 <br> (kafka-go default)"]
        G["writerConfig.BatchBytes = kafkaConf.BatchBytes"]
    end

    subgraph "Kafka Producer"
        H["kafka.WriterConfig"]
        I["kafka.Writer"]
    end

    B -- Overrides --> A
    A --> C
    C --> D
    D --> E
    E -- Yes --> F
    E -- No --> G
    F --> H
    G --> H
    H --> I
Loading

Scope Discovery & Context Expansion

The changes are well-contained within the Kafka pump implementation (pumps/kafka.go) and its corresponding tests. The impact is limited to users of this specific pump. The addition of BatchBytes follows the existing pattern for other configuration parameters in KafkaConf, being configurable via both a configuration file and environment variables.

The new unit tests in pumps/kafka_test.go are a significant part of this PR and provide confidence in the change. They cover not only the happy path but also edge cases like zero, negative, and large values, as well as environment variable overrides. This testing strategy reduces the risk of regressions and ensures the feature works as documented.

A potential point of discussion is the handling of invalid (negative) input. The current implementation logs an error and uses a default value. An alternative approach would be to "fail fast" by returning an error from Init(), which would prevent the pump from starting with a misconfiguration. This is a design choice worth considering, as silently correcting invalid configuration can sometimes hide problems.

Metadata
  • Review Effort: 2 / 5
  • Primary Label: enhancement

Powered by Visor from Probelabs

Last updated: 2025-10-13T09:03:24.463Z | Triggered by: synchronize | Commit: 02c34db

@probelabs
Copy link
Copy Markdown
Contributor

probelabs Bot commented Oct 8, 2025

🔍 Code Analysis Results

Security Issues (2)

Severity Location Issue
🟡 Warning pumps/kafka.go:185-189
The `batch_bytes` configuration parameter lacks an upper limit. A malicious or misconfigured user could provide an extremely large value, causing the application to allocate excessive memory for the message batch. This could lead to an Out-of-Memory (OOM) error, resulting in a Denial of Service (DoS).
💡 SuggestionImplement a validation check to enforce a reasonable maximum value for `batch_bytes`. If the configured value exceeds this limit, the pump should either cap the value at the maximum and log a warning, or return an error during initialization to prevent starting with a potentially dangerous configuration.
🟡 Warning pumps/kafka.go:186-187
The application handles an invalid negative `batch_bytes` configuration by logging an error and then proceeding with a default value (0). This approach can mask configuration issues, as the error may be missed in logs, leading the service to run with unintended performance characteristics. A better practice for invalid configuration is to fail fast.
💡 SuggestionInstead of logging and continuing, return an error from the `Init` function when `k.kafkaConf.BatchBytes` is negative. This ensures that the application will not start in a misconfigured state, making the problem immediately apparent to the operator.
🔧 Suggested Fix
return fmt.Errorf("the config batch_bytes cannot be negative, but was set to %d", k.kafkaConf.BatchBytes)

Performance Issues (1)

Severity Location Issue
🟡 Warning pumps/kafka.go:185-189
The `batch_bytes` configuration parameter lacks an upper limit. A malicious or misconfigured user could provide an extremely large value, causing the application to allocate excessive memory for the message batch. This could lead to an Out-of-Memory (OOM) error, resulting in a Denial of Service (DoS).
💡 SuggestionImplement a validation check to enforce a reasonable maximum value for `batch_bytes`. If the configured value exceeds this limit, the pump should either cap the value at the maximum and log a warning, or return an error during initialization to prevent starting with a potentially dangerous configuration.

Quality Issues (2)

Severity Location Issue
🟡 Warning pumps/kafka.go:186-189
The application handles an invalid negative `batch_bytes` configuration by logging an error and then proceeding with a default value (0). This approach can mask configuration issues, as the error may be missed in logs, leading the service to run with unintended performance characteristics. A better practice for invalid configuration is to fail fast.
💡 SuggestionInstead of logging and continuing, return an error from the `Init` function when `k.kafkaConf.BatchBytes` is negative. This ensures that the application will not start in a misconfigured state, making the problem immediately apparent to the operator.
🔧 Suggested Fix
if k.kafkaConf.BatchBytes < 0 {
    return fmt.Errorf("the config batch_bytes cannot be negative, but was set to %d", k.kafkaConf.BatchBytes)
}
k.writerConfig.BatchBytes = k.kafkaConf.BatchBytes
🟡 Warning pumps/kafka.go:188-189
The `batch_bytes` configuration parameter lacks an upper limit. A malicious or misconfigured user could provide an extremely large value, causing the application to allocate excessive memory for the message batch. This could lead to an Out-of-Memory (OOM) error, resulting in a Denial of Service (DoS).
💡 SuggestionImplement a validation check to enforce a reasonable maximum value for `batch_bytes`. If the configured value exceeds this limit, the pump should either cap the value at the maximum and log a warning, or return an error during initialization to prevent starting with a potentially dangerous configuration.

Style Issues (1)

Severity Location Issue
🟡 Warning pumps/kafka.go:186-189
The application handles an invalid negative `batch_bytes` configuration by logging an error and then proceeding with a default value (0). This approach can mask configuration issues, as the error may be missed in logs, leading the service to run with unintended performance characteristics. A better practice for invalid configuration is to fail fast.
💡 SuggestionInstead of logging and continuing, return an error from the `Init` function when `k.kafkaConf.BatchBytes` is negative. This ensures that the application will not start in a misconfigured state, making the problem immediately apparent to the operator.
🔧 Suggested Fix
if k.kafkaConf.BatchBytes < 0 {
    return fmt.Errorf("the config batch_bytes cannot be negative, but was set to %d", k.kafkaConf.BatchBytes)
}
k.writerConfig.BatchBytes = k.kafkaConf.BatchBytes

Powered by Visor from Probelabs

Last updated: 2025-10-13T09:03:25.969Z | Triggered by: synchronize | Commit: 02c34db

@andrei-tyk andrei-tyk merged commit 33d9f48 into master Oct 13, 2025
88 of 91 checks passed
@andrei-tyk andrei-tyk deleted the TT-15560-Add-batchbytes-config-for-Kafka-pump branch October 13, 2025 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants