@rob-1019 rob-1019 commented Dec 17, 2025

Extends CreateKafkaTopicsStep to detect and automatically increase partition counts for existing topics that have fewer partitions than configured.

Previously, the upgrade step only created new topics and skipped existing ones. Now it checks existing topics and increases their partition count if they're under-provisioned. The step fails if partition count checks fail, ensuring operators are alerted to configuration issues. Note that Kafka does not support reducing partition counts, so topics with more partitions than configured are left unchanged with a warning.
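As a rough illustration of the decision described above (not the code in this PR), the per-topic check can be sketched in plain Java. The class name `PartitionPlanSketch`, the `Action` enum, and both method names are hypothetical; the real step works through Kafka's admin client, which is left out here so the sketch stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: names here are illustrative and do not come from the PR.
final class PartitionPlanSketch {
    enum Action { INCREASE, LEAVE_UNCHANGED, WARN_CANNOT_REDUCE }

    // Compare the current partition count of an existing topic with the configured one.
    static Action decide(int currentPartitions, int desiredPartitions) {
        if (currentPartitions < desiredPartitions) {
            return Action.INCREASE; // under-provisioned: upsize
        }
        if (currentPartitions > desiredPartitions) {
            return Action.WARN_CANNOT_REDUCE; // Kafka cannot shrink a topic
        }
        return Action.LEAVE_UNCHANGED;
    }

    // Build a topic -> target partition count map for existing topics that need upsizing.
    static Map<String, Integer> partitionsToIncrease(
            Map<String, Integer> currentCounts, Map<String, Integer> desiredCounts) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : desiredCounts.entrySet()) {
            Integer current = currentCounts.get(entry.getKey());
            // Topics with no current count do not exist yet and go through creation instead.
            if (current != null && decide(current, entry.getValue()) == Action.INCREASE) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

With current counts of 1, 6, and 3 for topics `a`, `b`, and `c`, and a configured count of 3 for each, only `a` lands in the returned map; `b` is the over-provisioned case that would only be logged.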

Changes:

  • Add getCurrentPartitionCount() method to check existing partition counts

  • Track partitionsToIncrease map for topics that need upsizing

  • Track failedTopics list and throw exception if any checks fail

  • Add comprehensive test coverage for upsizing and failure scenarios

  • Update kafka-config.md to document automatic partition upsizing behavior

  • The PR conforms to DataHub's Contributing Guideline (particularly PR Title Format)

  • Links to related issues (if applicable)

  • Tests for the changes have been added/updated (if applicable)

  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.

  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the docs and community-contribution labels Dec 17, 2025
… auto-upsize

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
codecov bot commented Dec 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

desiredPartitions);
topicsToSkip.add(topicName);
} else {
log.info(
Collaborator

This logging is not required, as this was already the default behavior before this change.

Contributor Author

Cleaned this up a bit. PTAL.

@datahub-cyborg datahub-cyborg bot added the pending-submitter-response label and removed the needs-review label Dec 19, 2025
+ "These topics may not have the correct partition configuration.",
failedTopics.size(), failedTopics);
log.error(errorMessage);
throw new RuntimeException(errorMessage);
Collaborator

IMO, we should not throw an exception here; logging a warning is sufficient. It should not be a blocker for topic creation.

Contributor Author

I feel pretty strongly that failure to configure these items during a setup job should be fatal if the user has this functionality enabled. The same goes for topic creation and most other things setup jobs do: it's better to fail loudly. Failures may get auto-triaged by restart attempts; quiet ignores can't be.
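To make the trade-off in this thread concrete, here is a minimal sketch of the "collect, then fail loudly" pattern: every topic is still attempted, and a single exception is raised at the end if anything failed. The class `FailLoudlySketch`, the method `processAll`, and the exception type and message are assumptions for illustration; the PR itself throws a `RuntimeException` with its own message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: not the PR's implementation.
final class FailLoudlySketch {
    // Run perTopicWork for every topic, remembering which ones failed,
    // then fail the whole step once all topics have been attempted.
    static void processAll(List<String> topics, Consumer<String> perTopicWork) {
        List<String> failedTopics = new ArrayList<>();
        for (String topic : topics) {
            try {
                perTopicWork.accept(topic);
            } catch (RuntimeException e) {
                // A failure on one topic does not stop work on the others.
                failedTopics.add(topic);
            }
        }
        if (!failedTopics.isEmpty()) {
            throw new IllegalStateException(
                String.format("Failed to process %d topic(s): %s",
                    failedTopics.size(), failedTopics));
        }
    }
}
```

The warn-only alternative would replace the final `throw` with a log statement, which is exactly the point under debate: the job would then exit successfully and nothing would trigger a retry.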

if (existingTopics.contains(topicName)) {
  log.info("Topic {} already exists - skipping creation", topicName);
  topicsToSkip.add(topicName);
  // For existing topics, check if partition count needs to be increased
Collaborator

IMO, this new block should be extracted into its own function for readability.

Contributor Author

Take a look and let me know if that doesn't address the core concern here. I made the log messages a bit more consistent to ease grepping for the various categories of outcome.

@deepgarg760
Collaborator

IMO, this should be controlled by a feature flag, e.g. DATAHUB_AUTO_INCREASE_PARTITIONS

  topicsToSkip.add(topicName);
  // For existing topics, check if partition count needs to be increased
  try {
    int currentPartitions = getCurrentPartitionCount(adminClient, topicName);
Collaborator

Instead of making a separate API call for each existing topic, can this be batched? The goal is a single API call for all existing topics.

Contributor Author

Done.
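For readers following along, the shape of the batched lookup requested in this thread might look like the sketch below. It is an assumption-laden illustration, not the PR's code: `BatchDescribeSketch` and `fetchPartitionCounts` are hypothetical names, and the `describeBatch` function stands in for a single metadata request (Kafka's admin client offers `describeTopics`, which accepts a collection of topic names). Partitions are modeled as plain integer lists so the example runs without a broker.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: the describeBatch parameter is a stand-in for one admin-client call.
final class BatchDescribeSketch {
    // Issue ONE batched describe for all existing topics and derive a
    // topic -> partition count map from the returned partition lists.
    static Map<String, Integer> fetchPartitionCounts(
            Collection<String> existingTopics,
            Function<Collection<String>, Map<String, List<Integer>>> describeBatch) {
        Map<String, Integer> counts = new HashMap<>();
        if (existingTopics.isEmpty()) {
            return counts; // nothing to describe, so skip the round trip entirely
        }
        Map<String, List<Integer>> described = describeBatch.apply(existingTopics);
        for (Map.Entry<String, List<Integer>> entry : described.entrySet()) {
            counts.put(entry.getKey(), entry.getValue().size());
        }
        return counts;
    }
}
```

The per-topic version would call the lookup once inside the loop over topics; here the loop only reads from the result of one call, which is the 1-call-versus-N difference.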

Automatically increases partition counts for existing Kafka topics when
configured values exceed current counts during upgrades. Adds
DATAHUB_AUTO_INCREASE_PARTITIONS flag for operator control.

Key changes:
- Add DATAHUB_AUTO_INCREASE_PARTITIONS environment variable (defaults to true)
- Batch describe topics API call for efficiency (1 call vs N calls)
- Extract fetchPartitionCountsForExistingTopics method for readability
- Improve logging with consistent "Checking kafka topic" format
- Change partition reduction log from WARN to ERROR level
- Add comprehensive test coverage for auto-increase scenarios

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@datahub-cyborg datahub-cyborg bot added the needs-review label and removed the pending-submitter-response label Dec 19, 2025
@rob-1019
Contributor Author

IMO, this should be controlled by a feature flag, e.g. DATAHUB_AUTO_INCREASE_PARTITIONS

Excellent suggestion. Done. Overloading that was naughty.
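A flag like this, defaulting to true when unset, could be read along the following lines. This is only a sketch under assumptions: `AutoIncreaseFlagSketch` and `isAutoIncreaseEnabled` are hypothetical, the environment is passed in as a map so the logic is testable, and how the PR actually binds the variable to its configuration is not shown in this conversation.

```java
import java.util.Map;

// Hypothetical sketch: the PR's real wiring of this flag is not shown in the thread.
final class AutoIncreaseFlagSketch {
    static final String ENV_NAME = "DATAHUB_AUTO_INCREASE_PARTITIONS";

    // Returns true when the variable is unset or blank (the documented default),
    // otherwise parses it as a boolean. Any value other than "true"
    // (case-insensitive) disables the feature.
    static boolean isAutoIncreaseEnabled(Map<String, String> env) {
        String raw = env.get(ENV_NAME);
        if (raw == null || raw.isBlank()) {
            return true;
        }
        return Boolean.parseBoolean(raw.trim());
    }
}
```

In a real process the map would be `System.getenv()`. Note that `Boolean.parseBoolean` treats values such as `yes` or `1` as false, so a stricter parser may be preferable if operators are likely to use them.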

Add new kafka.setup.autoIncreasePartitions property to the non-sensitive
properties list in PropertiesCollectorConfigurationTest to fix test failure.

This property controls automatic partition increases for Kafka topics and
does not contain sensitive data.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
codecov bot commented Dec 19, 2025

Bundle Report

Changes will increase total bundle size by 5.35kB (0.02%) ⬆️. This is within the configured threshold ✅

Detailed changes

| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 31.28MB | 5.35kB (0.02%) ⬆️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | 6.21kB | 19.25MB | 0.03% |
| assets/index-*.css | -857 bytes | 608.59kB | -0.14% |

Update error message in CreateKafkaTopicsStep to accurately reflect that failedTopics tracks all types of topic failures (creation, configuration, partition count checks), not just partition count verification.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Labels

community-contribution, docs, needs-review

2 participants