[FSTORE-1962] Automatically add feature group topics with "_onlinefs" sufix if users provide a topic that does not end with "onlinefs"#826
Conversation
There was a problem hiding this comment.
Pull request overview
Adds automatic enforcement of the _onlinefs Kafka topic suffix for online-enabled feature groups, appending it when missing and warning the user.
Changes:
- Introduces topic name validation during feature group metadata save to append
_onlinefswhen required. - Adds unit tests covering suffix auto-append, no-op when present, and
Nonehandling. - Updates Python docstrings to document the
_onlinefssuffix requirement and auto-append behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| python/hsfs/core/feature_group_engine.py | Adds _onlinefs suffix constant, validation/warning logic, and hooks validation into save path. |
| python/tests/core/test_feature_group_engine.py | Adds tests to ensure validation appends suffix + emits warning, and no warning for valid topics/None. |
| python/hsfs/feature_store.py | Updates feature group creation docstrings to document required suffix and auto-append behavior. |
| python/hsfs/feature_group.py | Updates constructor/property docstrings to document _onlinefs suffix requirement and auto-append behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
bubriks
left a comment
There was a problem hiding this comment.
Not too related to this pr, but just FYI some users are asking for possibility to change fg topic after fg is created (right now only possibly by running sql commands, this would be new jira so not an issue).
| corrected = feature_group.topic_name + _ONLINE_TOPIC_NAME_SUFFIX | ||
| warnings.warn( | ||
| f"Topic name '{feature_group.topic_name}' does not end with " | ||
| f"required suffix '{_ONLINE_TOPIC_NAME_SUFFIX}'. The suffix has " | ||
| f"been added automatically. Topic name set to: '{corrected}'.", | ||
| util.FeatureGroupWarning, | ||
| stacklevel=2, | ||
| ) | ||
| feature_group.topic_name = corrected |
There was a problem hiding this comment.
by default onlinefs service consumes based on matching suffix, but it can be also made to consume from specific topics (without using _onlinefs suffix)
Not sure if used currently, but depending on the user it might be important
| import polars as pl | ||
| from hsfs.transformation_function import TransformationFunction | ||
|
|
||
| _ONLINE_TOPIC_NAME_SUFFIX = "_onlinefs" |
There was a problem hiding this comment.
right now user topics are generated like this: https://github.com/logicalclocks/hopsworks-ee/blob/ffb374fd90b5c98d65c2e12191f29351ad4be5c1/hopsworks-common/src/main/java/io/hops/hopsworks/common/hdfs/Utils.java#L191
basicaly backend decides what should be returned to the user, maybe same should be done here. (aka. do we want user provided topic to be explicit or implicit)
| and feature_group.topic_name is not None | ||
| and not feature_group.topic_name.endswith(_ONLINE_TOPIC_NAME_SUFFIX) | ||
| ): | ||
| corrected = feature_group.topic_name + _ONLINE_TOPIC_NAME_SUFFIX |
There was a problem hiding this comment.
if we correct maybe we should allow user to disable corrections? in case they want to use that exact name without suffix.
This PR automatically adds the '_onlinefs' prefix to topic provided to a feature group so that online fs can process it.
JIRA Issue: https://hopsworks.atlassian.net/browse/FSTORE-1962
Priority for Review: -
Related PRs: -
How Has This Been Tested?
Checklist For The Assigned Reviewer: