[FSTORE-1998] Add HQSLocalClient for in-process query execution with Arrow Flight f…#871
Open
jimdowling wants to merge 3 commits into
…allback

When the hqs library is installed and running inside Hopsworks (internal client), queries are executed in-process via hqs.HQSClient, skipping the Arrow Flight network hop. Cross-project queries (identified by hqs_payload_signature) are forwarded to the Arrow Flight server which has the private key and superuser HDFS access for decrypting connectors.

- Add _should_use_local_hqs() detection (checks hqs importable + internal)
- Add HQSLocalClient with hybrid routing (local vs Arrow Flight fallback)
- Lazy ArrowFlightClient initialization (only on first signed query)
- Fix bug in _disable_feature_query_service_client (was calling method on None)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
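The hybrid routing and lazy fallback described in this commit could look roughly like the sketch below. It is not the PR's code: `HybridRouterSketch`, `run_local`, `make_flight_client`, and the `query_string` key are hypothetical names used for illustration. Only the `hqs_payload_signature` field and the local-versus-Arrow-Flight split come from the commit message.

```python
from typing import Any, Callable, Dict, Optional

Query = Dict[str, Any]
Runner = Callable[[Query], Any]


class HybridRouterSketch:
    """Hypothetical stand-in for the routing idea, not the real HQSLocalClient."""

    def __init__(self, run_local: Runner, make_flight_client: Callable[[], Runner]) -> None:
        self._run_local = run_local
        self._make_flight_client = make_flight_client
        self._flight_client: Optional[Runner] = None  # created on demand

    def execute(self, query: Query) -> Any:
        # A non-empty signature marks a cross-project query: forward it.
        if query.get("hqs_payload_signature"):
            if self._flight_client is None:
                # Lazy initialization: built only on the first signed query.
                self._flight_client = self._make_flight_client()
            return self._flight_client(query)
        # Unsigned queries stay in-process and skip the network hop.
        return self._run_local(query)
```

In this sketch a session that never issues a signed query never pays for constructing the remote client.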
…into hqs_as_library
pyproject.toml: add `hqs` extra that pulls hopsworks[python] plus
hqs[snowflake,bigquery,postgres,mysql,oracle,redshift,delta-gcs] for
parity with the flyingduck Flight server image.
arrow_flight_client.py:
- Replace find_spec("hqs") activation heuristic with an explicit gate
  that checks _is_external(), the cluster's enable_flyingduck flag, and
  importability of both hqs and duckdb. Mere presence of hqs in the
  install set is no longer sufficient.
- get_instance() now constructs ArrowFlightClient first to read the
  cluster flag, then conditionally swaps to HQSLocalClient and passes
  the remote instance as a fallback (no lazy reconstruction).
- HQSLocalClient uses HQSClient.from_pod_environment to derive a 50%
  cgroup memory cap so a heavy join does not OOM the pod.
- Read through the public hqs_client.hopsfs property instead of
  reaching into _engine.
- Drop blanket Exception catches that were swallowing FeatureStoreException.
- Drop dead no-op setters for host_url / timeout / health_check_timeout.
- _disable_for_session now clears _flight_fallback and sets
  _disabled_for_session so a stale handle cannot resurrect routing.
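
For illustration, the explicit activation gate from the first bullet might be sketched as follows. This is an assumption-laden sketch rather than the PR's implementation: the function name `should_use_local_hqs`, its parameters, and the use of `importlib.util.find_spec` as the importability check are all hypothetical. The three conditions themselves (internal client, cluster flag enabled, hqs and duckdb importable) are taken from the commit message.

```python
import importlib.util
from typing import Callable


def should_use_local_hqs(
    is_external: bool,
    flyingduck_enabled: bool,
    find_spec: Callable[[str], object] = importlib.util.find_spec,
) -> bool:
    """Return True only when every gate passes (hypothetical helper)."""
    if is_external:
        # External clients always go through the Arrow Flight server.
        return False
    if not flyingduck_enabled:
        # The cluster-level flag must be on.
        return False
    # Both libraries must be importable; hqs alone is not sufficient.
    return all(find_spec(name) is not None for name in ("hqs", "duckdb"))
```

Passing `find_spec` as a parameter is a choice made for this sketch so that the gate can be exercised without installing either library.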
tests/core/test_arrow_flight_client.py: add 12 tests covering the
activation gate (external, cluster-disabled, hqs missing, duckdb
missing, all gates pass), get_instance routing (external returns
ArrowFlightClient; internal+gates+local hqs swaps to HQSLocalClient
with fallback wired), and HQSLocalClient methods (unsigned executes
locally, signed forwards to fallback, _disable_for_session drops
fallback, __init__ uses from_pod_environment).
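
A few of the HQSLocalClient method tests listed above might look roughly like the sketch below, written against a simplified stand-in rather than the real class. `LocalClientSketch`, its `read_query` method, and the `RuntimeError` raised when no fallback remains are assumptions made for this example. The `_flight_fallback` and `_disabled_for_session` attributes and the three behaviours under test come from the commit message.

```python
from unittest.mock import MagicMock


class LocalClientSketch:
    """Hypothetical, simplified stand-in for HQSLocalClient."""

    def __init__(self, hqs_client, flight_fallback):
        self._hqs_client = hqs_client
        self._flight_fallback = flight_fallback
        self._disabled_for_session = False

    def read_query(self, query):
        if query.get("hqs_payload_signature"):
            if self._flight_fallback is None:
                # Assumed behaviour for this sketch only.
                raise RuntimeError("no Arrow Flight fallback available")
            return self._flight_fallback.read_query(query)
        return self._hqs_client.read_query(query)

    def _disable_for_session(self):
        # Drop the handle so a stale reference cannot resurrect routing.
        self._flight_fallback = None
        self._disabled_for_session = True


def test_unsigned_executes_locally():
    local, remote = MagicMock(), MagicMock()
    LocalClientSketch(local, remote).read_query({"query_string": "SELECT 1"})
    local.read_query.assert_called_once()
    remote.read_query.assert_not_called()


def test_signed_forwards_to_fallback():
    local, remote = MagicMock(), MagicMock()
    LocalClientSketch(local, remote).read_query({"hqs_payload_signature": "sig"})
    remote.read_query.assert_called_once()
    local.read_query.assert_not_called()


def test_disable_for_session_drops_fallback():
    client = LocalClientSketch(MagicMock(), MagicMock())
    client._disable_for_session()
    assert client._flight_fallback is None
    assert client._disabled_for_session is True
```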
This PR adds/fixes/changes...
JIRA Issue: -
Priority for Review: -
Related PRs: -
How Has This Been Tested?
Checklist For The Assigned Reviewer: