Add support for QBit data type #595
Pull Request Overview
This PR adds comprehensive support for the experimental QBit data type, which stores float vectors as bit-transposed tuples for optimized vector search operations. The implementation handles the client-side matrix transposition logic, allowing users to work with standard Python lists of floats while the driver manages the wire-format bit packing.
Key Changes:
- Implemented QBit transformation layer supporting Float32, Float64, and BFloat16 element types
- Added NumPy-accelerated fast path for transpose operations (20-50x speedup)
- Included pure Python fallback for environments without NumPy
- Added SQLAlchemy integration for QBit type
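The bit transposition described in the overview can be sketched in pure Python for the Float32 case. This is only an illustration, not the code in `vector.py`: the function names are hypothetical, and the ordering conventions (plane 0 holds the most significant bit, and the first vector element maps to the highest bit of each byte) are assumptions that may differ from the actual wire format.

```python
import struct
from typing import List

BITS = 32  # width of a Float32 element


def transpose_float32(values: List[float]) -> List[bytes]:
    """Split a float vector into 32 bit planes (hypothetical helper).

    Plane p holds bit p of every element, with p = 0 assumed to be the
    most significant bit. Each plane is padded to a whole number of bytes.
    """
    # Reinterpret each float as its 32-bit IEEE 754 pattern.
    words = [struct.unpack("<I", struct.pack("<f", v))[0] for v in values]
    plane_len = (len(words) + 7) // 8
    planes = [bytearray(plane_len) for _ in range(BITS)]
    for i, word in enumerate(words):
        byte_idx = i // 8
        bit_mask = 0x80 >> (i % 8)  # assumed: element 0 -> highest bit of byte 0
        for p in range(BITS):
            if (word >> (BITS - 1 - p)) & 1:
                planes[p][byte_idx] |= bit_mask
    return [bytes(plane) for plane in planes]


def untranspose_float32(planes: List[bytes], dim: int) -> List[float]:
    """Inverse of transpose_float32 for a vector of `dim` elements."""
    words = [0] * dim
    for p, plane in enumerate(planes):
        for i in range(dim):
            if plane[i // 8] & (0x80 >> (i % 8)):
                words[i] |= 1 << (BITS - 1 - p)
    # Reinterpret each 32-bit pattern as a float again.
    return [struct.unpack("<f", struct.pack("<I", w))[0] for w in words]
```

Under these assumptions, each of the 32 planes corresponds to one `FixedString` element of the tuple, and a round trip through both helpers returns the original values for inputs that are exactly representable as Float32.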
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| clickhouse_connect/datatypes/vector.py | Core QBit implementation with bit transposition logic and NumPy optimization |
| clickhouse_connect/cc_sqlalchemy/datatypes/sqltypes.py | SQLAlchemy QBit type wrapper |
| clickhouse_connect/datatypes/__init__.py | Module registration for vector types |
| tests/unit_tests/test_qbit_type.py | Comprehensive unit tests for QBit transposition and edge cases |
| tests/integration_tests/test_vector.py | Integration tests for QBit round-trip operations and database interactions |
| tests/unit_tests/test_sqlalchemy/test_types.py | SQLAlchemy QBit type tests |
| tests/integration_tests/test_sqlalchemy/test_ddl.py | SQLAlchemy DDL integration test |
| CHANGELOG.md | Release notes entry |
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Summary
Adds full read/write support for the (currently) experimental `QBit` data type. `QBit` is a transformation type that stores vectors of `Float`s as bit-transposed `Tuple`s of `FixedString`s to optimize storage and search on the server. This implementation handles the matrix transposition logic on the client side, allowing users to interact with standard lists of floats while the driver handles the wire-format bit packing.
Key implementation details
Transformation architecture
- […] `QBit` not as a standalone storage container, but as a transformation layer.
- […] `FixedString` buffers, and delegates the actual wire serialization to the existing `Tuple` and `FixedString` serializers.
- […] `Tuple(FixedString)` from the wire and untransposes the bit planes back into standard Python lists of floats.
Type support
- […] `QBit` element types: `Float32`, `Float64`, and `BFloat16`
- `BFloat16` is handled via bit manipulation (truncating/shifting `Float32`s) since Python lacks a native type.
Performance
This transformation is quite CPU-intensive for interpreted languages like Python, so if you're using the `QBit` type, it is HIGHLY recommended to have NumPy installed in your environment, because: […]
Pure Python fallback
If you can't install NumPy in your environment, there is still a robust pure-Python fallback, but it is not going to be fast.
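In the same standard-library-only spirit, the `BFloat16` handling mentioned under type support (truncating/shifting `Float32`s) might look roughly like the sketch below. The helper names are hypothetical, and plain truncation with no rounding is an assumption; the driver's actual conversion may treat the discarded bits differently.

```python
import struct


def float_to_bf16_bits(value: float) -> int:
    """Return the 16-bit BFloat16 pattern for `value` (hypothetical helper).

    Assumes simple truncation: the value is encoded as Float32 and the low
    16 bits are dropped, keeping the sign, the exponent and the top 7
    mantissa bits.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", value))[0]
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit BFloat16 pattern back into a Python float."""
    # Shift the pattern into the upper half of a Float32 word; the lower
    # 16 mantissa bits are zero-filled.
    bits32 = (bits16 & 0xFFFF) << 16
    return struct.unpack("<f", struct.pack("<I", bits32))[0]
```

Because BFloat16 keeps only 7 explicit mantissa bits, a round trip is lossy for most inputs: under the truncation assumption, 3.14159 comes back as 3.140625.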
Verification
References
Checklist
Delete items not relevant to your PR:
Closes #570