feat(embed): add query runner #39517
Size Change: +53 B (0%) Total Size: 3.05 MB
9 files reviewed, 2 comments
Left a couple of comments. I'll review the query runner properly on Monday. Might be worth getting the DW folks to look at the HogQL changes.
Just update the table name - the rest of the HogQL looks good!
Left a few comments, mostly around explaining how the query works given its relative complexity. The API is pretty intuitive but the query itself is harder to understand. Maybe tests would be the best description.
9ab5751 to 44baa96
First step to accessing the general embeddings table from HogQL queries, which unlocks fuzzy search and similarity work across products. There are still details to be worked out, notably around `product`, `document_type`, `document_id`, `timestamp` and `distance` for user-facing/dynamic use in WHERE and ORDER BY clauses. I think this means a new category of taxonomic property filter, and I simply didn't want to take on that headache right now, so there's a hacky half solution in place for now. Callers from inside the Django app can of course add whatever filters they like right to the AST prior to joining/calculating.

Also it lets users use us as a vector DB if we want (we'd need to give them a way to write to it tho):

ok maybe we're cooking
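
For anyone trying to picture the query shape discussed above, here is a rough, self-contained sketch of the filter-then-rank idea in plain Python. It is not the query runner from this PR and does not use HogQL or any PostHog API: `EmbeddingRow`, `cosine_distance` and `nearest_documents` are hypothetical names, cosine distance is an assumed metric, and the only things taken from the description are the column names (`product`, `document_type`, `document_id`, `timestamp`, `distance`).

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class EmbeddingRow:
    # Hypothetical row type; field names mirror the columns named in the PR description.
    product: str
    document_type: str
    document_id: str
    timestamp: datetime
    embedding: list


def cosine_distance(a, b):
    """Return 1 - cosine similarity (assumed metric, not confirmed by the PR)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a zero vector as unrelated rather than dividing by zero
    return 1.0 - dot / (norm_a * norm_b)


def nearest_documents(rows, query_embedding, *, product=None,
                      document_type=None, since=None, limit=10):
    """Filter first (the WHERE part), then rank by distance (the ORDER BY part)."""
    candidates = [
        r for r in rows
        if (product is None or r.product == product)
        and (document_type is None or r.document_type == document_type)
        and (since is None or r.timestamp >= since)
    ]
    scored = [(cosine_distance(r.embedding, query_embedding), r) for r in candidates]
    scored.sort(key=lambda pair: pair[0])  # smallest distance first
    return [(r.document_id, distance) for distance, r in scored[:limit]]


# Made-up sample data, purely for illustration.
rows = [
    EmbeddingRow("error_tracking", "issue", "a", datetime(2025, 1, 2), [1.0, 0.0]),
    EmbeddingRow("error_tracking", "issue", "b", datetime(2025, 1, 3), [0.0, 1.0]),
    EmbeddingRow("session_replay", "recording", "c", datetime(2025, 1, 3), [1.0, 0.0]),
    EmbeddingRow("error_tracking", "issue", "d", datetime(2024, 12, 1), [1.0, 0.0]),
]

results = nearest_documents(
    rows, [1.0, 0.0], product="error_tracking", since=datetime(2025, 1, 1)
)
```

The filters run before any distance is computed, which loosely matches the note about adding filters to the AST prior to joining/calculating; how that maps onto the real query is for the PR's tests to show.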
