Contributor

@oliverb123 oliverb123 commented Oct 11, 2025

First step toward accessing the general embeddings table from HogQL queries, which unlocks fuzzy search and similarity work across products. There are still details to be worked out:

  • Exposing product, document_type, document_id, timestamp and distance for user-facing/dynamic use in WHERE and ORDER BY clauses. I think this means a new category of taxonomic property filter, and I simply didn't want to take on that headache right now, so there's a hacky half-solution in place for now. Callers from inside the Django app can of course add whatever filters they like directly to the AST prior to joining/calculating.
  • Adding a k8s service to the embedding worker deployment, so talking to it works in prod.

Also, it lets users use us as a vector DB if we want (we'd need to give them a way to write to it, though):
image

ok maybe we're cooking
image
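
For reviewers less familiar with similarity search, here is a rough sketch of what "filter on product, then order by distance" amounts to. This is plain Python for illustration only, not the query runner or the HogQL in this PR: the `Row` shape, the `nearest` helper, the use of cosine distance, and the product names are all assumptions made up for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical stand-in for a row of the embeddings table.
    product: str
    document_type: str
    document_id: str
    embedding: list[float]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(rows, query, product, limit=3):
    """Filter by product (the WHERE part), then sort by distance (the ORDER BY part)."""
    candidates = [r for r in rows if r.product == product]
    scored = [(cosine_distance(r.embedding, query), r.document_id) for r in candidates]
    return sorted(scored)[:limit]
```

Calling `nearest(rows, query_embedding, "product_a")` returns `(distance, document_id)` pairs, closest first. A real vector search would push both the filter and the sort down into the database rather than doing them in application code.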

@oliverb123 oliverb123 requested a review from a team as a code owner October 11, 2025 01:36
@posthog-bot posthog-bot requested a review from a team October 11, 2025 01:37
Contributor

github-actions bot commented Oct 11, 2025

Size Change: +53 B (0%)

Total Size: 3.05 MB

Filename | Size | Change
frontend/dist/toolbar.js | 3.05 MB | +53 B (0%)

compressed-size-action

Contributor

@greptile-apps greptile-apps bot left a comment

9 files reviewed, 2 comments

Contributor

@daibhin daibhin left a comment

Left a couple of comments. I'll review the query runner properly on Monday. Might be worth getting the DW folks to look at the HogQL changes.

Member

@Gilbert09 Gilbert09 left a comment

Just update the table name - the rest of the HogQL looks good!

Contributor

@daibhin daibhin left a comment

Left a few comments, mostly around explaining how the query works given its relative complexity. The API is pretty intuitive, but the query itself is harder to understand. Maybe tests would be the best description.

@oliverb123 oliverb123 force-pushed the embed/add-query-runner branch from 9ab5751 to 44baa96 Compare October 15, 2025 13:06
@oliverb123 oliverb123 merged commit 9471397 into master Oct 16, 2025
254 of 260 checks passed
@oliverb123 oliverb123 deleted the embed/add-query-runner branch October 16, 2025 10:20
@oliverb123 oliverb123 restored the embed/add-query-runner branch October 16, 2025 11:11
3 participants