
Add on-the-fly queries#47

Merged
luciaquirke merged 12 commits into main from query-2
Oct 16, 2025
Conversation

luciaquirke (Collaborator) commented Oct 13, 2025

  • Add on the fly queries
  • Precompute query dataset if not already available
  • Use .part extension for in-progress runs (closes Use .part extension for in-progress index runs #49)
  • You can now technically torch.compile the model when projection_dim=0 and save_index=False, but this slows down the build, and projection_dim=0 is very memory-hungry
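The `.part` convention can be sketched as writing to a temporary path and renaming on completion; the helper name and call site below are assumptions for illustration, not the PR's actual code:

```python
import os


def write_index_shard(path: str, data: bytes) -> None:
    """Write an index shard via a .part file so an interrupted run never
    leaves behind a file that looks complete."""
    tmp_path = path + ".part"
    with open(tmp_path, "wb") as f:
        f.write(data)
    # Atomic on POSIX: readers see either no file or the finished file,
    # and a crashed run leaves only the .part file behind.
    os.replace(tmp_path, path)
```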

TODO

  • Extract the query dataset assembly into an example script, and consider making it a more official tool in the future

Notes

If keeping the extra gradients in VRAM before the query callback causes problems, we can add something like

offload_to_cpu: bool = False
"""If True, keep value gradients on CPU until the query callback is called."""

But the extra VRAM usage is only a small increase.
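A minimal sketch of how such a flag might gate the gradient stash (the function and dict names are hypothetical, not the PR's code):

```python
import torch


def stash_value_grad(
    grads: dict,
    name: str,
    g: torch.Tensor,
    offload_to_cpu: bool = False,
) -> None:
    """Hypothetical sketch: optionally park value gradients on CPU until queried."""
    if offload_to_cpu:
        # Device-to-host copy frees VRAM, at the cost of a copy back at query time.
        grads[name] = g.to(device="cpu", non_blocking=True)
    else:
        # Default: keep the gradient in VRAM (a small memory increase, per the note).
        grads[name] = g
```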

@@ -0,0 +1,454 @@
import os

luciaquirke (Collaborator, Author):
modified copy of build.py

luciaquirke changed the base branch from query to main on October 13, 2025
luciaquirke force-pushed the query-2 branch 2 times, most recently from 0b2edc9 to 07f6af8 on October 14, 2025

# Asynchronously move the gradient to CPU and convert to fp16
mod_grads[name] = g.to(device="cpu", dtype=dtype, non_blocking=True)
if save_index:

luciaquirke (Collaborator, Author):
Avoid the round trip to cpu
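One caveat worth noting about the `non_blocking` copy in this hunk: a device-to-host transfer only truly overlaps with compute when the destination is pinned (page-locked) memory; otherwise PyTorch falls back to an effectively synchronous copy. A sketch of an explicitly pinned staging buffer (not the PR's code):

```python
import torch


def offload_to_pinned(g: torch.Tensor, dtype: torch.dtype = torch.float16) -> torch.Tensor:
    """Copy a gradient to a pinned CPU buffer so non_blocking can overlap compute."""
    pin = torch.cuda.is_available()  # pinning requires a CUDA runtime
    out = torch.empty(g.shape, dtype=dtype, pin_memory=pin)
    out.copy_(g.to(dtype), non_blocking=True)
    return out
```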


precision: Literal["auto", "bf16", "fp16", "fp32", "int4", "int8"] = "auto"
"""Precision to use for the model parameters."""
"""Precision (dtype) to use for the model parameters."""

luciaquirke (Collaborator, Author):
improve searchability
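For illustration, the float variants of this option could resolve to torch dtypes roughly as follows. This is a sketch: `"auto"` and the int modes are assumed to be handled elsewhere (e.g. by the checkpoint or a quantization backend), which is not shown in the diff.

```python
import torch

# Assumed mapping for the float precision modes only.
FLOAT_DTYPES = {
    "bf16": torch.bfloat16,
    "fp16": torch.float16,
    "fp32": torch.float32,
}


def resolve_precision(precision: str):
    """Return an explicit dtype, or None for modes resolved elsewhere
    ('auto', 'int4', 'int8')."""
    return FLOAT_DTYPES.get(precision)
```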

luciaquirke force-pushed the query-2 branch 3 times, most recently from 9fe9bff to 353a1d7 on October 16, 2025
dtype=dtype,
fill_value=0.0,
)
per_doc_scores = torch.full(

luciaquirke (Collaborator, Author):
Only support one score per doc, i.e. don't support computing module scores separately for now
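The comment above can be illustrated with a toy version of the buffer: a single aggregated score per (document, query) pair, into which every module's contribution is summed. Shapes and names here are assumptions, not the PR's actual values.

```python
import torch

num_docs, num_queries = 100, 4  # toy sizes

# One preallocated buffer for all modules, rather than one buffer per module.
per_doc_scores = torch.full(
    (num_docs, num_queries),
    fill_value=0.0,
    dtype=torch.float32,
)

# Each module accumulates into the same buffer:
module_scores = torch.ones(num_docs, num_queries)
per_doc_scores += module_scores
```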

"""Number of examples to use for estimating processor statistics."""

- drop_columns: bool = False
+ drop_columns: bool = True

luciaquirke (Collaborator, Author):
Prevent duplicating entire dataset on disk by default
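A toy illustration of the trade-off (no real tokenizer; the row schema is made up): keeping the source columns means every mapped record carries a second copy of the text, which is what ends up duplicated on disk.

```python
def tokenize_rows(rows, drop_columns=True):
    """Map text rows to toy 'token ids', optionally dropping the source text."""
    out = []
    for row in rows:
        new = {"input_ids": [len(w) for w in row["text"].split()]}
        if not drop_columns:
            new.update(row)  # keeps "text" alongside "input_ids"
        out.append(new)
    return out
```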

luciaquirke changed the title from [WIP] Add on-the-fly queries to Add on-the-fly queries on October 16, 2025
luciaquirke force-pushed the query-2 branch 3 times, most recently from 6405510 to 9de0ecb on October 16, 2025
luciaquirke merged commit d3dba3b into main on October 16, 2025
3 checks passed
luciaquirke deleted the query-2 branch on November 17, 2025


Development

Successfully merging this pull request may close these issues.

Use .part extension for in-progress index runs

1 participant