Test on-the-fly query; fix bugs; add e2e build test#62

Merged
luciaquirke merged 6 commits into main from harden-otf
Nov 10, 2025


@luciaquirke luciaquirke commented Nov 7, 2025

  • Add support for module-wise queries
  • Basic query tests
  • Fix the partial run directory rename (shutil is more robust and handles name collisions by placing the partial directory inside the existing complete one)
  • Fix IndexConfig save to disk
  • Add E2E build CLI test
  • Fix a data batching bug that could sometimes cause an infinite loop
  • Use Path everywhere
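
For reference, a minimal sketch of the collision behavior mentioned in the rename fix, assuming the rename goes through `shutil.move` (the PR only says "shutil"). The directory and file names here are hypothetical and not taken from the repo.

```python
import shutil
import tempfile
from pathlib import Path


def finalize_run(partial: Path, final: Path) -> Path:
    """Move a partial run directory to its final location.

    Hypothetical helper, not the function used in this PR.
    Returns the path where the data actually ended up.
    """
    # shutil.move returns the real destination. If `final` already exists
    # as a directory, the source is moved *inside* it instead of raising.
    return Path(shutil.move(str(partial), str(final)))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    partial = root / "run.part"

    # First build: no collision, so the partial directory is simply renamed.
    partial.mkdir()
    (partial / "grads.bin").write_bytes(b"\x00")
    first = finalize_run(partial, root / "run")

    # Second build: "run" already exists, so the partial directory lands inside it.
    partial.mkdir()
    second = finalize_run(partial, root / "run")

    first_rel = first.relative_to(root).as_posix()
    second_rel = second.relative_to(root).as_posix()
```

With these names, the first call ends at `run` and the second at `run/run.part`, so nothing is overwritten when a complete directory already exists.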

TODO

  • Possibly rename module_wise to something clearer
  • Don't route indices through the gradient collector; use a nonlocal variable in the collection step or something similar
  • Make it tidier or more flexible somehow. Right now it's just strategies hard-coded as needed.
    • Separate query gradient transform for unit norm / mean beforehand?
    • Right now we do normalized mean of cosine sims but users might want sum of cosine sims...
  • Add better query tests
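
To make the mean vs. sum question concrete, here is a rough sketch of one possible reading: each index gradient is scored against a set of query gradients, and the per-query cosine similarities are reduced with either a mean or a sum. The function name, the `reduce` parameter, and the array shapes are assumptions for illustration, not the API in this PR.

```python
import numpy as np


def cosine_scores(index_grads: np.ndarray, query_grads: np.ndarray,
                  reduce: str = "mean", eps: float = 1e-12) -> np.ndarray:
    """Score each index gradient against all query gradients.

    index_grads: (N, D) array, query_grads: (Q, D) array.
    Returns an (N,) array of reduced cosine similarities.
    Hypothetical sketch; names and shapes are assumptions.
    """
    # Unit-normalize rows, guarding against zero-norm gradients.
    g_norm = np.maximum(np.linalg.norm(index_grads, axis=1, keepdims=True), eps)
    q_norm = np.maximum(np.linalg.norm(query_grads, axis=1, keepdims=True), eps)
    sims = (index_grads / g_norm) @ (query_grads / q_norm).T  # (N, Q)

    if reduce == "mean":
        return sims.mean(axis=1)
    if reduce == "sum":
        return sims.sum(axis=1)
    raise ValueError(f"unknown reduce: {reduce!r}")


index_grads = np.array([[1.0, 0.0], [0.0, 1.0]])
query_grads = np.array([[2.0, 0.0], [3.0, 0.0]])

mean_scores = cosine_scores(index_grads, query_grads, reduce="mean")
sum_scores = cosine_scores(index_grads, query_grads, reduce="sum")
```

The two reductions rank examples identically for a fixed query set and differ only by a factor of Q, so the choice mostly matters when scores are compared across query sets of different sizes.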

Follow-up:

  • Support alternative query IO for full grads (1-100GB files)
    • If unstructured memmaps are viable we can use them everywhere and provide the structured memmap on request as a view on top of them, no layout change required
  • Support module-wise index builds?
  • Remove the ScoreWriter ABC if we don't merge the CsvWriter soon. Currently we have an abstract class with 1 implementation :')
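
A small sketch of the memmap idea from the follow-up list, assuming gradients are stored as a flat float32 `(num_examples, total_dim)` array with each module's slice laid out contiguously. The module names and sizes are made up; the point is only that NumPy can expose a structured view over the same bytes without copying or changing the layout.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical per-module gradient sizes; not taken from the repo.
module_sizes = {"attn": 3, "mlp": 2}
num_examples = 4
total_dim = sum(module_sizes.values())

# One subarray field per module; packed, so itemsize == total_dim * 4 bytes.
struct_dtype = np.dtype(
    [(name, np.float32, (size,)) for name, size in module_sizes.items()]
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "grads.bin"

    # Unstructured on-disk layout: a plain (num_examples, total_dim) matrix.
    flat = np.memmap(path, dtype=np.float32, mode="w+",
                     shape=(num_examples, total_dim))
    flat[:] = np.arange(num_examples * total_dim,
                        dtype=np.float32).reshape(num_examples, total_dim)
    flat.flush()

    # Structured view over the same buffer: one record per example.
    structured = flat.view(struct_dtype).reshape(num_examples)

    mlp_first = structured["mlp"][0].tolist()
    shares_memory = bool(np.shares_memory(flat, structured))

    # Drop the mappings before the temporary directory is cleaned up.
    del structured, flat
```

Here `structured["mlp"]` is a `(num_examples, 2)` window onto the last two columns of the flat matrix, so code that wants per-module access and code that wants the full gradient can share one file.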

@luciaquirke luciaquirke changed the title from "Test query" to "Test on-the-fly query; fix bugs; add e2e build test" Nov 7, 2025
@luciaquirke luciaquirke requested a review from LouisYRYJ November 7, 2025 23:41
@luciaquirke luciaquirke force-pushed the harden-otf branch 3 times, most recently from d41ec9c to a9e8dfd Compare November 8, 2025 01:20
@luciaquirke luciaquirke merged commit defe3eb into main Nov 10, 2025
3 checks passed