Open
Labels
api (API design discussions), ecosystem (Comp-chem ecosystem related), enhancement (New feature or request), feature (Entirely new features, not improvements to existing ones)
Description
context
about 1 in 1e5 geometry optimizations goes off the rails (usually an issue with FIRE step size or MLIP stress predictions): the cell distorts in strange ways, the volume collapses, and the neighbor-list edge count explodes until the job runs out of memory (OOM). this can bring down nodes doing long-running structure searches.
a possible current workaround is modifying the convergence check to monitor how many times the cell gets tiled for neighbor list construction and abort if it exceeds some threshold (e.g. 1k).
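as a rough illustration of that workaround, here is a minimal sketch of a tiling estimate that a convergence check could call. it is not torch-sim code: `estimate_neighbor_tiles` and `tiling_exceeded` are hypothetical names, and the count assumes the neighbor list replicates the cell `ceil(cutoff / height)` times in each direction along every lattice vector, which may differ from how a given neighbor list implementation actually tiles.

```python
import math
from typing import Sequence

Vec = Sequence[float]


def _cross(u: Vec, v: Vec) -> tuple:
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _dot(u: Vec, v: Vec) -> float:
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _norm(u: Vec) -> float:
    return math.sqrt(_dot(u, u))


def estimate_neighbor_tiles(cell: Sequence[Vec], cutoff: float) -> int:
    """Estimate how many periodic images cover `cutoff` (hypothetical helper).

    `cell` holds the three lattice vectors as rows. For each lattice
    direction, the perpendicular height of the cell is
    volume / |cross product of the other two vectors|; a thin cell has a
    small height and therefore needs many images along that direction.
    """
    a, b, c = cell
    volume = abs(_dot(a, _cross(b, c)))
    if volume == 0.0:
        raise ValueError("cell is singular (zero volume)")
    tiles = 1
    for u, v in ((b, c), (c, a), (a, b)):
        height = volume / _norm(_cross(u, v))
        n_images = math.ceil(cutoff / height)
        tiles *= 2 * n_images + 1  # images on both sides plus the home cell
    return tiles


def tiling_exceeded(cell: Sequence[Vec], cutoff: float, max_tiles: int = 1000) -> bool:
    """Return True if the estimated tile count is above the abort threshold."""
    return estimate_neighbor_tiles(cell, cutoff) > max_tiles
```

a healthy 10 Å cubic cell with a 5 Å cutoff stays far below the threshold, while a cell flattened to 0.01 Å along one axis blows well past it, which is the failure mode described above.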
proposal
would be useful to have a general concept of lifecycle hooks in torch-sim that allows users to pass arbitrary callback functions. these could:
- perform custom sanity checks every N-th step (N user-defined), based on whatever sim params torch-sim can provide: cell volume, edge count, neighbor list tiling, etc.
- modify sim params on the fly (e.g. reduce timestep if things look unstable)
- abort early with a user-intelligible reason if sim looks unrecoverable
something like:
def my_check(state: SimState) -> bool | str:
    if state.neighbor_tiles > 1000:
        return "cell tiling exceeded 1k, aborting"
    return True  # continue

result = step_func(state, model, ..., callbacks=[my_check])

a general callback mechanism would let:
- users define domain-specific guardrails
- future tools (think custodian for MD) tap into the sim lifecycle
- debugging/logging without modifying torch-sim internals
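
to make the proposal more concrete, here is a minimal, self-contained sketch of what a driver loop honoring the `True` / reason-string return protocol could look like. nothing here is torch-sim API: `ToyState`, `RunResult`, `run_with_callbacks`, `toy_step`, `damp_timestep` and `volume_check` are all hypothetical, the toy dynamics just shrink a volume each step, and treating a returned `False` as an abort is an assumption the proposal does not specify.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Sequence, Union


@dataclass
class ToyState:
    """Hypothetical stand-in for a simulation state (not torch-sim's SimState)."""
    step: int = 0
    dt: float = 1.0
    volume: float = 100.0


# A callback returns True to continue, or a string giving the abort reason.
Callback = Callable[[ToyState], Union[bool, str]]


@dataclass
class RunResult:
    state: ToyState
    aborted: bool
    reason: Optional[str] = None


def run_with_callbacks(
    state: ToyState,
    step_fn: Callable[[ToyState], ToyState],
    callbacks: Sequence[Callback],
    n_steps: int,
    every: int = 1,
) -> RunResult:
    """Advance `state` and run all callbacks every `every`-th step."""
    for _ in range(n_steps):
        state = step_fn(state)
        if state.step % every != 0:
            continue
        for callback in callbacks:
            verdict = callback(state)
            if isinstance(verdict, str):
                return RunResult(state, aborted=True, reason=verdict)
            if verdict is False:  # assumption: False aborts without a custom reason
                name = getattr(callback, "__name__", "callback")
                return RunResult(state, aborted=True, reason=f"{name} returned False")
    return RunResult(state, aborted=False)


def toy_step(state: ToyState) -> ToyState:
    """Toy dynamics: the volume shrinks faster for larger timesteps."""
    return replace(state, step=state.step + 1, volume=state.volume * (1 - 0.2 * state.dt))


def damp_timestep(state: ToyState) -> bool:
    """Modify a sim param on the fly: halve dt once the volume looks unstable."""
    if state.volume < 50.0:
        state.dt *= 0.5
    return True


def volume_check(state: ToyState) -> Union[bool, str]:
    """Guardrail: abort with a readable reason if the volume has collapsed."""
    if state.volume < 10.0:
        return f"volume collapsed to {state.volume:.2f} at step {state.step}, aborting"
    return True


if __name__ == "__main__":
    unguarded = run_with_callbacks(ToyState(), toy_step, [volume_check], n_steps=50)
    print(unguarded.aborted, unguarded.reason)
    damped = run_with_callbacks(ToyState(), toy_step, [damp_timestep, volume_check], n_steps=50)
    print(damped.aborted, damped.state.step)
```

in this toy setup the guardrail alone aborts the run with a readable reason, while adding the timestep-damping callback ahead of it lets the same run finish. callback order matters here, since `damp_timestep` mutates the state before `volume_check` sees it; a real design would need to decide whether callbacks may mutate state or must return a new one.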
tagging @kyonofx who brought this up, briefly discussed this with @abhijeetgangan