-
Notifications
You must be signed in to change notification settings - Fork 25
Support importing (some) virtual sites in Interchange.from_openmm
#1081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1081 +/- ##
==========================================
+ Coverage 93.66% 93.72% +0.05%
==========================================
Files 71 72 +1
Lines 6095 6216 +121
==========================================
+ Hits 5709 5826 +117
- Misses 386 390 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this on a hand-built OpenMM Simulation featuring a single water molecule with made up parameters, and also on a single Tip4P molecule. Things mostly worked, and errors were mostly clear. I do have a few notes:
export INTERCHANGE_EXPERIMENTAL=1
seems to be required, but this does not seem to be documented and the error message doesn't make this clear. This was the only time I was confused by an error message.- At the moment, the number of atoms in the imported topology must match the number of particles in the system. I think we should consider if topologies matching the number of real atoms in the system should be accepted as well. This would allow OpenFF
Topology
objects to work when importing systems with supported virtual sites, as they currently do for systems without virtual sites. This makes particular sense given that it appears that virtual sites are stripped out of the OpenFF topology that's available frominterchange.topology
. - The exported OpenMM topology via
to_openmm_simulation()
had its virtual sites in a separate chain and residue - I think it'd be better if they were in the same chain and residue as the parent atom. - This is unrelated to the PR, but I tried a hand-built system without bonds (single water molecule with constraints, but no bonds force), and Interchange complained:
136 # TODO: Does this run through the Interchange.box validator?
137 interchange.box = _box_vectors
--> 139 if interchange.topology.n_bonds > len(interchange.collections["Bonds"].key_map):
140 # There are probably missing (physics) bonds from rigid waters. The topological
141 # bonds are probably processed correctly.
142 _fill_in_rigid_water_bonds(interchange)
144 return interchange
KeyError: 'Bonds'
Since this code currently attempts to account for bond parameters missing from the bond parameters, maybe we should account for this possibility?
A full review for such a big patch is gonna take me some time to work through, especially since I'm gonna need to get my head around how everything works, but those are my notes from some independent testing.
But yeah this is great! Things seem to be working pretty robustly once you come up with a system and topology that works - I was able to swap the topology of simple molecules out for a "real" openff topology after the Interchange was constructed and things kept working. Round trips work, energy minimization works from Interchange land, and visualizations work perfectly.
topology._molecule_virtual_site_map = defaultdict(list) | ||
|
||
# TODO: This iteration strategy scales horribly with system size - need to refactor - | ||
# since looking up topology atom indices is slow. It's probably repetitive to | ||
# look up the molecule index over and over again | ||
for particle in virtual_sites: | ||
molecule_index = topology.molecule_index(topology.atom(particle.orientation_atom_indices[0]).molecule) | ||
|
||
topology._molecule_virtual_site_map[molecule_index].append(particle) | ||
|
||
topology._particle_map = openmm_openff_particle_map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be reasonable to squish topology._particle_map
and topology._molecule_virtual_site_map
into atom metadata rather than monkey patching? I think the answer is probably "no", and I see this information is moved to the Interchange
, and I've tested that Interchange.topology
seems to be able to be swapped out for a true topology (with full molecular graphs etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly; I don't love the current design but it has the benefit of not interacting with anything the user might do in the public API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not mistaken, I think you could make the return type of this function something like -> tuple[Topology, defaultdict, dict]
, make these attributes on Interchange
that get set in the parent function from that return value, and then pass them in as additional arguments to _convert_vsites()
. I think that covers all the information flow needed? Just need to remove some .topology
s from a few other places.
Thanks for the review
This shouldn't be the case in the 0.4.x line, possibly your local setup is out of date. I've removed some obsolete stuff in tests now that
Agree, but something for future work. I want to get this functionality over the finish line in its current scope as soon as I can
Right, this was the result of #1051
Probably yeah, topological edges being defined different ways is pretty frustrating to work with |
Co-authored-by: Josh A. Mitchell <[email protected]>
I think this is ready for another pass;
My hope is that this is more or less good to go in its current state, but if there are more changes needed before it finds a way into a release that's fine. OpenFE is a key stakeholder for this feature but I don't think there's significant time pressure from them. I think everything else is either out of the original scope or something I forgot! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an enormous amount of code that lays the basis for a lot of future work and the fact that it works reflects a pretty amazing effort!
I checked this with a notebook I wrote. First, it creates a single "water" molecule topology and system by hand and imports that, then round trip exports it. Then I created a small box of OPC water (because it uses a 3 particle average virtual site) with Modeller, imported that, and roundtripped it. Notebook here - just rename to .ipynb
: openmm_vsite_import.txt
Here's what I found:
- The modeller-created system worked perfectly and was easy to load, visualize, and export. This is probably the most fair and relevant test, and I imagine should represent the vast majority of OpenMM system use.
- My hand-built system imported and visualized fine, but when I roundtripped it back to OpenMM the resulting system didn't have any particles. I wonder if this had something to do with the fact that it didn't have any forces. This was really not a fair test or something anyone would ever do, but hey if that's worth fixing then there it is.
- The
topology
argument ofInterchange.from_openmm
is optional (has a default value, hasNone
as a possible type, and is described as optional in the docstring), but a value ofNone
unconditionally raises an error - this should just be a required argument. Not a blocker. - An OpenFF topology can't be used, because it can't contain vsites and therefore has the wrong number of atoms - I mentioned this in my previous review and I think it got discussed as if I was talking about positions, but specifically I think that a topology that matches the number of real atoms should be accepted. This allows users to include complete chemical information instead of leaning on
_SimpleMolecule
s etc. Not a blocker here but definitely a nice to have.
Notes from looking at the code:
LocalCoordinatesSite
is in_SUPPORTED_VIRTUAL_SITE_TYPES
, but is not supported - I've added a note on the code below- I have some notes on the docs page
- Users can currently provide OpenFF topologies when importing OpenMM topologies - I've used this in vignettes to demonstrate how users can take advantage of having full chemical information even when constructing an interchange from OpenMM. We should not regress this by having OpenFF topologies raise an
AttributeError
, even if we do not support this use case for systems with virtual sites. More details below.
All that said, I think this is just about ready to go and most of the above feedback can be addressed in future PRs. The three blockers I've identified so far are the suggestions I made to edges.md
, removing LocalCoordinatesSite
from _SUPPORTED_VIRTUAL_SITE_TYPES
, and addressing the regression with OpenFF topologies.
I haven't looked in great detail at the guts of the collection or export code, and I'm happy to do that tomorrow - my eyes are just bouncing off it right now and its the end of my day. So I'll finish this off tomorrow - I expect to approve it, but such a detailed PR deserves a full review.
topology._molecule_virtual_site_map = defaultdict(list) | ||
topology._particle_map = {index: index for index in range(topology.n_atoms)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the fact that this is now necessary a regression? from_openmm
used to support giving an OpenFF topology so that users could specify/record the precise chemical identity of stuff, even when it came from OpenMM-land. It seems bad that they now need to invoke this arcana for that to work or else get a cryptic message - I think topologies that lack these attributes should continue to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've decided to make this change - even though it's probably a regression, this behavior was never tested. I'd like to come back to it and add tests / fix anything that this breaks. And I can hardly add more to this PR in its current state so I must do it later
topology._molecule_virtual_site_map = defaultdict(list) | ||
|
||
# TODO: This iteration strategy scales horribly with system size - need to refactor - | ||
# since looking up topology atom indices is slow. It's probably repetitive to | ||
# look up the molecule index over and over again | ||
for particle in virtual_sites: | ||
molecule_index = topology.molecule_index(topology.atom(particle.orientation_atom_indices[0]).molecule) | ||
|
||
topology._molecule_virtual_site_map[molecule_index].append(particle) | ||
|
||
topology._particle_map = openmm_openff_particle_map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not mistaken, I think you could make the return type of this function something like -> tuple[Topology, defaultdict, dict]
, make these attributes on Interchange
that get set in the parent function from that return value, and then pass them in as additional arguments to _convert_vsites()
. I think that covers all the information flow needed? Just need to remove some .topology
s from a few other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internal code all looks good! I have suggested revisions, mostly to user-facing code and docs, but only the three points I raised in my previous comments are what I'd consider blocking, and I don't think any of them should need another round of review. Thanks for this feature, it opens the way for exciting things!
openff/interchange/_tests/unit_tests/interop/openmm/_import/test_virtual_sites.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Josh A. Mitchell <[email protected]>
Description
Resolves #1063