2 changes: 2 additions & 0 deletions .cursor/rules/python.mdc
@@ -19,6 +19,8 @@ alwaysApply: true
- Every module must have a docstring
- Prefer dict literals `{"foo": "bar"}` to `dict(foo="bar")`
- When defining new constants or literals, store them in src/utils/constants.py
- Use the `Self` type, don't use string literals as types
- Use dataclasses when possible

# Writing a new function

20 changes: 20 additions & 0 deletions docs/notebooks/clean/docking-single-ligand.ipynb
@@ -94,6 +94,16 @@
"sim"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d8bfa9a-c142-4deb-ac3d-d2a4f581df55",
"metadata": {},
"outputs": [],
"source": [
"protein._remote_path"
]
},
{
"cell_type": "markdown",
"id": "e2aa58aa",
@@ -115,6 +125,16 @@
"ligands"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "95f716a7-4a4c-4c23-9d61-130fc3e9a72f",
"metadata": {},
"outputs": [],
"source": [
"ligands.to_smiles()"
]
},
{
"cell_type": "markdown",
"id": "94e19bec-6bee-4dbf-9c47-1490c41fdbd0",
123 changes: 123 additions & 0 deletions protein-dock-refactor.md
@@ -0,0 +1,123 @@
# Implementation Status vs. Agreed-On Plan

## Desired API interface for running `protein.dock()` with quoting

```Python
# just quote a function (single ligand)
results = protein.dock(ligand=ligand, quote=True)

results.data # None
results.estimate # object -- includes dollar estimate, free actions, etc.


# just quote a function (many ligands)
#
# results.poses <--- an array across all ligands
# results.estimate <--- includes all the ligands
results = protein.dock(ligands=ligands, quote=True)

results.data # None
results.estimate # object -- should be an estimate for the total of all ligands

# just run the function (single ligand)
results = protein.dock(ligand=ligand, quote=False)

results.data # contains a LigandSet of poses
results.cost # object

# just run the function (many ligands)
results = protein.dock(ligands=ligands, quote=False)

results.data # a list across all ligands -- has poses from all ligands
results.cost # a single cost, across all ligands.


results = protein.find_pockets(pocket_count=1, quote=True)
results.data # None
results.estimate # object -- cost estimate for the pocket-finding call

results = protein.find_pockets(pocket_count=1, quote=False)
results.data # a list of pockets
results.cost # a single cost

# i want to use up to $100 (one ligand)
results = protein.dock(ligand=ligand, max_cost=Cost(100))

# i want to use up to $100 (many ligands)
# internally, what happens is that we divide the max cost by the
# number of ligands, and make individual function calls using the computed
# per-ligand max cost
results = protein.dock(ligands=ligands, max_cost=Cost(100))

# i only want to use 1 free action (single ligand)
results = protein.dock(ligand=ligand, max_cost=Cost(free_actions=1))

# i only want to use 5 free actions (5 ligands)
# internally it checks if the number of ligands is <= the number of free actions
# and if so, allows it
results = protein.dock(ligands=ligands, max_cost=Cost(free_actions=5))

# use 1 free action and $100
# this cannot be supported -- each function call is independent.
# the platform API doesn't allow this OR logic -- if we want to do it,
# we'll have to do it client side
results = protein.dock(ligand=ligand, max_cost=Cost(100, {"DO_DOCK": 1}))
```
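
The per-ligand budgeting described in the comments above could look roughly like the sketch below. It is only an illustration: the plan does not define the `Cost` constructor (it shows `Cost(100)`, `Cost(free_actions=1)`, and `Cost({"DO_DOCK": 5})`), so the field names `dollars` and `free_actions`, the helper `per_ligand_max_cost`, and the choice to reject mixed limits client-side are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cost:
    """Hypothetical cost limit: a dollar cap and/or free-action counts per action."""

    dollars: float = 0.0
    free_actions: dict[str, int] = field(default_factory=dict)


def per_ligand_max_cost(max_cost: Cost, n_ligands: int, action: str = "DO_DOCK") -> Cost:
    """Split a total limit into the limit used for each single-ligand call."""
    if n_ligands < 1:
        raise ValueError("need at least one ligand")
    if max_cost.dollars and max_cost.free_actions:
        # Assumption: mixed dollar + free-action limits are rejected up front,
        # since each function call is independent.
        raise ValueError("mixed dollar and free-action limits are not supported")
    if max_cost.free_actions:
        available = max_cost.free_actions.get(action, 0)
        # Allowed only if every ligand can consume one free action.
        if n_ligands > available:
            raise ValueError(f"{n_ligands} ligands but only {available} free actions")
        return Cost(free_actions={action: 1})
    # Dollar cap: divide the total evenly across ligands.
    return Cost(dollars=max_cost.dollars / n_ligands)
```

With these assumptions, a $100 cap over four ligands becomes a $25 cap per call, and five free actions cover at most five ligands.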

## What's NOT implemented (vs. the agreed-on plan)

1. Cost class — Does NOT exist
The proposal describes:
results = protein.dock(ligand=ligand, max_cost=Cost(100))
results = protein.dock(ligand=ligand, max_cost=Cost({"DO_DOCK": 5}))
results = protein.dock(ligand=ligand, max_cost=Cost(100, {"DO_DOCK": 1}))
There is no Cost class anywhere in the codebase. No max_cost parameter on any method.

2. max_cost parameter — Does NOT exist
None of the run() methods accept a max_cost parameter. The only cost control is through approve_amount: Optional[int] (on Docking/ABFE), which is a raw integer dollar amount passed directly as approveAmount in
the API payload. There's no support for action-based cost limits like Cost({"DO_DOCK": 5}).

3. results.estimate / results.cost properties — Do NOT exist on the result object
The proposal describes:
results = protein.dock(ligand=ligand, quote=True)
results.estimate # object -- includes dollar estimate, free actions, etc.
The current implementation returns a JobList (for Docking/ABFE tools) or raw dict (for functions like pocket finder). There is no .estimate or .cost property on Job/JobList. The cost/quote data is buried inside
job._attributes["quotationResult"] and is only surfaced in the HTML visualization widget, not as a clean Python property.

4. results.poses returning None when quote=True — Not applicable
Docking.run(quote=True) returns a JobList of "Quoted" jobs. To get poses you'd call docking.get_results() separately — the proposal's results.poses pattern (where a single return object has both .poses and
.estimate) is not implemented.

5. protein.dock() doesn't support quoting — Different API surface
The Protein.dock() method (protein.py:271-376) is a local/synchronous docking function that returns a LigandSet directly. It does not have quote or max_cost parameters and doesn't use the billing system at all.
The quote=True flow only exists on the Docking class (accessible via complex.docking.run(quote=True)), not on protein.dock().
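
Gaps 3 and 4 both come down to the lack of a single return object. A minimal sketch of one is shown here, assuming the quote lives under `quotationResult` and the charge under `billingTransaction` (the two keys named in this note). The class name `DockResult` and the helper `result_from_attributes` are hypothetical and do not exist in the codebase.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DockResult:
    """Hypothetical unified result: data plus either an estimate or a cost."""

    data: Optional[Any] = None
    estimate: Optional[dict[str, Any]] = None
    cost: Optional[dict[str, Any]] = None


def result_from_attributes(
    attributes: dict[str, Any], *, quote: bool, data: Any = None
) -> DockResult:
    """Build a DockResult from a job's raw attributes dict."""
    if quote:
        # Quoted runs carry no data, only the estimate.
        return DockResult(estimate=attributes.get("quotationResult"))
    # Executed runs carry data and the billed cost (assumed key).
    return DockResult(data=data, cost=attributes.get("billingTransaction"))
```

A quoted call would then return an object whose `.data` is `None` and whose `.estimate` is populated, matching the interface in the first section.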


---
## Summary Table

| Agreed-on plan feature | Status | Notes |
| --- | --- | --- |
| quote=True parameter | Implemented | On Docking, ABFE, pocket finder, protonation |
| results.estimate property | Not implemented | Cost data only in _attributes["quotationResult"], rendered in widget |
| results.poses / results.cost | Not implemented | No unified result object with both data and cost |
| job.confirm() | Implemented | Works on Job and JobList |
| job.cancel() | Implemented | Works on Job and JobList |
| max_cost=Cost(100) | Not implemented | No Cost class, no max_cost param |
| max_cost=Cost({"DO_DOCK": 5}) | Not implemented | No action-based cost limits |
| max_cost=Cost(100, {"DO_DOCK": 1}) | Not implemented | No mixed cost limits |
| Billing in Job widget | Implemented | Billing tab with quotationResult + billingTransaction |
| protein.dock(quote=...) | Not implemented | protein.dock() is a different, local API without billing |

The core quoting flow (quote=True → Quoted status → confirm()) is fully working for Docking and ABFE. The main gaps are: no Cost class, no max_cost parameter, no .estimate/.cost properties on the result, and
91 changes: 91 additions & 0 deletions scripts/capture_docking_fixture.py
@@ -0,0 +1,91 @@
"""Temporary script to capture docking API responses as fixtures.

Runs the same docking call as test_docking_lv2 against dev with record=True,
then copies the downloaded SDF file into the mock server's fixtures/files directory.
"""

import os
from pathlib import Path
import shutil

from deeporigin.drug_discovery import BRD_DATA_DIR, Ligand, Pocket, Protein
from deeporigin.platform.client import DeepOriginClient


def main():
    """Capture docking fixtures from the dev environment."""
    os.environ["DEEPORIGIN_ENV"] = "dev"

    FIXTURES_DIR = Path(__file__).parent.parent / "tests" / "fixtures"

    client = DeepOriginClient.get(record=True, replace=True)

    protein = Protein.from_file(BRD_DATA_DIR / "brd.pdb")
    protein.remove_water()
    pocket = Pocket.from_pdb_file(
        FIXTURES_DIR / "pockets" / "brd_pocket_1.pdb", name="brd_pocket_1"
    )

    ligand = Ligand.from_smiles(
        "Fc1c(-c2cccc3ccccc23)ncc2c(N3C[C@H]4CC[C@@H](C3)N4)nc(OCC34CCCN3CCC4)nc12"
    )

    result = protein.dock(
        ligand=ligand,
        pocket=pocket,
        quote=False,
        use_cache=False,
    )

    print(f"Result: {result}")
    print(f"Result data: {result.data}")
    print(f"Result cost: {result.cost}")

    from deeporigin.utils.core import hash_dict

    payload = {
        "protein_path": protein._remote_path,
        "ligand_smiles": ligand.smiles,
        "box_size": [20.0, 20.0, 20.0],
        "pocket_center": pocket.get_center().tolist(),
    }
    cache_hash = hash_dict(payload)
    print(f"Payload: {payload}")
    print(f"Cache hash: {cache_hash}")

    normalized_body = {"inputs": payload}
    body_hash = hash_dict(normalized_body)
    print(f"Body hash (for fixture lookup): {body_hash}")

    fixture_json = (
        FIXTURES_DIR / "function-runs" / "deeporigin.docking" / f"{body_hash}.json"
    )
    print(f"Fixture JSON exists: {fixture_json.exists()}")
    print(f"Fixture JSON path: {fixture_json}")

    import json

    if fixture_json.exists():
        with open(fixture_json) as f:
            response_data = json.load(f)
        sdf_remote_path = response_data.get("functionOutputs", {}).get("sdf_path")
        if sdf_remote_path:
            print(f"SDF remote path from fixture: {sdf_remote_path}")
            sdf_fixture_path = FIXTURES_DIR / "files" / sdf_remote_path
            sdf_fixture_path.parent.mkdir(parents=True, exist_ok=True)

            from deeporigin.utils.core import _ensure_do_folder

            local_sdf = str(Path(_ensure_do_folder() / "docking") / f"{cache_hash}.sdf")
            print(f"Local SDF path: {local_sdf}")
            if os.path.exists(local_sdf):
                shutil.copy2(local_sdf, sdf_fixture_path)
                print(f"Copied SDF to fixture: {sdf_fixture_path}")
            else:
                print(f"Local SDF not found at {local_sdf}")
    else:
        print("Fixture JSON was NOT created by record mode")


if __name__ == "__main__":
    main()
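
The script above locates the recorded fixture by hashing the request body, so the hash must not depend on key order. The real `hash_dict` in `deeporigin.utils.core` is not shown in this diff and may work differently. As a rough sketch of the idea, assuming canonical JSON plus SHA-256 (the name `stable_hash` is hypothetical):

```python
import hashlib
import json


def stable_hash(payload: dict) -> str:
    """Hash a JSON-serializable dict independently of key insertion order."""
    # sort_keys gives a canonical key order; compact separators avoid
    # whitespace differences changing the digest.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this scheme, wrapping the payload as `{"inputs": payload}` yields a different digest from the bare payload, which is consistent with the script computing a cache hash and a body hash separately.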
54 changes: 54 additions & 0 deletions scripts/capture_docking_quote_lv1_fixture.py
@@ -0,0 +1,54 @@
"""Temporary script to capture the single-ligand quote=True docking fixture."""

import os
from pathlib import Path

from deeporigin.drug_discovery import BRD_DATA_DIR, Pocket, Protein
from deeporigin.drug_discovery.structures import Ligand
from deeporigin.platform.client import DeepOriginClient
from deeporigin.utils.core import hash_dict

FIXTURES_DIR = Path(__file__).parent.parent / "tests" / "fixtures"


def main() -> None:
    """Capture the fixture."""
    os.environ["DEEPORIGIN_ENV"] = "dev"
    client = DeepOriginClient.get(record=True, replace=True)

    protein = Protein.from_file(BRD_DATA_DIR / "brd.pdb")
    protein.remove_water()
    pocket = Pocket.from_pdb_file(
        FIXTURES_DIR / "pockets" / "brd_pocket_1.pdb", name="brd_pocket_1"
    )

    ligand = Ligand.from_smiles(
        "Fc1c(-c2cccc3ccccc23)ncc2c(N3C[C@H]4CC[C@@H](C3)N4)nc(OCC34CCCN3CCC4)nc12"
    )

    result = protein.dock(
        ligand=ligand,
        pocket=pocket,
        quote=True,
        use_cache=False,
        client=client,
    )

    print(f"Result: {result}")
    print(f"Estimate: {result.estimate}")

    payload = {
        "protein_path": protein._remote_path,
        "ligand_smiles": ligand.smiles,
        "box_size": [20.0, 20.0, 20.0],
        "pocket_center": pocket.get_center().tolist(),
    }
    body_hash = hash_dict({"inputs": payload, "approveAmount": 0})
    fixture = (
        FIXTURES_DIR / "function-runs" / "deeporigin.docking" / f"{body_hash}.json"
    )
    print(f"Fixture: {fixture.name} exists={fixture.exists()}")


if __name__ == "__main__":
    main()