Add unit test for saving extra modules #8290

Merged
7 changes: 4 additions & 3 deletions docs/docs/tutorials/saving/index.md
@@ -98,14 +98,15 @@ You can pick the suitable saving approach based on your needs.

### Serializing Imported Modules

-When saving a program with `save_program=True`, you might need to include custom modules that your program depends on.
+When saving a program with `save_program=True`, you might need to include custom modules that your program depends on. This is
+necessary if your program depends on these modules, but at loading time these modules are not imported before calling `dspy.load`.

You can specify which custom modules should be serialized with your program by passing them to the `modules_to_serialize`
parameter when calling `save`. This ensures that any dependencies your program relies on are included during serialization and
available when loading the program later.

-This uses cloudpickle's `cloudpickle.register_pickle_by_value` function in order to register a module as picklable by value. When
-a module is registered this way, cloudpickle will serialize the module by value rather than by reference, ensuring that the
+Under the hood this uses cloudpickle's `cloudpickle.register_pickle_by_value` function to register a module as picklable by value.
+When a module is registered this way, cloudpickle will serialize the module by value rather than by reference, ensuring that the
module contents are preserved with the saved program.

For example, if your program uses custom modules:
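The by-value registration described in the docs change above can be sketched without DSPy. This is a minimal illustration, not DSPy's implementation: the module name `gr_demo_custom_module`, the `Greeter` class, and the helper `roundtrip_without_module` are hypothetical names made up for the example. It assumes `cloudpickle` 2.0 or later, which provides `register_pickle_by_value` and `unregister_pickle_by_value`.

```python
import importlib
import pathlib
import sys
import tempfile

import cloudpickle

# Hypothetical module name, chosen only for this illustration.
MODULE_NAME = "gr_demo_custom_module"


def roundtrip_without_module(by_value: bool) -> str:
    """Pickle an instance, make its module unimportable, then try to unpickle."""
    with tempfile.TemporaryDirectory() as tmp:
        # Write a tiny module to disk so it can be imported like a user module.
        pathlib.Path(tmp, f"{MODULE_NAME}.py").write_text(
            "class Greeter:\n"
            "    def greet(self):\n"
            "        return 'hello'\n"
        )
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(MODULE_NAME)
            if by_value:
                # Embed the module's classes in the payload instead of
                # storing only an import reference to them.
                cloudpickle.register_pickle_by_value(module)
            payload = cloudpickle.dumps(module.Greeter())
            if by_value:
                cloudpickle.unregister_pickle_by_value(module)
        finally:
            # Simulate a fresh process where the module cannot be imported.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)

    try:
        return cloudpickle.loads(payload).greet()
    except ModuleNotFoundError:
        return "ModuleNotFoundError"


if __name__ == "__main__":
    print(roundtrip_without_module(by_value=False))
    print(roundtrip_without_module(by_value=True))
```

With the default by-reference behavior the payload only records where to import `Greeter` from, so loading fails once the module is gone. After registration the class definition travels inside the payload, which is the property `modules_to_serialize` relies on.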
59 changes: 59 additions & 0 deletions tests/primitives/test_module.py
@@ -115,6 +115,65 @@ def dummy_metric(example, pred, trace=None):
    assert new_cot.predict.demos == compiled_cot.predict.demos


def test_save_with_extra_modules(tmp_path):
    import sys

    # Create a temporary Python file with our custom module
    custom_module_path = tmp_path / "custom_module.py"
    with open(custom_module_path, "w") as f:
        f.write("""
import dspy

class MyModule(dspy.Module):
    def __init__(self):
        self.cot = dspy.ChainOfThought(dspy.Signature("q -> a"))

    def forward(self, q):
        return self.cot(q=q)
""")

    # Add the tmp_path to Python path so we can import the module
    sys.path.insert(0, str(tmp_path))
    try:
        import custom_module

        cot = custom_module.MyModule()

        cot.save(tmp_path, save_program=True)
        # Remove the custom module from sys.modules to simulate it not being available
        sys.modules.pop("custom_module", None)
        # Also remove it from sys.path
        sys.path.remove(str(tmp_path))
        del custom_module

        # Test the loading fails without using `modules_to_serialize`
        with pytest.raises(ModuleNotFoundError):
            dspy.load(tmp_path)

        sys.path.insert(0, str(tmp_path))
        import custom_module

        cot.save(
            tmp_path,
            modules_to_serialize=[custom_module],
            save_program=True,
        )

        # Remove the custom module from sys.modules to simulate it not being available
        sys.modules.pop("custom_module", None)
        # Also remove it from sys.path
        sys.path.remove(str(tmp_path))
        del custom_module

        loaded_module = dspy.load(tmp_path)
        assert loaded_module.cot.predict.signature == cot.cot.predict.signature

    finally:
        # Only need to clean up sys.path
        if str(tmp_path) in sys.path:
            sys.path.remove(str(tmp_path))


def test_load_with_version_mismatch(tmp_path):
    from dspy.primitives.module import logger

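The test above inserts and removes `tmp_path` on `sys.path` and pops `custom_module` from `sys.modules` twice by hand. One possible way to express that setup and teardown once is a small context manager. This is only a sketch of an alternative, not part of the PR: `temporarily_importable`, `demo`, and the module name `gr_demo_helper_module` are hypothetical names, and only the standard library is used.

```python
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def temporarily_importable(directory, module_name):
    """Import module_name from directory, then undo the import on exit."""
    path = str(directory)
    sys.path.insert(0, path)
    importlib.invalidate_caches()
    try:
        yield importlib.import_module(module_name)
    finally:
        # Mirror the manual cleanup in the test: forget the module and
        # drop the directory from the import path.
        sys.modules.pop(module_name, None)
        if path in sys.path:
            sys.path.remove(path)


def demo() -> bool:
    """Check that the module is usable inside the block and gone afterwards."""
    name = "gr_demo_helper_module"  # hypothetical module name
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{name}.py").write_text("VALUE = 1\n")
        with temporarily_importable(tmp, name) as module:
            seen = module.VALUE
        cleaned = name not in sys.modules and tmp not in sys.path
    return seen == 1 and cleaned


if __name__ == "__main__":
    print(demo())
```

In the test, each `save` call could then sit inside a `with temporarily_importable(tmp_path, "custom_module") as custom_module:` block, with the matching `dspy.load` call placed after the block so that it runs while the module is unavailable.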