
Commit 53a724e (1 parent: 5cf320e)

Updates h_diskcache README

It had some typos, and otherwise needed a little more explanation of what this adapter is doing.

1 file changed: examples/cache_hook/README.md (+14 −5 lines)
@@ -1,5 +1,9 @@
 # Cache hook
-This hook uses the [diskcache](https://grantjenks.com/docs/diskcache/tutorial.html) to cache node execution on disk. The cache key is a tuple of the function's `(source code, input a, ..., input n)`.
+This hook uses the [diskcache](https://grantjenks.com/docs/diskcache/tutorial.html) to cache node execution on disk. The cache key is a tuple of the function's
+`(source code, input a, ..., input n)`. This means, a function will only be executed once for a given set of inputs,
+and source code hash. The cache is stored in a directory of your choice, and it can be shared across different runs of your
+code. That way as you develop, if the inputs and the code haven't changed, the function will not be executed again and
+instead the result will be retrieved from the cache.
 
 > 💡 This can be a great tool for developing inside a Jupyter notebook or other interactive environments.
@@ -8,9 +12,13 @@ Disk cache has great features to:
 - set automated eviction policy once maximum size is reached
 - allow custom `Disk` implementations to change the serialization protocol (e.g., pickle, JSON)
 
-> ⚠ The default `Disk` serializes objects using the `pickle` module. Changing Python or library versions could break your cache (both keys and values). Learn more about [caveats](https://grantjenks.com/docs/diskcache/tutorial.html#caveats).
+> ⚠ The default `Disk` serializes objects using the `pickle` module. Changing Python or library versions could break your
+> cache (both keys and values). Learn more about [caveats](https://grantjenks.com/docs/diskcache/tutorial.html#caveats).
 
-> ❓ To store artifacts robustly, please use Hamilton materializers or the [CachingGraphAdapter](https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/caching_nodes) instead. The `CachingGraphAdapter` stores tagged nodes directly on the file system using common formats (JSON, CSV, Parquet, etc.). However, it isn't aware of your function version and requires you to manually manage your disk space.
+> ❓ To store artifacts robustly, please use Hamilton materializers or the
+> [CachingGraphAdapter](https://github.com/DAGWorks-Inc/hamilton/tree/main/examples/caching_nodes) instead.
+> The `CachingGraphAdapter` stores tagged nodes directly on the file system using common formats (JSON, CSV, Parquet, etc.).
+> However, it isn't aware of your function version and requires you to manually manage your disk space.
 
 
 # How to use it
@@ -42,7 +50,8 @@ logger.addHandler(logging.StreamHandler())
 - DEBUG will return inputs for each node and specify if the value is `from cache` or `executed`
 
 ## Clear cache
-The utility function `h_diskcache.evict_except_driver` allows you to clear cached values for all nodes except those in the passed driver. This is an efficient tool to clear old artifacts as your project evolves.
+The utility function `h_diskcache.evict_all_except_driver` allows you to clear cached values for all nodes except those in the passed driver.
+This is an efficient tool to clear old artifacts as your project evolves.
 
 ```python
 from hamilton import driver
@@ -55,7 +64,7 @@ dr = (
     .with_adapters(h_diskcache.CacheHook())
     .build()
 )
-h_diskcache_evict_except_driver(dr)
+h_diskcache.evict_all_except_driver(dr)
 ```
 
 ## Cache settings
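The renamed helper, `h_diskcache.evict_all_except_driver`, keeps only the cached entries belonging to the passed driver's nodes. A rough stdlib sketch of that semantics, with a plain dict standing in for the on-disk cache and `(node_name, inputs)` tuples as illustrative keys (a hypothetical key layout, not the library's actual one):

```python
def evict_all_except(cache: dict, keep_nodes: set) -> int:
    """Delete cached entries whose node is not in keep_nodes; return how many were evicted."""
    # Collect keys first so we don't mutate the dict while iterating over it.
    stale_keys = [key for key in cache if key[0] not in keep_nodes]
    for key in stale_keys:
        del cache[key]
    return len(stale_keys)
```

As a project evolves and nodes are renamed or removed, entries for nodes that no longer appear in any driver accumulate; an eviction pass like this reclaims that space.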
