Commit edcdaa0

deploy: 2f837e6
0 parents  commit edcdaa0

173 files changed

Lines changed: 79116 additions & 0 deletions

.nojekyll

Whitespace-only changes.

CNAME

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
nbdev.fast.ai

api/clean.html

Lines changed: 1059 additions & 0 deletions
Large diffs are not rendered by default.

api/clean.html.md

Lines changed: 236 additions & 0 deletions
@@ -0,0 +1,236 @@
1+
# clean
2+
3+
4+
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
5+
6+
To avoid pointless conflicts while working with jupyter notebooks (with
7+
different execution counts or cell metadata), it is recommended to clean
8+
the notebooks before committing anything (done automatically if you
9+
install the git hooks with `nbdev-install-hooks`). The following
10+
functions are used to do that. Cleaning also adds cell `id`s if missing
11+
(required by nbformat 4.5+).
12+
13+
## Trust
14+
15+
------------------------------------------------------------------------
16+
17+
<a
18+
href="https://github.com/AnswerDotAI/nbdev/blob/main/nbdev/clean.py#L25"
19+
target="_blank" style="float:right; font-size:smaller">source</a>
20+
21+
### nbdev_trust
22+
23+
``` python
24+
25+
def nbdev_trust(
26+
fname:str=None, # A notebook name or glob to trust
27+
force_all:bool=False, # Also trust notebooks that haven't changed
28+
):
29+
30+
```
31+
32+
*Trust notebooks matching `fname`.*
33+
34+
## Clean
35+
36+
------------------------------------------------------------------------
37+
38+
<a
39+
href="https://github.com/AnswerDotAI/nbdev/blob/main/nbdev/clean.py#L87"
40+
target="_blank" style="float:right; font-size:smaller">source</a>
41+
42+
### clean_nb
43+
44+
``` python
45+
46+
def clean_nb(
47+
nb, # The notebook to clean
48+
clear_all:bool=False, # Remove all cell metadata and cell outputs?
49+
allowed_metadata_keys:list=None, # Preserve the list of keys in the main notebook metadata
50+
allowed_cell_metadata_keys:list=None, # Preserve the list of keys in cell level metadata
51+
clean_ids:bool=True, # Remove ids from plaintext reprs?
52+
allowed_out_metadata_keys:list=None, # Preserve the list of keys in output metadata
53+
):
54+
55+
```
56+
57+
*Clean `nb` from superfluous metadata*
58+
59+
Jupyter adds a trailing newline (`\n`) to images in cell outputs.
60+
Vscode-jupyter does not.
61+
Notebooks should be brought to a common style to avoid unnecessary
62+
diffs:
63+
64+
``` python
65+
test_nb = read_nb('../../tests/image.ipynb')
66+
assert test_nb.cells[0].outputs[0].data['image/png'][-1] == "\n" # Make sure it was not converted by accident
67+
clean_nb(test_nb)
68+
assert test_nb.cells[0].outputs[0].data['image/png'][-1] != "\n"
69+
```
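
As a rough illustration of the normalization tested above, the sketch below strips trailing newlines from a base64 `image/png` payload so that Jupyter-style and VS Code-style outputs compare equal. `strip_image_newline` and the sample payloads are hypothetical and are not part of nbdev; the real `clean_nb` may handle this differently.

```python
def strip_image_newline(output: dict) -> dict:
    """Remove trailing newlines from an 'image/png' payload, if one is present."""
    data = output.get("data", {})
    png = data.get("image/png")
    if isinstance(png, str):
        data["image/png"] = png.rstrip("\n")
    return output

# Hypothetical payloads: the same image as saved by two different front ends.
jupyter_style = {"data": {"image/png": "iVBORw0KGgo=\n"}}
vscode_style = {"data": {"image/png": "iVBORw0KGgo="}}

strip_image_newline(jupyter_style)
strip_image_newline(vscode_style)
```

After both calls the two dictionaries hold identical payloads, so saving from either front end would not produce a diff on this field.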
70+
71+
The test notebook has metadata in both the main metadata section and
72+
contains cell level metadata in the second cell:
73+
74+
``` python
75+
test_nb = read_nb('../../tests/metadata.ipynb')
76+
77+
assert {'meta', 'jekyll', 'my_extra_key', 'my_removed_key'} <= test_nb.metadata.keys()
78+
assert {'meta', 'hide_input', 'my_extra_cell_key', 'my_removed_cell_key'} == test_nb.cells[1].metadata.keys()
79+
```
80+
81+
After cleaning the notebook, all extra metadata is removed; only some
82+
keys are allowed by default:
83+
84+
``` python
85+
clean_nb(test_nb)
86+
87+
assert {'jekyll', 'kernelspec'} == test_nb.metadata.keys()
88+
assert {'hide_input'} == test_nb.cells[1].metadata.keys()
89+
```
90+
91+
We can preserve some additional keys at the notebook or cell levels:
92+
93+
``` python
94+
test_nb = read_nb('../../tests/metadata.ipynb')
95+
clean_nb(test_nb, allowed_metadata_keys={'my_extra_key'}, allowed_cell_metadata_keys={'my_extra_cell_key'})
96+
97+
assert {'jekyll', 'kernelspec', 'my_extra_key'} == test_nb.metadata.keys()
98+
assert {'hide_input', 'my_extra_cell_key'} == test_nb.cells[1].metadata.keys()
99+
```
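
One way to picture the filtering shown above is as a key whitelist: the default keys plus any extra keys passed in. The sketch below is an assumption about the behaviour, not nbdev's implementation; `keep_allowed` is a hypothetical helper, and the default set is only what the assertions above reveal (nbdev's actual defaults may contain more keys).

```python
# Defaults inferred from the assertions above; the real default list may be longer.
DEFAULT_NB_KEYS = {"jekyll", "kernelspec"}

def keep_allowed(metadata: dict, defaults: set, extra=None) -> dict:
    """Return a copy of `metadata` holding only default or explicitly allowed keys."""
    allowed = defaults | set(extra or ())
    return {k: v for k, v in metadata.items() if k in allowed}

# Made-up notebook metadata mirroring the key names used in the test notebook.
nb_meta = {"meta": 1, "jekyll": {}, "kernelspec": {}, "my_extra_key": 2, "my_removed_key": 3}

cleaned_default = keep_allowed(nb_meta, DEFAULT_NB_KEYS)
cleaned_extra = keep_allowed(nb_meta, DEFAULT_NB_KEYS, extra={"my_extra_key"})
```

Keys that are neither defaults nor explicitly allowed, such as `my_removed_key`, are dropped in both cases.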
100+
101+
Passing `clear_all=True` removes everything from the cell metadata and clears cell outputs:
102+
103+
``` python
104+
test_nb = read_nb('../../tests/metadata.ipynb')
105+
clean_nb(test_nb, clear_all=True)
106+
107+
assert {'jekyll', 'kernelspec'} == test_nb.metadata.keys()
108+
test_eq(test_nb.cells[1].metadata, {})
109+
```
110+
111+
Passing `clean_ids=True` (the default) removes `id`s from plaintext repr outputs, to
112+
avoid notebooks whose contents change on each run since they often lead
113+
to git merge conflicts. For example:
114+
115+
<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at 0x7FB4F8979690>
116+
117+
becomes:
118+
119+
<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28>
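
The transformation above can be approximated with a regular expression that drops the ` at 0x...` memory address from a default repr. This is a minimal sketch: the pattern and the `strip_repr_ids` name are assumptions for illustration, and nbdev's own pattern may differ.

```python
import re

# Hypothetical pattern: the " at 0x..." suffix that default object reprs include.
_ADDR = re.compile(r" at 0x[0-9a-fA-F]+")

def strip_repr_ids(text: str) -> str:
    """Remove memory addresses from plaintext reprs so reruns produce identical text."""
    return _ADDR.sub("", text)

before = "<PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at 0x7FB4F8979690>"
after = strip_repr_ids(before)
```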
120+
121+
*Cell* IDs, on the other hand, are always added if missing:
122+
123+
``` python
124+
test_cell = {'source': 'x=1', 'cell_type': 'code', 'metadata': {}}
125+
_clean_cell(test_cell, False, set(), True, set())
126+
test_cell['id']
127+
```
128+
129+
'88ba1c41'
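
A minimal sketch of the "add an `id` only if missing" behaviour follows. How nbdev actually generates the value is not shown in this document; the random 8-character hex string from `uuid` below is an assumption chosen to resemble the output above, and `ensure_cell_id` is a hypothetical name.

```python
import uuid

def ensure_cell_id(cell: dict) -> dict:
    """Give `cell` an 8-character hex id unless it already has one."""
    if "id" not in cell:
        cell["id"] = uuid.uuid4().hex[:8]
    return cell

new_cell = ensure_cell_id({"source": "x=1", "cell_type": "code", "metadata": {}})
kept_cell = ensure_cell_id({"source": "y=2", "cell_type": "code", "metadata": {}, "id": "keep-me"})
```

Existing ids are left untouched, which matters because rewriting them on every clean would itself create spurious diffs.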
130+
131+
------------------------------------------------------------------------
132+
133+
<a
134+
href="https://github.com/AnswerDotAI/nbdev/blob/main/nbdev/clean.py#L115"
135+
target="_blank" style="float:right; font-size:smaller">source</a>
136+
137+
### process_write
138+
139+
``` python
140+
141+
def process_write(
142+
warn_msg, proc_nb, f_in, f_out:NoneType=None, disp:bool=False
143+
):
144+
145+
```
146+
147+
*Call self as a function.*
148+
149+
------------------------------------------------------------------------
150+
151+
<a
152+
href="https://github.com/AnswerDotAI/nbdev/blob/main/nbdev/clean.py#L139"
153+
target="_blank" style="float:right; font-size:smaller">source</a>
154+
155+
### nbdev_clean
156+
157+
``` python
158+
159+
def nbdev_clean(
160+
fname:str=None, # A notebook name or glob to clean
161+
clear_all:bool=False, # Remove all cell metadata and cell outputs?
162+
disp:bool=False, # Print the cleaned outputs
163+
stdin:bool=False, # Read notebook from input stream
164+
):
165+
166+
```
167+
168+
*Clean all notebooks in `fname` to avoid merge conflicts*
169+
170+
By default (`fname` left as `None`), all the notebooks in
171+
`config.nbs_path` are cleaned. You can opt in to fully clean the
172+
notebook by removing every bit of metadata and the cell outputs by
173+
passing `clear_all=True`.
174+
175+
If you want to keep some keys in the main notebook metadata you can set
176+
`allowed_metadata_keys` in `[tool.nbdev]` in `pyproject.toml`. Similarly
177+
for cell level metadata use `allowed_cell_metadata_keys`, and for output
178+
metadata use `allowed_out_metadata_keys`. For example, to preserve both
179+
`k1` and `k2` at the notebook, cell, and output levels, add the following to
180+
`pyproject.toml`:
181+
182+
``` toml
183+
[tool.nbdev]
184+
allowed_metadata_keys = ["k1", "k2"]
185+
allowed_cell_metadata_keys = ["k1", "k2"]
186+
allowed_out_metadata_keys = ["k1", "k2"]
187+
```
188+
189+
------------------------------------------------------------------------
190+
191+
<a
192+
href="https://github.com/AnswerDotAI/nbdev/blob/main/nbdev/clean.py#L154"
193+
target="_blank" style="float:right; font-size:smaller">source</a>
194+
195+
### clean_jupyter
196+
197+
``` python
198+
199+
def clean_jupyter(
200+
path, model, kwargs:VAR_KEYWORD
201+
):
202+
203+
```
204+
205+
*Clean Jupyter `model` pre save to `path`*
206+
207+
This cleans notebooks on save to avoid unnecessary merge conflicts. The
208+
easiest way to install it for both Jupyter Notebook and Lab is by
209+
running `nbdev-install-hooks`. It works by implementing a
210+
`pre_save_hook` from Jupyter’s [file save hook
211+
API](https://jupyter-server.readthedocs.io/en/latest/developers/savehooks.html).
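
For readers unfamiliar with that API, a pre-save hook is a callable that receives the `model` being saved (plus keyword arguments such as `path`) and may modify it in place. The sketch below is a simplified stand-in, not `clean_jupyter` itself: `strip_execution_counts` is a hypothetical hook that only resets execution counts, and the sample model is made up.

```python
def strip_execution_counts(model, **kwargs):
    """Pre-save hook sketch: reset execution counts on code cells before saving."""
    # Hooks are called for every file type; only notebooks carry cells.
    if model.get("type") != "notebook":
        return
    for cell in model["content"]["cells"]:
        if cell.get("cell_type") != "code":
            continue
        cell["execution_count"] = None
        for out in cell.get("outputs", []):
            if "execution_count" in out:
                out["execution_count"] = None

# Registration would go in jupyter_server_config.py, where Jupyter provides `c`:
# c.FileContentsManager.pre_save_hook = strip_execution_counts

# Made-up model with one executed code cell and one markdown cell.
model = {
    "type": "notebook",
    "content": {"cells": [
        {"cell_type": "code", "execution_count": 7, "source": "1+1",
         "outputs": [{"output_type": "execute_result", "execution_count": 7, "data": {}}]},
        {"cell_type": "markdown", "source": "# Title"},
    ]},
}
strip_execution_counts(model, path="demo.ipynb")
```

Because the hook mutates `model` in place and returns nothing, the notebook written to disk reflects the cleaned state without any extra step from the user.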
212+
213+
## Hooks
214+
215+
------------------------------------------------------------------------
216+
217+
<a
218+
href="https://github.com/AnswerDotAI/nbdev/blob/main/nbdev/clean.py#L195"
219+
target="_blank" style="float:right; font-size:smaller">source</a>
220+
221+
### nbdev_install_hooks
222+
223+
``` python
224+
225+
def nbdev_install_hooks(
226+
227+
):
228+
229+
```
230+
231+
*Install Jupyter and git hooks to automatically clean, trust, and fix
232+
merge conflicts in notebooks*
233+
234+
See
235+
[`clean_jupyter`](https://nbdev.fast.ai/api/clean.html#clean_jupyter)
236+
and `nbdev-merge` for more about how each hook works.
