Commit a9378a9

Update documentation after fsspec rewrite (#30)
Updated documentation, including example notebooks. Added `CerealReader` and `CerealWriter` protocols to public API (to help with typing).
1 parent fa35bf7 commit a9378a9
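
The commit message mentions `CerealReader` and `CerealWriter` protocols but the visible diff does not show their definitions. As a rough sketch of how such typing protocols could look, assuming they mirror the `(fs, path)` reader and `(obj, fs, path)` writer signatures used in the updated README, the snippet below uses `typing.Protocol`. The names `ReaderSketch`, `WriterSketch` and `DictFS` are hypothetical and are not the package's actual API; `DictFS` is a toy stand-in for an `fsspec` file system so the example runs with the standard library only.

```python
from typing import Any, Dict, Protocol, TypeVar

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


class ReaderSketch(Protocol[T_co]):
    """Hypothetical reader protocol: (fs, path) -> object."""

    def __call__(self, fs: Any, path: str) -> T_co: ...


class WriterSketch(Protocol[T_contra]):
    """Hypothetical writer protocol: (obj, fs, path) -> None."""

    def __call__(self, obj: T_contra, fs: Any, path: str) -> None: ...


class DictFS:
    """Toy in-memory stand-in for an fsspec file system (assumption)."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def read_text(self, path: str) -> str:
        return self.files[path]

    def write_text(self, path: str, value: str) -> None:
        self.files[path] = value


def read_str(fs: Any, path: str) -> str:
    """Reader with the (fs, path) shape."""
    return fs.read_text(path)


def write_str(obj: str, fs: Any, path: str) -> None:
    """Writer with the (obj, fs, path) shape."""
    fs.write_text(path, obj)


# A static type checker verifies these assignments structurally.
reader: ReaderSketch[str] = read_str
writer: WriterSketch[str] = write_str

fs = DictFS()
writer("hello", fs, "/demo.txt")
restored = reader(fs, "/demo.txt")
```

Protocols of this kind only describe call signatures, so plain functions satisfy them without inheriting from anything.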

10 files changed

Lines changed: 175 additions & 81 deletions

README.md

Lines changed: 17 additions & 18 deletions
@@ -10,24 +10,29 @@ brings many changes and improvements, including a
 
 This package, `pydantic-cereal`, is a small extension package that enables users to serialize Pydantic
 models with "arbitrary" (non-JSON-friendly) types to "arbitrary" file-system-like locations.
-It uses [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/) and
-[`universal_pathlib`](https://pypi.org/project/universal-pathlib/) to support generic file systems.
+It uses [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/) to support generic file systems.
 Writing a custom writer (serializer) and reader (loader) with `fsspec` URIs is quite straightforward.
+You can also use [`universal-pathlib`](https://pypi.org/project/universal-pathlib/)'s
+`UPath` with `pydantic-cereal`.
 
-See the [full documentation here](https://pydantic-cereal.readthedocs.io/).
+📘 See the [full documentation here](https://pydantic-cereal.readthedocs.io/). 📘
 
 ## Usage Example
 
-For most uses, you only need the `Cereal` class and `fsspec` or `upath`.
+See the [minimal pure-Python example](./docs/examples/minimal.ipynb) to learn how to wrap your own type.
+Below is a preview of this example.
 
 ```python
-from upath import UPath  # based on `fsspec`, used for `pathlib.Path`-like interface
+from fsspec import AbstractFileSystem
 from pydantic import BaseModel, ConfigDict
+
 from pydantic_cereal import Cereal
 
 cereal = Cereal()  # This is a global variable
 
 
+# Create and "register" a custom type
+
 class MyType(object):
     """My custom type, which isn't a Pydantic model."""
 
@@ -38,29 +43,24 @@ class MyType(object):
         return f"MyType({self.value})"
 
 
-# Create reader and writer from an fsspec URI
-
-def my_reader(uri: str) -> MyType:
+def my_reader(fs: AbstractFileSystem, path: str) -> MyType:
     """Read a MyType from an fsspec URI."""
-    return MyType(value=UPath(uri).read_text())
+    return MyType(value=fs.read_text(path))  # type: ignore
 
 
-def my_writer(obj: MyType, uri: str) -> None:
+def my_writer(obj: MyType, fs: AbstractFileSystem, path: str) -> None:
     """Write a MyType object to an fsspec URI."""
-    UPath(uri).write_text(obj.value)
-
+    fs.write_text(path, obj.value)
 
-# "Register" this type with pydantic-cereal
 MyWrappedType = cereal.wrap_type(MyType, reader=my_reader, writer=my_writer)
-# NOTE: Your type isn't modified, we just apply `Annotated` with a custom serializer and validator
 
 
-# Use the wrapped type as the fields of your Pydantic model
+# Use type within Pydantic model
 
 class MyModel(BaseModel):
     """My custom Pydantic model."""
 
-    model_config = ConfigDict(arbitrary_types_allowed=True)  # Pydantic configuration
+    config = ConfigDict(arbitrary_types_allowed=True)  # Pydantic configuration
     fld: MyWrappedType
 
 
@@ -76,5 +76,4 @@ assert isinstance(obj, MyModel)
 assert isinstance(obj.fld, MyType)
 ```
 
-For more detailed discussion, see the [minimal pure-Python example](./docs/examples/minimal.ipynb) or
-the [Pandas dataframe example](./docs/examples/pandas.ipynb)
+For wrapping 3rd-party libraries, see the [Pandas dataframe example](./docs/examples/pandas.ipynb).
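
The README hunks above show the new reader and writer signatures but skip the lines in between. As a minimal sketch of how the `(fs, path)` reader and `(obj, fs, path)` writer behave on their own, the snippet below exercises them against `fsspec`'s in-memory file system without involving `pydantic-cereal` at all. The `MyType` body is copied from the notebook diff in this commit; the path `/reader_writer_demo/value.txt` is a hypothetical name chosen for illustration.

```python
from fsspec import AbstractFileSystem
from fsspec.implementations.memory import MemoryFileSystem


class MyType:
    """Custom type that simply wraps a string."""

    def __init__(self, value: str):
        self.value = str(value)

    def __repr__(self) -> str:
        return f"MyType({self.value})"


def my_reader(fs: AbstractFileSystem, path: str) -> MyType:
    """Read a MyType from a path within the given file system."""
    return MyType(value=fs.read_text(path))


def my_writer(obj: MyType, fs: AbstractFileSystem, path: str) -> None:
    """Write a MyType object to a path within the given file system."""
    fs.write_text(path, obj.value)


fs = MemoryFileSystem()
path = "/reader_writer_demo/value.txt"  # hypothetical path, illustration only

# Round trip: write the object, then read it back from the same location.
my_writer(MyType("my_field"), fs, path)
restored = my_reader(fs, path)
```

Because both functions receive the file system object explicitly, the same pair should work with any other `fsspec` implementation that supports `read_text` and `write_text`.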

docs/examples/minimal.ipynb

Lines changed: 58 additions & 20 deletions
@@ -21,10 +21,9 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "\"\"\"Minimal imports.\"\"\"\n",
+    "\"\"\"Minimal imports to get started.\"\"\"\n",
     "\n",
-    "from typing import NewType\n",
-    "from upath import UPath\n",
+    "from fsspec import AbstractFileSystem\n",
     "from pydantic import BaseModel, ConfigDict\n",
     "from pydantic_cereal import Cereal\n",
     "\n",
@@ -35,7 +34,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We have a custom type `MyType`, in this case it's just an alias for `str`:"
+    "We have a custom type `MyType`, in this case it's just a wrapper for a `str`:"
    ]
   },
   {
@@ -44,15 +43,29 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "MyType = NewType(\"MyType\", str)  # actually `str`, but type checker has special semantics "
+    "class MyType(object):\n",
+    "    \"\"\"My custom type, which isn't a Pydantic model.\"\"\"\n",
+    "\n",
+    "    def __init__(self, value: str):\n",
+    "        \"\"\"Initialize the object.\"\"\"\n",
+    "        self.value = str(value)\n",
+    "\n",
+    "    def __repr__(self) -> str:\n",
+    "        \"\"\"Represent the string.\"\"\"\n",
+    "        return f\"MyType({self.value})\"\n",
+    "\n",
+    "    def __str__(self) -> str:\n",
+    "        \"\"\"Return the internal string.\"\"\"\n",
+    "        return str(self.value)"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "We must add reader and writer classes for it. \n",
-    "These must accept [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/) URIs as inputs.\n",
+    "These must accept [`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/) file system\n",
+    "and path (within that filesystem) as inputs.\n",
     "We can register these with our `cereal` object by creating a wrapped (`Annotated`) type."
    ]
   },
@@ -62,13 +75,20 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "def my_reader(uri: str) -> MyType:\n",
-    "    \"\"\"Read the object from an fsspec URI.\"\"\"\n",
-    "    return MyType(UPath(uri).read_text())\n",
+    "def my_reader(fs: AbstractFileSystem, path: str) -> MyType:\n",
+    "    \"\"\"Read a MyType from an fsspec URI.\"\"\"\n",
+    "    res = fs.read_text(path)\n",
+    "    if isinstance(res, bytes):\n",
+    "        res = res.decode(\"utf8\")\n",
+    "    else:\n",
+    "        res = str(res)\n",
+    "    return MyType(value=res)\n",
+    "\n",
+    "\n",
+    "def my_writer(obj: MyType, fs: AbstractFileSystem, path: str) -> None:\n",
+    "    \"\"\"Write a MyType object to an fsspec URI.\"\"\"\n",
+    "    fs.write_text(path, obj.value)\n",
     "\n",
-    "def my_writer(obj: MyType, uri: str) -> None:\n",
-    "    \"\"\"Write the object to an fsspec URI.\"\"\"\n",
-    "    UPath(uri).write_text(obj)\n",
     "\n",
     "MyWrappedType = cereal.wrap_type(MyType, reader=my_reader, writer=my_writer)"
    ]
@@ -145,7 +165,7 @@
     {
      "data": {
       "text/plain": [
-       "MemoryPath('memory://example_model/')"
+       "'/example_model'"
       ]
      },
      "execution_count": 7,
@@ -172,7 +192,7 @@
     {
      "data": {
       "text/plain": [
-       "'my_field'"
+       "MyType(my_field)"
       ]
      },
      "execution_count": 8,
@@ -201,7 +221,7 @@
     {
      "data": {
       "text/plain": [
-       "'my_field'"
+       "MyType(my_field)"
       ]
      },
      "execution_count": 9,
@@ -224,22 +244,40 @@
    "cell_type": "code",
    "execution_count": 10,
    "metadata": {},
+   "outputs": [],
+   "source": [
+    "from fsspec.implementations.memory import MemoryFileSystem  # noqa"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fs = MemoryFileSystem()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "[MemoryPath('memory://example_model/51c07fc879fa403993ba780d9ff29b52'),\n",
-       " MemoryPath('memory://example_model/model.json'),\n",
-       " MemoryPath('memory://example_model/model.schema.json')]"
+       "['/example_model/9048f517f71f434aad4a5481f3b2b3d4',\n",
+       " '/example_model/model.json',\n",
+       " '/example_model/model.schema.json']"
       ]
      },
-     "execution_count": 10,
+     "execution_count": 12,
      "metadata": {},
      "output_type": "execute_result"
     }
    ],
    "source": [
-    "list(UPath(\"memory://example_model\").glob(\"*\"))"
+    "fs.glob(\"example_model/*\")"
    ]
   },
   {
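
The notebook's new `my_reader` branches on whether `fs.read_text` returned `bytes` or something else before building `MyType`. The same defensive normalization can be sketched as a small standalone helper; the name `to_text` and its `encoding` parameter are hypothetical and not part of `pydantic-cereal` or `fsspec`. It assumes, as the notebook code does, that the raw value is either UTF-8 encoded bytes or an object whose `str()` form is the desired text.

```python
def to_text(raw: object, encoding: str = "utf8") -> str:
    """Return `raw` as a str, decoding it first if it is bytes-like."""
    if isinstance(raw, (bytes, bytearray)):
        # Bytes need an explicit decode; str(b"x") would give "b'x'".
        return bytes(raw).decode(encoding)
    # Already text (or text-like): fall back to the str() representation.
    return str(raw)


from_bytes = to_text(b"my_field")
from_str = to_text("my_field")
from_utf8 = to_text("caf\u00e9".encode("utf8"))
```

With such a helper, the reader body would reduce to passing `to_text(fs.read_text(path))` into `MyType`, which keeps the type handling in one place.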

docs/examples/pandas.ipynb

Lines changed: 18 additions & 7 deletions
@@ -104,7 +104,7 @@
     {
      "data": {
       "text/plain": [
-       "MemoryPath('memory://example_model/')"
+       "'/example_model'"
       ]
      },
      "execution_count": 5,
@@ -273,24 +273,35 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from fsspec.implementations.memory import MemoryFileSystem  # noqa\n",
+    "\n",
+    "fs = MemoryFileSystem()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "[MemoryPath('memory://example_model/example_model'),\n",
-       " MemoryPath('memory://example_model/model.json'),\n",
-       " MemoryPath('memory://example_model/model.schema.json')]"
+       "['/example_model/3f084fb6b6024e8da54fb231e655c8ea',\n",
+       " '/example_model/model.json',\n",
+       " '/example_model/model.schema.json']"
       ]
      },
-     "execution_count": 8,
+     "execution_count": 10,
      "metadata": {},
      "output_type": "execute_result"
     }
    ],
    "source": [
-    "list(UPath(\"memory://example_model\").glob(\"*\"))"
+    "fs.glob(\"example_model/*\")"
    ]
   },
   {
