Commit 76f5b2f

Add features parameter to IterableDatasetDict.map
IterableDataset.map accepts a features parameter to declare the output schema, but IterableDatasetDict.map did not expose it. This meant users of IterableDatasetDict had no way to preserve feature metadata through map operations.
1 parent 6f9d24f commit 76f5b2f

1 file changed: 4 additions, 0 deletions
src/datasets/dataset_dict.py

Lines changed: 4 additions & 0 deletions
@@ -2105,6 +2105,7 @@ def map(
         batch_size: int = 1000,
         drop_last_batch: bool = False,
         remove_columns: Optional[Union[str, list[str]]] = None,
+        features: Optional[Features] = None,
         fn_kwargs: Optional[dict] = None,
     ) -> "IterableDatasetDict":
         """
@@ -2156,6 +2157,8 @@ def map(
                 Remove a selection of columns while doing the mapping.
                 Columns will be removed before updating the examples with the output of `function`, i.e. if `function` is adding
                 columns with names in `remove_columns`, these columns will be kept.
+            features (`[Features]`, *optional*, defaults to `None`):
+                Feature types of the resulting dataset.
             fn_kwargs (`Dict`, *optional*, defaults to `None`):
                 Keyword arguments to be passed to `function`
@@ -2187,6 +2190,7 @@ def map(
                     batch_size=batch_size,
                     drop_last_batch=drop_last_batch,
                     remove_columns=remove_columns,
+                    features=features,
                     fn_kwargs=fn_kwargs,
                 )
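
The forwarding pattern in this diff can be illustrated with a minimal, self-contained sketch. The classes `SketchIterableDataset` and `SketchIterableDatasetDict` are hypothetical stand-ins written for this example, not the real `datasets` classes, and the schema is a plain dict rather than a `Features` object. The sketch only assumes what the commit message states: the per-split `map` takes a `features` argument, and the dict-level `map` has to pass it through for the schema to survive.

```python
from typing import Callable, Dict, Iterator, Optional

Schema = Dict[str, str]  # stand-in for `Features`; an assumption for this sketch


class SketchIterableDataset:
    """Hypothetical lazy dataset: rows are produced on iteration."""

    def __init__(self, make_rows: Callable[[], Iterator[dict]], features: Optional[Schema] = None):
        self._make_rows = make_rows
        self.features = features

    def __iter__(self) -> Iterator[dict]:
        return self._make_rows()

    def map(self, function: Callable[[dict], dict], features: Optional[Schema] = None) -> "SketchIterableDataset":
        def make_rows() -> Iterator[dict]:
            # Apply `function` lazily and merge its output into each row.
            for row in self:
                yield {**row, **function(row)}

        # Without an explicit `features`, the output schema is unknown (None).
        return SketchIterableDataset(make_rows, features=features)


class SketchIterableDatasetDict(dict):
    """Hypothetical mapping of split name to SketchIterableDataset."""

    def map(self, function: Callable[[dict], dict], features: Optional[Schema] = None) -> "SketchIterableDatasetDict":
        # The one-line change mirrored from the diff: forward `features` to every split.
        return SketchIterableDatasetDict(
            {split: ds.map(function, features=features) for split, ds in self.items()}
        )


splits = SketchIterableDatasetDict(
    train=SketchIterableDataset(lambda: iter([{"text": "ab"}, {"text": "cde"}])),
    test=SketchIterableDataset(lambda: iter([{"text": "f"}])),
)
schema = {"text": "string", "length": "int64"}
mapped = splits.map(lambda row: {"length": len(row["text"])}, features=schema)
```

In this sketch, each split of `mapped` carries the declared schema, while the splits of the original `splits` still report `features` as `None`.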
