Closed
Labels: bug (Something isn't working)
Description
Initially reported by @goeffthomas.
import tensorflow_datasets as tfds

builder = tfds.dataset_builders.CroissantBuilder(
    jsonld="https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud/croissant/download",
    file_format='array_record',
)
builder.download_and_prepare()
ds = builder.as_data_source()
print(ds['default'][0])

FWIW, even the demo code here doesn't seem to work: https://www.tensorflow.org/datasets/format_specific_dataset_builders#croissantbuilder_2
Addition by @marcenacp:
On the latest version of tfds-nightly, it fails for me with a different error:
**************************** WARNING *********************************
Warning: The dataset you're trying to generate is using Apache Beam,
yet no `beam_runner` nor `beam_options` was explicitly provided.
Some Beam datasets take weeks to generate, so are usually not suited
for single machine generation. Please have a look at the instructions
to setup distributed generation:
https://www.tensorflow.org/datasets/beam_datasets#generating_a_beam_dataset
**********************************************************************
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-3-445fac78df9a> in <cell line: 6>()
4 file_format='array_record',
5 )
----> 6 builder.download_and_prepare()
7 ds = builder.as_data_source()
8 print(ds['default'][0])
12 frames
/usr/lib/python3.10/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)
ModuleNotFoundError: No module named 'apache_beam'
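For anyone triaging this, here is a minimal sketch for checking up front whether the module named in the traceback is importable, before calling `builder.download_and_prepare()`. The helper name `missing_modules` is hypothetical and not part of tfds. The list of modules to check is an assumption based only on the traceback above, and the `pip install apache-beam` hint is a guess at a workaround, not a confirmed fix for this issue.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    Hypothetical helper for illustration; it only checks importability and
    does not import the modules or verify their versions.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: apache_beam is the only missing dependency, as the
# ModuleNotFoundError in the traceback suggests.
required = ["apache_beam"]
missing = missing_modules(required)

if missing:
    print("Missing modules:", ", ".join(missing))
    print("Possible workaround (unverified for this issue): pip install apache-beam")
else:
    print("All checked modules are importable.")
```

If the check reports nothing missing and the error persists, the problem is likely elsewhere in the Croissant builder path rather than in the environment.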