Open
Description
I have a metadata.json file with minimal Croissant metadata for a dataset. The dataset has multiple zip files, each of which I describe with cr:FileObject. When I validate metadata.json with the mlcroissant Python package, I get an error for every FileObject.
Here is how I validate my metadata.json:
mlcroissant validate --jsonld metadata.json
Here are the errors I get for FileObject:
Found the following 9 error(s) during the validation:
- "ICLR2024_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "ICLR2025_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "ICML2024_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "ICML2025_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "NeurIPS2021_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "NeurIPS2022_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "NeurIPS2023_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- "NeurIPS2024_latest.zip" should have an attribute "@type": "https://schema.org/FileObject" or "@type": "https://schema.org/FileSet". Got https://mlcommons.org/croissant/1.0/FileObject instead.
- [Metadata(Academic Papers Dataset)] The name "Academic Papers Dataset" contains forbidden characters.
The last error, about the name, is also interesting: what forbidden characters are there in the name? Maybe "Dataset"?
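To narrow down the FileObject errors, it may help to see which full IRI each "@type" expands to. Below is a minimal sketch using only the standard library, not mlcroissant itself. The helper names expand_iri and distribution_types are hypothetical, and the sample document (including the "cr" prefix mapping) is an assumption chosen to reproduce the IRI quoted in the error messages, since it is not the actual metadata.json.

```python
import json


def expand_iri(value, context):
    """Expand a compact IRI such as "cr:FileObject" using the prefix map in @context."""
    if ":" in value and not value.startswith(("http://", "https://")):
        prefix, suffix = value.split(":", 1)
        base = context.get(prefix)
        if isinstance(base, str):
            return base + suffix
    # Already a full IRI, or the prefix is not declared: return unchanged.
    return value


def distribution_types(doc):
    """Map each distribution entry's name to its expanded @type."""
    context = doc.get("@context", {})
    if not isinstance(context, dict):
        context = {}  # this sketch only handles an inline prefix map
    return {
        item.get("name", "<unnamed>"): expand_iri(item.get("@type", ""), context)
        for item in doc.get("distribution", [])
    }


# Hypothetical sample: the "cr" mapping is an assumption, not the real file.
sample = json.loads("""
{
  "@context": {
    "sc": "https://schema.org/",
    "cr": "https://mlcommons.org/croissant/1.0/"
  },
  "@type": "sc:Dataset",
  "name": "Academic Papers Dataset",
  "distribution": [
    {"@type": "cr:FileObject", "name": "ICLR2024_latest.zip"},
    {"@type": "sc:FileObject", "name": "ICLR2025_latest.zip"}
  ]
}
""")

for name, full_type in distribution_types(sample).items():
    print(f"{name}: {full_type}")
```

With this sample, the cr: entry expands to the mlcommons.org IRI that the validator rejects, while the sc: entry expands to https://schema.org/FileObject, which is the type the error message asks for. Whether that is the right fix for the real file depends on its actual "@context" and on which Croissant version the validator assumes.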
Here is my complete metadata.json