Discrepancies between ddf_utils.create_datapackage (Python) and validate-ddf -i (Node) #548

We are using both Python (https://github.com/semio/ddf_utils) and JavaScript tooling to generate the datapackage.json file with its ddfSchema property.

When run on the very same dataset, the Python-based generator produces a datapackage.json file that is 50% larger.

It would be interesting to hear your thoughts (@buchslava, @semio) on harmonising the two libraries. So far we have identified four differences in the output:

**1. Resource.name is encoded differently:**

_validate-ddf_
"path": "ddf--entities--jurisdiction.csv",
"name": "jurisdiction"

_ddf_utils_
"path": "ddf--entities--jurisdiction.csv",
"name": "ddf--entities--jurisdiction"
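To see how many resources are affected, the two generated files could be compared by `path` instead of by `name`. Below is a minimal sketch of that idea. It assumes both files use the standard `resources` array; the helper names `resources_by_path` and `name_mismatches` are hypothetical and not part of either tool, and the sample data is just the snippet above inlined.

```python
def resources_by_path(datapackage: dict) -> dict:
    """Index a datapackage's resources by their `path` value."""
    return {res["path"]: res for res in datapackage.get("resources", [])}


def name_mismatches(a: dict, b: dict) -> dict:
    """Return {path: (name_in_a, name_in_b)} for shared paths whose names differ."""
    res_a, res_b = resources_by_path(a), resources_by_path(b)
    return {
        path: (res_a[path].get("name"), res_b[path].get("name"))
        for path in sorted(res_a.keys() & res_b.keys())
        if res_a[path].get("name") != res_b[path].get("name")
    }


# Inline sample data mirroring the two snippets above (not real tool output).
node_pkg = {"resources": [
    {"path": "ddf--entities--jurisdiction.csv", "name": "jurisdiction"},
]}
py_pkg = {"resources": [
    {"path": "ddf--entities--jurisdiction.csv", "name": "ddf--entities--jurisdiction"},
]}

mismatches = name_mismatches(node_pkg, py_pkg)
print(mismatches)
```

In practice the two dicts would come from `json.load` on each generated datapackage.json.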

**2. The default datapackage.json properties differ**

The JavaScript version typically adds more placeholders (such as title, license, author, version), whereas ddf_utils generates a bare minimum (name).
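A quick way to list exactly which top-level properties differ is a set difference on the keys. This is only a sketch: `top_level_key_diff` is a hypothetical helper, and the two sample dicts are made up to reflect the description above rather than copied from real output.

```python
def top_level_key_diff(a: dict, b: dict) -> dict:
    """Report which top-level keys appear in only one of the two datapackages."""
    keys_a, keys_b = set(a), set(b)
    return {
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
    }


# Hypothetical sample data: placeholder values are invented for illustration.
node_pkg = {
    "name": "example", "title": "", "license": "", "author": "",
    "version": "", "resources": [], "ddfSchema": {},
}
py_pkg = {"name": "example", "resources": [], "ddfSchema": {}}

diff = top_level_key_diff(node_pkg, py_pkg)
print(diff)
```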

**3. Python ddf_utils does not seem to work with multiple measures in one file?**

`ddf--datapoints--measure--measure--by--country--year.csv`
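For reference, this is how we read such a file name: everything between `ddf--datapoints--` and `--by--` is a list of measures, and everything after `--by--` is a list of dimensions. The sketch below only illustrates that reading. It is inferred from the file name pattern above, `parse_datapoints_filename` is a hypothetical helper, and neither tool is confirmed to parse names this way.

```python
from pathlib import PurePosixPath


def parse_datapoints_filename(filename: str) -> tuple[list[str], list[str]]:
    """Split a datapoints file name into (measures, dimensions).

    Assumes the pattern ddf--datapoints--<measures>--by--<dimensions>.csv,
    with individual names separated by a double hyphen.
    """
    stem = PurePosixPath(filename).stem
    prefix = "ddf--datapoints--"
    if not stem.startswith(prefix) or "--by--" not in stem:
        raise ValueError(f"not a datapoints file name: {filename}")
    measures_part, dims_part = stem[len(prefix):].split("--by--", 1)
    return measures_part.split("--"), dims_part.split("--")


measures, dimensions = parse_datapoints_filename(
    "ddf--datapoints--measure--measure--by--country--year.csv"
)
print(measures, dimensions)
```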

**4. Different files are excluded**

The Python tools seem to do a better job of excluding non-DDF files when creating the datapackage.
With `validate-ddf -i`, `.DS_Store` and `.ipynb` files were accidentally included in the datapackage.json file, whereas ddf_utils skipped them.
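One possible shared rule would be to consider only files that match a `ddf--*.csv` pattern. The sketch below shows that filter; the pattern and the helper name `looks_like_ddf_csv` are assumptions for discussion, not the rule either tool currently applies, and the candidate list is made up.

```python
from fnmatch import fnmatchcase


def looks_like_ddf_csv(filename: str) -> bool:
    """True if the file name matches the assumed pattern ddf--*.csv (case-sensitive)."""
    return fnmatchcase(filename, "ddf--*.csv")


# Hypothetical directory listing for illustration.
candidates = [
    ".DS_Store",
    "notebook.ipynb",
    "ddf--entities--jurisdiction.csv",
    "ddf--concepts.csv",
]

kept = [name for name in candidates if looks_like_ddf_csv(name)]
print(kept)
```

An allowlist like this avoids having to enumerate every stray file type (editor backups, notebooks, OS metadata) in an exclusion list.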

Thanks for any pointers and ideas!

