Validate numerical output produced by individual data sources #453

As far as data source processing goes, we currently test each component of a DataPipeline object.

We also have a dry run in https://github.com/GoogleCloudPlatform/covid-19-open-data/blob/main/src/test/test_source_run.py that checks, for each individual data source, that there is at least one output whose location key matches a defined regex.

But we do not validate that individual extensions of DataSource (stored in src/pipelines/*/*.py) actually produce the proper numerical output for particular inputs.

I propose unit testing the parse_dataframes method of each data source. To make these tests easier to write, perhaps we could have a framework that accepts the input and expected output dataframes as CSV files.
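To make the proposal concrete, here is a rough sketch of what such a CSV-driven check might look like. Everything in it is illustrative: ExampleDataSource is a hypothetical stand-in rather than a real source from src/pipelines, the parse_dataframes(dataframes, aux, **parse_opts) signature is my assumption about the interface, and check_parse_output is a made-up helper name. It uses pandas and compares frames with pandas.testing.assert_frame_equal.

```python
import io

import pandas as pd
from pandas.testing import assert_frame_equal


class ExampleDataSource:
    """Hypothetical stand-in for a DataSource subclass (not a real source in the repo)."""

    def parse_dataframes(self, dataframes, aux, **parse_opts):
        # Assumed signature: a dict of input frames plus a dict of auxiliary frames.
        data = dataframes[0].rename(columns={"fecha": "date", "casos": "new_confirmed"})
        # Build the location key from a fixed country prefix and the region code.
        data["key"] = "ES_" + data["region"]
        return data[["date", "key", "new_confirmed"]]


def check_parse_output(source, input_csv, expected_csv, aux=None):
    """Run parse_dataframes on a CSV input and compare the result with a CSV expectation.

    Both CSV arguments may be file paths or file-like objects.
    """
    input_frame = pd.read_csv(input_csv)
    expected = pd.read_csv(expected_csv)
    actual = source.parse_dataframes({0: input_frame}, aux or {})
    # Ignore index labels and exact dtypes; compare column names and values.
    assert_frame_equal(
        actual.reset_index(drop=True),
        expected.reset_index(drop=True),
        check_dtype=False,
    )


# Usage: in a real test these would be fixture files checked in next to the source.
input_csv = io.StringIO("fecha,region,casos\n2020-05-01,MD,10\n2020-05-02,MD,12\n")
expected_csv = io.StringIO(
    "date,key,new_confirmed\n2020-05-01,ES_MD,10\n2020-05-02,ES_MD,12\n"
)
check_parse_output(ExampleDataSource(), input_csv, expected_csv)
```

With a helper along these lines, adding a test for a new data source would mostly come down to checking in a small input CSV and the expected output CSV next to it.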
