Consider removing rows where scenario data is not available #10

https://github.com/RMI-PACTA/pacta.data.preparation/blob/ba0f8b8518afb2d00bfe5d9bff1a935418eaa5dd/R/dataprep_abcd_scen_connection.R#L267-L303

When the scenario data is joined to the ABCD data with `left_join()`, it is possible, even likely, that some rows of the ABCD data will not match any rows in the scenario data `by = c("scenario_geography", "year", "ald_sector", "technology")`. For those rows, the columns added from the scenario data (`scenario_source`, `scenario`, `units`, `direction`, `fair_share_perc`) will be filled with `NA`. Are these rows useful at all after this point?
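To make the effect concrete, here is a minimal sketch with toy data, not the real pipeline. The table names `abcd` and `scenario_data`, the `production` column, and all values are invented for illustration; only the join keys and the scenario column names come from the description above. It shows how unmatched ABCD rows end up with `NA` in the scenario columns and how one might count them.

```r
library(dplyr)

# Toy stand-in for the ABCD data (hypothetical values and `production` column).
abcd <- tibble(
  scenario_geography = c("Global", "Global", "Europe"),
  year = c(2025L, 2025L, 2025L),
  ald_sector = c("Power", "Power", "Automotive"),
  technology = c("CoalCap", "GasCap", "ICE"),
  production = c(10, 20, 30)
)

# Toy stand-in for the scenario data; it only covers one of the three ABCD rows.
scenario_data <- tibble(
  scenario_geography = "Global",
  year = 2025L,
  ald_sector = "Power",
  technology = "CoalCap",
  scenario_source = "SRC_X",
  scenario = "SCEN_A",
  units = "MW",
  direction = "declining",
  fair_share_perc = -0.1
)

join_cols <- c("scenario_geography", "year", "ald_sector", "technology")

# Same kind of join as described above: every ABCD row is kept.
joined <- left_join(abcd, scenario_data, by = join_cols)

# Rows without a scenario match have NA in all scenario columns.
n_unmatched <- sum(is.na(joined$scenario))
share_unmatched <- n_unmatched / nrow(joined)

# anti_join() lists the unmatched ABCD rows directly.
unmatched_rows <- anti_join(abcd, scenario_data, by = join_cols)
```

Running something like `share_unmatched` against the real data would give a rough idea of how much could be saved by dropping these rows.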

I think we should carefully consider whether these rows with no scenario data are meaningful for any reason. If not, we should filter them out, which could reduce the size of the data substantially. @jacobvjk @jdhoffa @AlexAxthelm
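If the rows do turn out to be unnecessary, the filtering could look roughly like the sketch below. This is an assumption about one possible approach, not the current implementation: `abcd`, `scenario_data`, `production`, and all values are hypothetical toy data. It also assumes that `scenario` is never `NA` in the scenario data itself, so that an `NA` after the join reliably means "no match".

```r
library(dplyr)

# Hypothetical toy tables; only the key and scenario column names are real.
abcd <- tibble(
  scenario_geography = c("Global", "Global", "Europe"),
  year = c(2025L, 2025L, 2025L),
  ald_sector = c("Power", "Power", "Automotive"),
  technology = c("CoalCap", "GasCap", "ICE"),
  production = c(10, 20, 30)
)

scenario_data <- tibble(
  scenario_geography = "Global",
  year = 2025L,
  ald_sector = "Power",
  technology = "CoalCap",
  scenario = "SCEN_A",
  fair_share_perc = -0.1
)

join_cols <- c("scenario_geography", "year", "ald_sector", "technology")

# Option 1: keep the left join and drop rows with no scenario match afterwards.
filtered <- abcd %>%
  left_join(scenario_data, by = join_cols) %>%
  filter(!is.na(scenario))

# Option 2: use an inner join, which never creates the unmatched rows at all.
inner <- inner_join(abcd, scenario_data, by = join_cols)
```

Under the assumption stated above, both options keep the same rows. The inner join avoids materialising the unmatched rows in the first place, which may matter if the intermediate table is large.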

It's possible we do want at least one row of the ABCD data to be left in place even if no scenario data matches it, in which case we'll need something more sophisticated. Then again, won't the `scenario_geography` and `equity_market` columns make multiple rows distinct even while the rest of the data is duplicated?

Related: RMI-PACTA/pacta.data.preparation#7
