No_tag columns in the exposure #10583

Closed
wants to merge 2 commits into from
micheles
Contributor

@micheles micheles commented May 13, 2025

As requested by @CatalinaYepes. See also #7885.

@micheles micheles added this to the Engine 3.24.0 milestone May 13, 2025
@micheles micheles requested a review from CatalinaYepes May 13, 2025 13:53
@raoanirudh
Member

The engine implicitly filling in missing tag values in a user's exposure could lead to unexpected secondary issues, so I'm not fully in favor of this change. Writing below for future reference the different cases that could arise with missing columns or missing values for certain columns in the exposure:

  • A column is included in the exposure.xml, but the column is absent in one or more of the exposure.csv files. This case should result in an error; the current behavior is OK:
    https://github.com/gem/oq-engine/blob/engine-3.23/openquake/risklib/asset.py#L1104-L1110
  • A column is included in the exposure.xml, and the column is present in all of the exposure.csv files, but some values for this tag are missing in one or more of the exposure.csv files. Here we have two subcases:

@micheles
Contributor Author

but some values for this tag are missing in one or more of the exposure.csv files

By this you mean that the tag value is the string "No_tag"? Should the engine check for that and raise an error?

@raoanirudh
Member

but some values for this tag are missing in one or more of the exposure.csv files

By this you mean that the tag value is the string "No_tag"? Should the engine check for that and raise an error?

Not the string "No_tag", but anything that would be flagged by isna().

@micheles
Contributor Author

micheles commented May 14, 2025

@raoanirudh perhaps you are missing the fact that when reading the CSV files the engine uses keep_default_na=False, which means that there are NEVER NaNs in the exposure as read from a single CSV file. The NaNs can have only one origin: pandas.concat(), which adds NaNs for missing columns when concatenating the dataframes associated with the CSV files. Such columns can then be filled with No_tag automatically. There is no risk of confusing a missing column with a column that is present but has some NaN values, because if a column is present, none of its values are NaN (at the level of a single CSV file).
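
For future reference, the behavior described above can be reproduced with plain pandas. This is only a minimal sketch, not the engine's actual code: the two in-memory CSV files and the column names (id, taxonomy, occupancy) are hypothetical, and the fillna step just illustrates the kind of automatic filling this PR proposes.

```python
import io

import pandas as pd

# Two hypothetical exposure CSV files: the first has an empty cell in
# "occupancy", the second lacks the "occupancy" column entirely.
csv_a = "id,taxonomy,occupancy\na1,RC,\na2,W,RES\n"
csv_b = "id,taxonomy\nb1,RC\n"

# With keep_default_na=False, empty cells are read as empty strings,
# so a single CSV file never produces NaN in these string columns.
df_a = pd.read_csv(io.StringIO(csv_a), keep_default_na=False)
df_b = pd.read_csv(io.StringIO(csv_b), keep_default_na=False)
assert not df_a.isna().any().any()
assert not df_b.isna().any().any()

# pandas.concat() is the only step that introduces NaN: it pads the
# rows of df_b for the column that df_b does not have.
exposure = pd.concat([df_a, df_b], ignore_index=True)
missing = exposure["occupancy"].isna()

# NaN therefore identifies "column missing in that file", and only
# those rows are filled; the empty string from df_a is left untouched.
exposure["occupancy"] = exposure["occupancy"].fillna("No_tag")
```

In this sketch only the row coming from the second file is flagged by isna(), while the empty cell in the first file stays an empty string.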

@micheles micheles closed this May 20, 2025