-
Notifications
You must be signed in to change notification settings - Fork 2
Feature/scaling multi species #99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/scaling-multi-species
Are you sure you want to change the base?
Feature/scaling multi species #99
Conversation
implementation scaling for individual species
tests and example updated to match the new scaling function
Change the values in create_time_evoltion.py to match a more eco-friendly evolution. Update the calculations for scaling in interpolate_time.py to have a more sustainable code.
stefan-voelk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The preferred workflow is that the feature branch is created within our dlr-pa repository. You would clone the existing repository to your local computer and syncing the newly created local branch with remote on dlr-pa. A major advantage of this approach is that reviewers could more easily execute the modified code within an existing OpenAirClim installation. Only the particular feature branch has to be activated ("checkout") then.
- If you don't want to perform a Pull Request (PR) immediately to main, we usually create a Draft Pull Request to start a conversation with the reviewers.
- The actual PR to the feature branch on dlr-pa does not need approval by a reviewer. Therefore, you can go ahead with the merge to the feature branch and my comments provided. Once a PR to main is created, an approval is needed before the merge.
- The previous PR #94 can be closed.
- One of the automatic checks failed. This must be fixed before merging to main. The particular check is related to a workflow creating the demonstration examples for the website documentation. Since the scaling method changed, but the example was not updated, this check failed. The documentation itself on the time evolution scaling option should be updated as well.
| SCALING_TIME = np.arange(1990, 2200, 1) | ||
| SCALING_ARR = np.sin(SCALING_TIME * 0.2) * 0.6 + 1.0 | ||
| SCALING_ARR = SCALING_ARR.astype("float32") | ||
| SPECIES = ["fuel", "CO2", "H2O", "NOx", "distance"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the times, I named lists and numpy arrays xxx_arr in OpenAirClim. In addition, species can be a singular and plural word. Therefore, I suggest renaming this list to SPECIES_ARR.
|
|
||
|
|
||
| def plot_time_scaling(scaling_time: np.ndarray, scaling_arr: np.ndarray): | ||
| def plot_time_scaling(scaling_time: np.ndarray, scaling: np.ndarray, species: list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As stated above. I suggest renaming to scaling_arr and species_arr. I used the notation _arr for arrays of any dimension throughout OpenAirClim.
| ax.set_xlabel("year") | ||
| ax.set_ylabel("scaling factor") | ||
| plt.figure(figsize=(8, 4)) | ||
| for i, specie in enumerate(species): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good example why the notation _arr is unambiguous. First, the reader can better distinguish the two objects species_arr (np.ndarray) and species (str) in the code. Second, the word specie (without "s" at the end) means something different, a coin ;-)
| ax.plot(scaling_time, scaling_arr) | ||
| ax.set_xlabel("year") | ||
| ax.set_ylabel("scaling factor") | ||
| plt.figure(figsize=(8, 4)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To better distinguish the two plots which are opened upon execution of this script, I suggest to add a string identifier, e.g. "Scaling" as first argument to function plt.figure(), see : https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.figure.html
Instead of "Figure 1", a nice name would show up then.
|
|
||
| def create_time_scaling_xr( | ||
| scaling_time: np.ndarray, scaling_arr: np.ndarray | ||
| scaling_time: np.ndarray, scaling: np.ndarray, species: list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above, suggest renaming variables to xxx_arr
| Author="Stefan Völk", | ||
| Contact="stefan.voelk@dlr.de", | ||
| Author="Stefan Völk, Clémentin Léron", | ||
| Contact="openairclim@dlr.de", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for updating the attributes/metadata 👍
| np.linspace(1.0, 1.5, len(SCALING_TIME)), # H2O: +50% growth | ||
| np.linspace(1.0, 0.8, len(SCALING_TIME)), # NOx: -20% reduction | ||
| np.linspace(1.0, 1.2, len(SCALING_TIME)) # distance: +120% growth | ||
| ]).astype("float32") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the scaling factors. Looks much more realistic!
| inv_years = np.array(list(inv_dict.keys())) | ||
|
|
||
| # Interpolate scaling factors linearly on time_range | ||
| time_range, evo_interp_dict = interp_evolution(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works fine, even with higher dimensional scaling factors array and with time steps > 1 year.
| "dir": "../oac/example/input/", | ||
| "file": "time_scaling_example.nc", | ||
| "range": [2020, 2051, 1], | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some guidelines and good practices for software testing. I can provide you with some additional documentation on that topic if you like:
- According to the testing pyramid there are three types: unit tests (base), integration tests (middle) and end-to-end tests (top). The base comes with the highest amount of tests, top has the fewest. This means one should focus on the unit tests. These tests are fast, isolated and check individual aspects of the individual implemented functions.
- Decouple tests from external dependencies, e.g. other files in the repository. This concept is known as test isolation. This particular test would fail, if the utility script generating
time_scaling_example.nchas not executed beforehand. Test isolation can be achieved by the concept of test doubles (e.g. mock objects). If the focus is unit testing instead of end-to-end, maybe the file itself is not needed anymore and it would be enough to code a minimal xarray Dataset which contains an example time evolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Pull Request (PR) is from an external repository to the dlr-pa repository. Once the branch is within dlr-pa, reviewers can perform much easier checks with the new code. Otherwise, a fresh clone and install of OpenAirClim from the external repository is needed.
Description
It extends the time interpolation and scaling capabilities of OpenAirClim to support individual scaling factors per species in the emission inventory.
This allows, for instance, different scaling factors to be applied when using multiple fuel types or accounting for species-specific emission changes.
This improvement addresses the TODO note previously present in apply_scaling:
Main changes
Updated scale_inv and apply_scaling to handle multiple emission scaling factors (one per species).
Modified to generate a coherent example including species-dependent scaling.
Updated unit tests to reflect the new behavior.
Type of change
How has this been tested?
<class 'netCDF4.Dataset'> root group (NETCDF4 data model, file format HDF5): Title: Time scaling example Convention: CF-XXX Type: scaling dimensions(sizes): time(210), species(5) variables(dimensions): float32 scaling(time, species), int64 time(time), <class 'str'> species(species) groups:Example
print(file2read.variables['species'][:]) # ['fuel' 'CO2' 'H2O' 'NOx' 'distance']To reproduce:
Test configuration:
Checklist
__about__.pyaccording to the semantic versioning scheme