Add new tables to calculate the lowest resolution for flows that have a flows relationship #1183

Merged
datejada merged 8 commits into main from 1182-t-lowest-flows-relationship on May 8, 2025

Conversation

datejada
Member

@datejada datejada commented Apr 29, 2025

This pull request refactors the create_merged_tables! function to simplify its implementation, introduces a new table for flow relationships, and expands test coverage for data preparation.

Refactoring and Simplification:

  • Replaced inline SQL queries in create_merged_tables! with an external SQL file (create-merged-tables.sql) for better maintainability and readability. (src/data-preparation.jl)

New Feature: Flow Relationships:

  • Added a new temporary table, merged_flows_relationship, to handle flow relationships, including creating and populating a flows_relationship column in the flows_relationships table. (src/sql/create-merged-tables.sql, R1-R97)
  • Updated the create_lowest_resolution_table! function to include the new merged_flows_relationship table in its processing. (src/data-preparation.jl, L497-R451)
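
To make the idea of a "lowest resolution" concrete, here is a minimal sketch in plain Julia rather than the SQL used in this PR. It assumes that each flow's temporal resolution is a sorted list of timestep blocks covering 1:n without gaps, and that the lowest resolution takes, at each block start, the furthest block end among the related flows. The function name lowest_resolution and the sample partitions are hypothetical and not taken from the repository.

```julia
# Hypothetical sketch: merge the timestep partitions of related flows
# into their lowest (coarsest) resolution.
# Assumes every partition is sorted and covers 1:n without gaps.
function lowest_resolution(partitions::Vector{Vector{UnitRange{Int}}})
    n = maximum(last(last(p)) for p in partitions)
    merged = UnitRange{Int}[]
    block_start = 1
    while block_start <= n
        block_end = block_start
        for partition in partitions
            # First block of this partition that reaches block_start.
            idx = findfirst(b -> last(b) >= block_start, partition)
            idx === nothing && continue
            block_end = max(block_end, last(partition[idx]))
        end
        push!(merged, block_start:block_end)
        block_start = block_end + 1
    end
    return merged
end

flow_a = [1:4, 5:8, 9:12]         # made-up resolution of the first flow
flow_b = [1:3, 4:6, 7:9, 10:12]   # made-up resolution of the second flow
result = lowest_resolution([flow_a, flow_b])
```

With these made-up inputs, every merged block ends where the coarser flow's block ends, so the result coincides with flow_a.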

Expanded Test Coverage:

  • Introduced a comprehensive test suite in test/test-data-preparation.jl to validate the correctness of the merged tables, including the new merged_flows_relationship table, and the lowest and highest resolution tables. (test/test-data-preparation.jl, R1-R234)
  • Added a helper function _test_rows_exist to verify the presence of specific rows in test tables. (test/utils.jl, R20-R33)
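
A row-existence check of this kind could look roughly like the sketch below. This is not the code in test/utils.jl: the name rows_exist, its signature, and the sample rows are assumptions made for illustration, and rows are plain tuples instead of the result of a DuckDB query.

```julia
# Hypothetical stand-in for a row-existence helper; the actual
# `_test_rows_exist` in test/utils.jl may have a different signature.
function rows_exist(table_rows, expected_rows)
    available = Set(table_rows)
    return all(row -> row in available, expected_rows)
end

# Made-up rows: (asset, rep_period, time_block_start, time_block_end)
table_rows = [
    ("wind", 1, 1, 4),
    ("wind", 1, 5, 8),
    ("solar", 1, 1, 8),
]

found = rows_exist(table_rows, [("wind", 1, 5, 8)])
missing_row = rows_exist(table_rows, [("wind", 1, 9, 12)])
```

A helper like this only asserts that the listed rows are present; it says nothing about extra rows, which is why a separate size check is a useful complement.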

Related issues

Closes #1182

Checklist

  • I am following the contributing guidelines
  • Tests are passing
  • Lint workflow is passing
  • [NA] Docs were updated and workflow is passing

Contributor

github-actions bot commented Apr 29, 2025

✅ MPS files match

🤖 This was CompareMPS, we hope you have enjoyed this program.

codecov bot commented Apr 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.79%. Comparing base (9ef34c6) to head (8bae049).
Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1183      +/-   ##
==========================================
+ Coverage   97.70%   97.79%   +0.09%     
==========================================
  Files          30       31       +1     
  Lines        1132     1133       +1     
==========================================
+ Hits         1106     1108       +2     
+ Misses         26       25       -1     

@datejada datejada added the benchmark PR only - Run benchmark on PR label Apr 29, 2025
Contributor

github-actions bot commented Apr 29, 2025

Benchmark Results

                                      9ef34c6...      8bae049...      9ef34c6... / 8bae049...
energy_problem/create_model           34.5 ± 2.1 s    33.7 ± 1.5 s    1.02
energy_problem/input_and_constructor  24 ± 0.16 s     24.2 ± 0.35 s   0.988
time_to_load                          2.64 ± 0.023 s  2.63 ± 0.043 s  1

                                      9ef34c6...                8bae049...                9ef34c6... / 8bae049...
energy_problem/create_model           0.199 G allocs: 11.7 GB   0.199 G allocs: 11.7 GB   1
energy_problem/input_and_constructor  0.0354 G allocs: 1.23 GB  0.0354 G allocs: 1.23 GB  1
time_to_load                          0.159 k allocs: 11.2 kB   0.159 k allocs: 11.2 kB   1

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

This comment was marked as outdated.

@datejada datejada marked this pull request as ready for review May 2, 2025 07:49
Member Author

datejada commented May 2, 2025

@suvayu Inspired by our last conversation, my next step for the new feature is to test the main pieces of the code without running a full integration test. The flexible temporal resolution is key for Tulipa, so I have created a separate test that independently exercises the three main functions that calculate these tables. I have also added my new flexible temporal resolution function and tested it on the mock-up data.

Please let me know your comments and suggestions.

@gnawin, also, from the developers' maintainability perspective, please let me know if the tests make sense or if you would like to add/delete something else.

@datejada datejada requested review from suvayu and gnawin May 2, 2025 07:49
@datejada datejada changed the title Add new merged table for flows relationships Add new tables to calculate the lowest resolution for flows that have a flows relationship May 2, 2025

DBInterface.execute(
connection,
"CREATE TABLE rep_periods_data
Member

Do you need this table rep_periods_data (because I don't see it directly in this PR)?

Member Author

You need it to test create_highest_resolution_table!, which is not part of my PR, but as @suvayu says, if you can add/fix something in your PR, just do it 😝

Member

@gnawin gnawin left a comment

@datejada thank you! I think you tested nicely, and you cover more than the purpose of MIMO because you tested all lowest_resolution and highest_resolution tables, great! I have some minor suggestions regarding the testing, to improve some consistency with other tests.

Member Author

datejada commented May 2, 2025

> @datejada thank you! I think you tested nicely, and you cover more than the purpose of MIMO because you tested all lowest_resolution and highest_resolution tables, great! I have some minor suggestions regarding the testing, to improve some consistency with other tests.

Thanks! I like the suggestions. Let's see if we can make it work by registering the DataFrames. So far, my attempt has failed, as I commented here: #1183 (comment)

Member Author

datejada commented May 2, 2025

@gnawin, thanks for the comments and discussion. I have made two main changes:

  • Create the mock case study using DataFrames
  • Add a check on the size of the tables to ensure they have the expected number of rows and columns.

The second change is to balance between:
a) comparing the whole dataframes one by one, or
b) focusing on one asset + table size as an extra check that all went okay

Approach a) is 100% safe, but it is not easy to understand why you have all the rows. Approach b) is not 100% safe, but it makes the test readable by focusing on only what is happening in one asset in all the cases + as a second check, that the output tables are the correct size (one would expect these checks to be enough).
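
As a rough illustration of approach b), the sketch below compares the rows of a single asset against an expected list and then checks the overall table size. It is only a sketch under stated assumptions: the function name, the column names, and the sample rows are hypothetical, the rows are NamedTuples rather than the result of a DuckDB query, and only the row count is checked (a column-count check would follow the same pattern).

```julia
# Hypothetical check combining a single-asset comparison with a size check.
# Rows are NamedTuples here; in practice they would come from a database query.
function asset_rows_and_size_match(rows, asset, expected_asset_rows, expected_nrows)
    asset_rows = sort([r for r in rows if r.asset == asset]; by = r -> r.time_block_start)
    return asset_rows == expected_asset_rows && length(rows) == expected_nrows
end

# Made-up table contents for illustration only.
rows = [
    (asset = "solar", time_block_start = 1, time_block_end = 8),
    (asset = "wind", time_block_start = 5, time_block_end = 8),
    (asset = "wind", time_block_start = 1, time_block_end = 4),
]
expected_wind = [
    (asset = "wind", time_block_start = 1, time_block_end = 4),
    (asset = "wind", time_block_start = 5, time_block_end = 8),
]
ok = asset_rows_and_size_match(rows, "wind", expected_wind, 3)
```

The single-asset comparison keeps the expected data small enough to reason about by hand, while the size check catches rows that were unexpectedly added or dropped for the other assets.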

I am still trying to balance both approaches, but let's discuss with @suvayu and @abelsiqueira the best way to proceed.

@datejada datejada requested a review from gnawin May 2, 2025 20:42
Member

suvayu commented May 3, 2025

Uff! I got tagged so many times :-p I'll have a look :)

@abelsiqueira
Member

Again, I don't have the whole context. If needed, ping me again and I'll review the whole PR.
I think a reasonably clean way, not too verbose, that fully tests the expected assets would be:

expected_df = DataFrame(
    ... # I think you can pretty much use the same array of tuples
)
# ORDER BY (not SORT BY) is the SQL clause; the query needs the connection
computed_df = DuckDB.query(connection, "SELECT .. WHERE asset = 'asset' ORDER BY ...") |> DataFrame
@test dataframes_are_equal(expected_df, computed_df)

If every computed_df uses the same WHERE and ORDER BY clauses, you can create another function

expected_df = ...
@test expected_df_matches_filtered_table(expected_df, target_table)

Contributor

github-actions bot commented May 7, 2025

🤖 CompareMPS report

✅ MPS files match

@datejada datejada requested a review from abelsiqueira May 7, 2025 14:39
Member Author

datejada commented May 7, 2025

@suvayu @abelsiqueira @gnawin I have included all the comments as we discussed. Please let me know if you have further suggestions.

Member

@abelsiqueira abelsiqueira left a comment

Thanks, LGTM.
💀 ⭐

Member

@gnawin gnawin left a comment

Thanks. We have discussed this PR in person. For the future, it would be nice to generalize the function for creating the highest/lowest resolutions; maybe changing the name from asset to something else is already good enough. This is already in #1192; I'm just reiterating its usefulness.

Member

@suvayu suvayu left a comment

LGTM 👍

@datejada datejada merged commit 1a1aa31 into main May 8, 2025
8 checks passed
@datejada datejada deleted the 1182-t-lowest-flows-relationship branch May 8, 2025 11:29
Labels
benchmark PR only - Run benchmark on PR
Development

Successfully merging this pull request may close these issues.

Create a table with the lowest resolution of both flows in the flows relationship table
4 participants