@abelsiqueira
Member

@abelsiqueira abelsiqueira commented Apr 14, 2025

  • Create a data validation function, with a basic validation of the expected table and columns
  • Add a small tutorial
  • Add namespaces
  • Create a convenience function to transform a wide table into a long table
  • Create a convenience function to run all clustering steps at once
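
For illustration, the wide-to-long transformation mentioned in the list above can be sketched with the `stack` function from DataFrames.jl. This is not the convenience function added in this PR, and the column names (`year`, `timestep`, `solar`, `wind`, `profile_name`, `value`) are assumptions made for the example.

```julia
using DataFrames

# Wide layout (assumed for this sketch): one column per profile.
wide = DataFrame(
    year = [2030, 2030],
    timestep = [1, 2],
    solar = [0.0, 0.4],
    wind = [0.7, 0.5],
)

# Long layout: one row per (year, timestep, profile) combination.
long = stack(
    wide,
    [:solar, :wind],      # measure columns to unpivot
    [:year, :timestep];   # identifier columns kept as-is
    variable_name = :profile_name,
    value_name = :value,
)
```

Each profile column becomes a block of rows, and the former column name is stored in `profile_name`.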

Related issues

Closes #40

Checklist

  • I am following the contributing guidelines
  • Tests are passing
  • Lint workflow is passing
  • Docs were updated and workflow is passing

@abelsiqueira abelsiqueira force-pushed the abel/obz branch 2 times, most recently from d6cd6d0 to 423a5f8 April 14, 2025 22:10
codecov bot commented Apr 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.72%. Comparing base (b8a5a8b) to head (fa13b9e).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #90      +/-   ##
==========================================
+ Coverage   99.65%   99.72%   +0.06%     
==========================================
  Files           4        6       +2     
  Lines         291      359      +68     
==========================================
+ Hits          290      358      +68     
  Misses          1        1              

Comment on lines 37 to 80
adaptive_grad::Bool = false,
)
Member Author

Suggested change
- adaptive_grad::Bool = false,
- )
+ adaptive_grad::Bool = false,
+ clustering_kwargs = Dict(),
+ weight_fitting_kwargs = Dict(),
+ )

Comment on lines 57 to 101
) |> DataFrame
split_into_periods!(df; period_duration)
Member Author

Suggested change
- ) |> DataFrame
- split_into_periods!(df; period_duration)
+ ) |> DataFrame
+ combine_periods!(df)
+ split_into_periods!(df; period_duration)

DuckDB.register_data_frame(connection, clustering_result.profiles, "profiles_rep_periods")
mapping_df = weight_matrix_to_df(clustering_result.weight_matrix)

prefix = ""
Member Author

Suggested change
- prefix = ""
+ # Below we register the dataframes as `t_<name>`, because we can't directly
+ # create them in the `database_schema`.
+ # Afterwards, we copy them to the schema and drop these `t_<name>` views.
+ prefix = ""

Member

Please don't forget to include this one. Thanks!

Member Author

It's included
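
The register, copy, and drop sequence described in the suggested comment can be sketched as follows. This is a minimal illustration, not the code in `src/io.jl`: the schema name `demo_schema`, the table contents, and the use of `CREATE OR REPLACE` are assumptions. Only `DuckDB.register_data_frame` and the `t_<name>` naming come from the snippets above.

```julia
using DuckDB, DataFrames

connection = DBInterface.connect(DuckDB.DB)  # in-memory database
database_schema = "demo_schema"              # hypothetical schema name
df = DataFrame(rep_period = [1, 2], value = [0.5, 0.7])

# 1. Register the dataframe under a temporary `t_<name>` view.
DuckDB.register_data_frame(connection, df, "t_profiles_rep_periods")

# 2. Copy it into the schema. CREATE OR REPLACE keeps this step rerunnable.
DBInterface.execute(connection, "CREATE SCHEMA IF NOT EXISTS $database_schema")
DBInterface.execute(
    connection,
    "CREATE OR REPLACE TABLE $database_schema.profiles_rep_periods AS
     SELECT * FROM t_profiles_rep_periods",
)

# 3. Drop the temporary view; the copied table remains in the schema.
DBInterface.execute(connection, "DROP VIEW IF EXISTS t_profiles_rep_periods")

copied = DataFrame(
    DBInterface.execute(connection, "SELECT * FROM $database_schema.profiles_rep_periods"),
)
```

Because the table is created with `CREATE OR REPLACE`, running steps 1 to 3 again on the same connection should overwrite the table rather than fail because it already exists.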

Comment on lines 35 to 37
niters::Int = 100,
learning_rate::Float64 = 0.001,
adaptive_grad::Bool = false,
Member Author

Move to dict below


connection = _new_connection(; profile_names, years, num_timesteps)

clusters = cluster!(connection, period_duration, num_rps; database_schema)
Member Author

Suggested change
- clusters = cluster!(connection, period_duration, num_rps; database_schema)
+ clusters = cluster!(connection, period_duration, num_rps; database_schema,
+ clustering_kwargs = Dict(:display => :iter),
+ weight_fitting_kwargs = Dict(),
+ )
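
As a sketch of the pattern these suggestions point at, the snippet below shows how an outer function can accept two keyword dictionaries and splat each into its own inner step. The functions `inner_clustering`, `inner_weight_fitting`, and `run_pipeline` are hypothetical stand-ins, not functions from this package. Only the keyword names (`clustering_kwargs`, `weight_fitting_kwargs`, `niters`, `learning_rate`, `adaptive_grad`, `display`) are taken from the review comments, and `maxiter` is invented for the example.

```julia
# Hypothetical inner steps that simply echo the options they received.
inner_clustering(data; display = :none, maxiter = 10) = (; display, maxiter)
inner_weight_fitting(data; niters = 100, learning_rate = 0.001, adaptive_grad = false) =
    (; niters, learning_rate, adaptive_grad)

# Hypothetical outer function: each dictionary is forwarded to one inner step.
function run_pipeline(
    data;
    clustering_kwargs = Dict{Symbol,Any}(),
    weight_fitting_kwargs = Dict{Symbol,Any}(),
)
    clustering = inner_clustering(data; clustering_kwargs...)
    weights = inner_weight_fitting(data; weight_fitting_kwargs...)
    return (; clustering, weights)
end

result = run_pipeline(
    [1.0, 2.0, 3.0];
    clustering_kwargs = Dict(:display => :iter),
    weight_fitting_kwargs = Dict(:niters => 50, :adaptive_grad => true),
)
```

Options that are not listed in a dictionary keep the defaults of the inner function, which keeps the outer signature short while still exposing every inner option.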

@abelsiqueira abelsiqueira changed the title from "[WIP] Updates relevant to the workflow/data pipeline" to "Add database schemas, convenience functions, and small tutorial" May 12, 2025
@abelsiqueira abelsiqueira marked this pull request as ready for review May 12, 2025 19:07
@abelsiqueira
Member Author

@greg-neustroev @datejada I think this is up to date now. Since it doesn't depend on TulipaIO, it is ready for review. I didn't put much effort into the tutorial, but I decided to leave it in so that there is something about the convenience functions. Let me know if you want it removed.

Member

@datejada datejada left a comment

@abelsiqueira, thanks for the PR and the changes. The tutorial looks great, we need it.

I followed the changes. My comment is about running it twice. I am unsure if the current code allows us to run it twice with the same connection. It looks like it will throw the error that the table already exists. Would you mind commenting on that (and changing it if needed to be able to run it twice)?

Thanks!

@abelsiqueira
Member Author

Hi @datejada, I've added tests running it twice. Only transforming from wide to long was not working for me. Do you have a different experience?

@datejada
Member

> Hi @datejada, I've added tests running it twice. Only transforming from wide to long was not working for me. Do you have a different experience?

That was it 😄 Thanks!

Member

@datejada datejada left a comment

Thanks for the changes. I just left a minor comment so that you can resolve the comments you left yesterday. I'm approving it and will merge it when it is ready.

@abelsiqueira
Member Author

I included the comment manually in file src/io.jl. Can you confirm that's the change you meant?

@datejada
Member

> I included the comment manually in file src/io.jl. Can you confirm that's the change you meant?

Yes, I figured it out afterwards. All your own comments are addressed, so I would close them to avoid confusion 🤷‍♂️

@datejada datejada merged commit b5b1caa into main May 13, 2025
7 checks passed
@datejada datejada deleted the abel/obz branch May 13, 2025 15:22
Development

Successfully merging this pull request may close these issues.

Implement input validation

3 participants