History matching refine model #423

Merged
merged 34 commits into from
Apr 29, 2025

Conversation

marjanfamili
Collaborator

@marjanfamili marjanfamili commented Apr 24, 2025

📝 Description

This pull request introduces

  • workflow for user-provided simulator
  • NaghaviSimulator
  • **changes to base simulator**
  • HistoryMatcher
  • HistoryMatchingDashboard

👩‍💻 Usage

  • follow the demo 07_AE_workflow.ipynb

🆘 Help Needed

  • Need to implement calibration method for completing the pipeline
  • Need a test that specifically tests a simulator written by the user. It should not live in the tests folder, as it is not an AutoEmulate test; perhaps it belongs in the simulations folder

🔗 Related Issue

#294 #412

🛠 Type of Change

  • ✨ New feature
  • 📖 Documentation update
  • ♻️ Refactor/Code cleanup
  • 🧪 Test updates
  • 🎓 Tutorial added (new guides, walkthroughs, or examples)

✅ Checklist

  • My code follows the project's coding style 🎨
  • I have tested my changes locally 🖥️
  • I have added/updated unit tests (if applicable) 🧪
  • I have updated the documentation (README, comments, etc.) if needed 📚
  • My changes generate no new warnings or errors ⚠️❌

🖼️ Screenshots (if applicable)

workflow

💬 Additional Notes


Reviewers:
👀 Pay special attention to:

  • the change to the simulation base class
  • updates to the history matching pipeline
  • the history matching test has been re-written

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Contributor

github-actions bot commented Apr 24, 2025

Coverage report

Click to see where and how coverage changed

File: lines missing
autoemulate/history_matching.py: 43, 76, 110-127, 164, 209-223, 233-234, 247-251, 265, 279-299, 311, 319, 350-419
autoemulate/history_matching_dashboard.py: 1-1241
autoemulate/emulators/gaussian_process.py: none listed
autoemulate/simulations/base.py: 60, 86-97, 111-141, 156-172, 176, 199-205
autoemulate/simulations/circ_utils.py: 4-233
autoemulate/simulations/naghavi_cardiac_ModularCirc.py: 1-172
tests/test_base_simulator.py: none listed
tests/test_history_matching.py: 44-58, 63-75

This report was generated by python-coverage-comment-action

@codecov-commenter

codecov-commenter commented Apr 24, 2025

Codecov Report

Attention: Patch coverage is 20.09901% with 807 lines in your changes missing coverage. Please review.

Project coverage is 79.89%. Comparing base (23e751e) to head (2d732e1).

Files with missing lines | Patch % | Lines
autoemulate/history_matching_dashboard.py | 0.00% | 522 Missing ⚠️
autoemulate/simulations/circ_utils.py | 0.00% | 94 Missing ⚠️
...emulate/simulations/naghavi_cardiac_ModularCirc.py | 0.00% | 80 Missing ⚠️
autoemulate/history_matching.py | 58.44% | 64 Missing ⚠️
autoemulate/simulations/base.py | 41.26% | 37 Missing ⚠️
tests/test_history_matching.py | 86.66% | 10 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #423       +/-   ##
===========================================
- Coverage   90.17%   79.89%   -10.28%     
===========================================
  Files          97      100        +3     
  Lines        5994     6915      +921     
===========================================
+ Hits         5405     5525      +120     
- Misses        589     1390      +801     


"cell_type": "markdown",
"metadata": {},
"source": [
"#### 3 - Wrapping your Simulator in AutoEmulate Simulator Base Class\n",
Member

To make this as straightforward as possible for the user, we could move the sample_inputs() implementation from Naghavi to the base Simulator. A user could always override it, but in most instances we want Latin hypercube sampling, so it seems like a good default for all simulators to inherit.

If we do that, we could simplify this section and just say that the user has to subclass the Simulator class and implement the sample_forward method (I wouldn't even worry about mentioning __init__ here). I think that makes it really accessible.
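
As a rough illustration of that proposal, the sketch below puts a default Latin hypercube sample_inputs() on a base class so that a subclass only implements sample_forward. The class names SketchSimulator and Projectile, the parameters_range argument, and the toy model are assumptions made for this example; this is not the AutoEmulate implementation, and it uses only the standard library.

```python
import random
from abc import ABC, abstractmethod


class SketchSimulator(ABC):
    """Hypothetical stand-in for the base class discussed above."""

    def __init__(self, parameters_range):
        # parameters_range maps a parameter name to its (low, high) bounds.
        self.parameters_range = dict(parameters_range)
        self.param_names = list(self.parameters_range)

    def sample_inputs(self, n_samples, seed=None):
        """Default Latin hypercube sampling; subclasses may override it."""
        rng = random.Random(seed)
        columns = {}
        for name, (low, high) in self.parameters_range.items():
            # One point per equal-width stratum, shuffled independently per dimension.
            points = [(i + rng.random()) / n_samples for i in range(n_samples)]
            rng.shuffle(points)
            columns[name] = [low + p * (high - low) for p in points]
        return [
            {name: columns[name][i] for name in self.param_names}
            for i in range(n_samples)
        ]

    @abstractmethod
    def sample_forward(self, params):
        """Run one simulation for a dict of parameter values."""


class Projectile(SketchSimulator):
    """Toy user simulator: the only method the user has to write."""

    def sample_forward(self, params):
        # Range of a drag-free projectile launched at 45 degrees.
        return [params["v0"] ** 2 / params["g"]]


sim = Projectile({"v0": (10.0, 20.0), "g": (9.7, 9.9)})
samples = sim.sample_inputs(5, seed=0)
outputs = [sim.sample_forward(s) for s in samples]
```

With this layout the tutorial text could shrink to "subclass and implement sample_forward", while anyone who needs a different design of experiments can still override sample_inputs.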

Collaborator Author

This sounds very reasonable to me. I will wait for more comments on this; if there are none, I will apply these changes.

Collaborator

Agreed - this sounds great to have as a default to simplify subclassing and would not be unexpected.

"cell_type": "markdown",
"metadata": {},
"source": [
"#### 9 - use the interactive dashboard to inspect the results of history matching "
Member

The interactive dashboard doesn't display - is this something that we can fix?

If not, should we maybe comment out the code cell (unless there's another way not to execute the code block) and add a sentence above it that tells the user something like: if you are running this notebook interactively, you can view an interactive dashboard by uncommenting and running the code below. And/or could we add a screenshot of it?

Collaborator Author

Is there an error? It displays for me locally with no issues.

Collaborator

I like the idea of a screenshot for the static version and uncommenting (or a global constant at the top of the notebook?) to enable the dashboard when running locally. I also tried running locally and the dashboard displays for me.
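
One possible shape for the "global constant" idea is sketched below. SHOW_DASHBOARD, show_dashboard_if_enabled, and the make_dashboard callable are hypothetical names for this example only; the real notebook would pass whatever call builds the dashboard.

```python
# Set to True when running the notebook interactively.
SHOW_DASHBOARD = False


def show_dashboard_if_enabled(make_dashboard, enabled=SHOW_DASHBOARD):
    """Call make_dashboard() only when enabled; otherwise return None.

    make_dashboard is a hypothetical zero-argument callable that builds
    and displays the dashboard.
    """
    if not enabled:
        print("Dashboard disabled; set SHOW_DASHBOARD = True to view it interactively.")
        return None
    return make_dashboard()


# In the static docs build this prints the hint and skips the dashboard.
result = show_dashboard_if_enabled(lambda: "dashboard shown")
```

A flag like this keeps the cell executable in the rendered documentation, where a screenshot can stand in for the live widget.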

Collaborator

Also relates to #361

Member

@radka-j radka-j left a comment

I only added some minor comments, mostly repeating what I already said in my notebook comments.

output_variables if output_variables is not None else []
)
self._output_names = [] # Will be populated after first simulation
self._has_sample_forward = False
Member

Is _has_sample_forward used anywhere?

"\n",
"model = em.get_model(best_model['model'])\n",
"pred_mean, pred_std = _predict_with_optional_std(model, X)\n",
"pred_mean, pred_std = best_model.predict(X, return_std=True)\n",
Member

Should this tutorial also use the HistoryMatcher class?

Collaborator Author

Yes, it should, but should I just remove it from that tutorial? Is this too advanced a feature for 01_start?

Member

@radka-j radka-j Apr 29, 2025

Good point, yes, let's just remove it from the 01_start tutorial.

else:
# Run actual simulation
outputs = self.simulator.sample_forward(params)
if outputs is None:
Member

Should a user warning be raised here?
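
A minimal sketch of what such a warning could look like, using only the standard-library warnings module. The helper run_or_warn and the AlwaysFails class are hypothetical and exist only to make the example runnable; the actual handling in history_matching.py may differ.

```python
import warnings


def run_or_warn(simulator, params):
    """Run one simulation; warn and return None if it yields no output."""
    outputs = simulator.sample_forward(params)
    if outputs is None:
        warnings.warn(
            f"Simulation returned no output for params={params}; skipping this sample.",
            UserWarning,
            stacklevel=2,
        )
    return outputs


class AlwaysFails:
    """Hypothetical simulator whose run never produces output."""

    def sample_forward(self, params):
        return None


# Capture the warning so the example can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = run_or_warn(AlwaysFails(), {"x": 1.0})
```

A warning keeps the history matching wave running while still telling the user that a parameter set was skipped.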

Member

@radka-j radka-j Apr 28, 2025

I really like this workflow graphic!

Having looked at this visualisation (which is very basic), I wonder whether it's worth distinguishing the parts of the pipeline that are on the user (e.g., creating samples or providing data) from what the package does for them (e.g., selecting models). This could be done with different shading, for example.

Collaborator

Typo, line 10 - "AutoEmulate" instead of "Autoemulate"

Collaborator

Typo, line 12 - "Sensitivity Analysis"

Collaborator

Perhaps it would be worth having the imports for the notebook all in a single cell after the ModularCirc pip install?

Collaborator

Discussed with @marjanfamili and @radka-j - in this case it would be better to have each section self-contained to make usage clear.

@@ -666,7 +666,8 @@
}
Collaborator

@ContiPaolo ContiPaolo Apr 28, 2025

Line #2.    em.evaluate(gp)

I liked having best_model (instead of manually specifying "GaussianProcess"), as it showcases that AE automatically selects the best model.


Collaborator

@cisprague cisprague left a comment

Nice job overall!

For after the MVP:
One potential area for improvement is performance: there may be opportunities to replace some for loops with vectorized operations using NumPy or PyTorch, which could speed things up a bit.

It looks like the current use of a dictionary helps keep axis names associated with the data, which is great for clarity. But, we might consider whether using a single tensor for the data, alongside a separate list of axis name strings, could make things more efficient, especially if this allows for more straightforward vectorized operations.

rank: Rank for history matching
"""
self.simulator = simulator
self.observations = observations
Collaborator

What would this look like? {'var0': (0.0, 0.0), ..., 'varn': (0.0, 0.0)}? Maybe it would be more efficient to have the keys as a separate list of strings, corresponding to the axes of a tensor? If we have the observations as a single tensor, then vectorized operations might give us a speed-up.
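
To make the suggestion concrete, here is a small NumPy sketch of the "list of names plus one array" layout. It assumes each dict value is a (mean, variance) pair, uses made-up emulator predictions, and applies the standard implausibility measure with a cutoff of 3 on every output; none of these details are taken from the PR's code, which also has a rank parameter not modelled here.

```python
import numpy as np

# Hypothetical observations in the dict layout asked about above: name -> (mean, variance).
observations = {"var0": (1.0, 0.04), "var1": (5.0, 0.25)}

# Proposed layout: axis names in a list, values in arrays with matching column order.
names = list(observations)
obs = np.array([observations[n] for n in names])  # shape (n_outputs, 2)
obs_mean, obs_var = obs[:, 0], obs[:, 1]

# Made-up emulator predictions for 3 parameter sets, shape (n_samples, n_outputs).
pred_mean = np.array([[1.1, 5.2], [2.0, 5.0], [1.0, 9.0]])
pred_var = np.full_like(pred_mean, 0.01)

# Implausibility for every sample and output in one broadcast expression.
implausibility = np.abs(obs_mean - pred_mean) / np.sqrt(obs_var + pred_var)

# A sample stays "not ruled out yet" (NROY) if all outputs are below the threshold.
threshold = 3.0
nroy_mask = (implausibility <= threshold).all(axis=1)
```

Here only the first parameter set survives; the other two each have one output whose implausibility exceeds the cutoff. The names list is only needed again when reporting results per output.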

Member

@radka-j radka-j left a comment

Great stuff!!

We agreed on 2 minor notebook changes:

  • add dashboard image to notebook
  • remove history matching from quick start tutorial

…or all plots, modified 01_start and removed history matching (it's an advanced feature and not needed in the start tutorial), updated history_matching itself to handle None outputs and failed attempts, as they were interrupting the history matching waves. Added more information to the tqdm print to show how many NROY points and failed attempts there are
@marjanfamili marjanfamili merged commit 7d1fbc2 into main Apr 29, 2025
4 checks passed
6 participants