History matching refine model #423

marjanfamili · 2025-04-24T11:27:18Z

📝 Description

This pull request introduces

workflow for user-provided simulator
NaghaviSimulator
**changes to base simulator**
HistoryMatcher
HistoryMatchingDashboard

👩‍💻 Usage

Follow the demo notebook 07_AE_workflow.ipynb.

🆘 Help Needed

Need to implement a calibration method to complete the pipeline.
Need a test that specifically exercises the simulator written by the user. It should not live in the tests folder, as it is not an AutoEmulate test; maybe a test in the simulations folder instead.
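
One possible shape for such a test is a small contract check that any user-written simulator can be passed through. This is only a sketch: it assumes `sample_inputs(n)` returns a sequence of parameter sets and `sample_forward(params)` returns a sequence of numeric outputs, or `None` when a run fails. `check_user_simulator` and `ToySimulator` are hypothetical names, not part of this PR.

```python
import math


def check_user_simulator(simulator, n_samples=5):
    """Run a basic contract check on a user-written simulator.

    Assumes (hypothetically) that `sample_inputs(n)` returns a sequence of
    parameter sets and that `sample_forward(params)` returns a sequence of
    numeric outputs, or None when a run fails.
    """
    samples = simulator.sample_inputs(n_samples)
    assert len(samples) == n_samples, "sample_inputs returned the wrong number of samples"

    n_outputs = None
    for params in samples:
        outputs = simulator.sample_forward(params)
        if outputs is None:
            continue  # a failed run is allowed, but is not checked further
        assert all(math.isfinite(v) for v in outputs), "non-finite simulator output"
        if n_outputs is None:
            n_outputs = len(outputs)
        assert len(outputs) == n_outputs, "output length changed between runs"
    assert n_outputs is not None, "every simulation run failed"
    return n_outputs


class ToySimulator:
    """Stand-in for a user simulator, used only to exercise the check."""

    def sample_inputs(self, n):
        return [{"a": float(i), "b": 2.0} for i in range(n)]

    def sample_forward(self, params):
        return [params["a"] + params["b"], params["a"] * params["b"]]
```

A file in the simulations folder could then call `check_user_simulator(...)` on the real simulator without touching the package's own test suite.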

🔗 Related Issue

#294 #412

🛠 Type of Change

✨ New feature
📖 Documentation update
♻️ Refactor/Code cleanup
🧪 Test updates
🎓 Tutorial added (new guides, walkthroughs, or examples)

✅ Checklist

My code follows the project's coding style 🎨
I have tested my changes locally 🖥️
I have added/updated unit tests (if applicable) 🧪
I have updated the documentation (README, comments, etc.) if needed 📚
My changes generate no new warnings or errors ⚠️❌

🖼️ Screenshots (if applicable)

💬 Additional Notes

Reviewers:
👀 Pay special attention to:

the change to the simulation base class
updates to the history matching pipeline
the history matching test has been re-written

review-notebook-app · 2025-04-24T11:27:23Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

github-actions · 2025-04-24T14:18:44Z

Coverage report


File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
autoemulate
history_matching.py					43, 76, 110-127, 164, 209-223, 233-234, 247-251, 265, 279-299, 311, 319, 350-419
history_matching_dashboard.py					1-1241
autoemulate/emulators
gaussian_process.py
autoemulate/simulations
base.py					60, 86-97, 111-141, 156-172, 176, 199-205
circ_utils.py					4-233
naghavi_cardiac_ModularCirc.py					1-172
tests
test_base_simulator.py
test_history_matching.py					44-58, 63-75
Project Total

This report was generated by python-coverage-comment-action

codecov-commenter · 2025-04-24T14:19:45Z

Codecov Report

Attention: Patch coverage is 20.09901% with 807 lines in your changes missing coverage. Please review.

Project coverage is 79.89%. Comparing base (23e751e) to head (2d732e1).
Report is 362 commits behind head on main.

Files with missing lines	Patch %	Lines
autoemulate/history_matching_dashboard.py	0.00%	522 Missing ⚠️
autoemulate/simulations/circ_utils.py	0.00%	94 Missing ⚠️
...emulate/simulations/naghavi_cardiac_ModularCirc.py	0.00%	80 Missing ⚠️
autoemulate/history_matching.py	58.44%	64 Missing ⚠️
autoemulate/simulations/base.py	41.26%	37 Missing ⚠️
tests/test_history_matching.py	86.66%	10 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #423       +/-   ##
===========================================
- Coverage   90.17%   79.89%   -10.28%     
===========================================
  Files          97      100        +3     
  Lines        5994     6915      +921     
===========================================
+ Hits         5405     5525      +120     
- Misses        589     1390      +801



radka-j · 2025-04-25T09:04:15Z

docs/tutorials/07_AE_workflow.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 3 - Wrapping your Simulator in AutoEmulate Simulator Base Class\n",


To make this as straightforward for the user as possible, we could move the sample_inputs() method implementation from Naghavi to the Base Simulator. A user could always choose to override it but in most instances we want to do LatinHypercube sampling so it seems like a good default for all simulators to inherit.

If we do that, then we could simplify this section and simply say that the user has to subclass the Simulator class and implement the sample_forward method (I wouldn't even worry about mentioning __init__ here). I think that makes it really accessible.

This sounds very reasonable to me. I will wait for more comments on this; if there are none, I will apply these changes.

Agreed - this sounds great to have as a default to simplify subclassing and would not be unexpected.
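
As a rough illustration of the proposal in this thread, the sketch below puts a default Latin hypercube `sample_inputs` on a base class so that a subclass only has to write `sample_forward`. It is an assumption-laden stand-in, not the PR's code: the `param_bounds` constructor argument, the `seed` parameter, the dict-per-sample return layout and the `Projectile` example are all hypothetical, and the sampler is written in pure Python rather than with whatever library the real base class uses.

```python
import math
import random
from abc import ABC, abstractmethod


class Simulator(ABC):
    """Hypothetical base class: subclasses only implement sample_forward."""

    def __init__(self, param_bounds):
        # param_bounds maps parameter name -> (lower, upper); assumed layout
        self.param_bounds = dict(param_bounds)

    def sample_inputs(self, n_samples, seed=None):
        """Default Latin hypercube sampling over param_bounds."""
        rng = random.Random(seed)
        columns = {}
        for name, (low, high) in self.param_bounds.items():
            # One uniform draw inside each of n equal-width strata ...
            unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
            # ... shuffled so strata are paired at random across parameters.
            rng.shuffle(unit)
            columns[name] = [low + u * (high - low) for u in unit]
        return [
            {name: columns[name][i] for name in self.param_bounds}
            for i in range(n_samples)
        ]

    @abstractmethod
    def sample_forward(self, params):
        """Run one simulation for a single parameter set."""


class Projectile(Simulator):
    """Toy user simulator: the only method written by the user."""

    def sample_forward(self, params):
        # Range of a projectile on flat ground, ignoring drag.
        return params["v0"] ** 2 * math.sin(2 * params["angle"]) / 9.81
```

A user who needs a different design can still override `sample_inputs` in the subclass.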


radka-j · 2025-04-25T09:25:56Z

docs/tutorials/07_AE_workflow.ipynb

+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 9 - use the interactive dashboard to inspect the results of history matching "


The interactive dashboard doesn't display - is this something that we can fix?

If not, should we maybe comment out the code cell (unless there's another way not to execute the code block) and just add a sentence above it that tells the user something like: if you are running this notebook interactively, you can view an interactive dashboard by uncommenting and running the code below. And/or could we maybe add a screenshot of it?

Is there an error? It does display for me locally with no issues.

I like the idea of a screenshot for the static version and uncommenting (or a global constant at the top of the notebook?) to enable the dashboard when running locally. I also tried running locally and the dashboard displays for me.

Also relates to #361
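
The global-constant option mentioned in this thread could look roughly like the sketch below. `SHOW_DASHBOARD`, `maybe_show_dashboard` and the zero-argument `launch` callable are all hypothetical; the actual dashboard construction call is deliberately left out because its signature is not shown in this conversation.

```python
SHOW_DASHBOARD = False  # set to True when running the notebook interactively


def maybe_show_dashboard(launch, enabled=SHOW_DASHBOARD):
    """Call `launch` only when the dashboard is enabled.

    `launch` is a hypothetical zero-argument callable that builds and
    displays the dashboard; the real construction call is not shown here.
    """
    if not enabled:
        print("Dashboard disabled; set SHOW_DASHBOARD = True to view it.")
        return None
    return launch()
```

With this pattern the static docs build never executes the dashboard cell, while a local user flips one constant at the top of the notebook.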

Co-authored-by: Radka Jersakova

radka-j

I only added some minor comments, mostly repeating what I already said in my notebook comments.

radka-j · 2025-04-28T11:52:17Z

autoemulate/simulations/base.py

+            output_variables if output_variables is not None else []
+        )
+        self._output_names = []  # Will be populated after first simulation
+        self._has_sample_forward = False


Is _has_sample_forward used anywhere?


radka-j · 2025-04-28T12:00:38Z

docs/tutorials/01_start.ipynb

    "\n",
-    "model = em.get_model(best_model['model'])\n",
-    "pred_mean, pred_std = _predict_with_optional_std(model, X)\n",
+    "pred_mean, pred_std = best_model.predict(X, return_std=True)\n",


Should this tutorial also use the HistoryMatcher class?

Yes, it should, but should I just remove it from that tutorial? Is this too advanced a feature for 01_start?

Good point, yes, let's just remove it from the 01_start tutorial.

radka-j · 2025-04-28T13:53:01Z

autoemulate/history_matching.py

+            else:
+                # Run actual simulation
+                outputs = self.simulator.sample_forward(params)
+                if outputs is None:


Should a user warning be raised here?
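
For reference, one way to surface a failed run without stopping the wave is `warnings.warn` with `UserWarning`. The sketch below is not the PR's implementation: `run_wave` and its return layout are hypothetical, and `sample_forward` stands in for `self.simulator.sample_forward`.

```python
import warnings


def run_wave(sample_forward, parameter_sets):
    """Run a simulator over parameter sets, warning on failed runs.

    `sample_forward` stands in for `self.simulator.sample_forward`; the
    function name `run_wave` and the return layout are assumptions.
    """
    successful, failed = [], []
    for params in parameter_sets:
        outputs = sample_forward(params)
        if outputs is None:
            # Warn rather than raise so one bad run does not stop the wave.
            warnings.warn(
                f"Simulation returned None for parameters {params}; skipping.",
                UserWarning,
                stacklevel=2,
            )
            failed.append(params)
            continue
        successful.append((params, outputs))
    return successful, failed
```

Returning the failed parameter sets alongside the successful ones also makes it easy to report a failure count in a progress bar.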

…data/naghavi_model_parameters

radka-j · 2025-04-28T15:18:23Z

misc/workflow.png

I really like this workflow graphic!

Having looked at this visualisation (which is very basic), I wonder whether it's worth distinguishing which bits of the pipeline are on the user (e.g., creating samples or providing data) from what the package does for them (e.g., selecting models). This could be done by using different shading, for example.

sgreenbury · 2025-04-28T15:27:56Z

docs/tutorials/07_AE_workflow.ipynb

Typo, line 10 - "AutoEmulate" instead of "Autoemulate"

Typo, line 12 - "Sensitivity Analysis"

Perhaps it would be worth having the imports for the notebook all in a single cell after the ModularCirc pip install?

Discussed with @marjanfamili and @radka-j - in this case it would be better to have each section self-contained to make usage clear.

ContiPaolo · 2025-04-28T15:46:33Z

docs/tutorials/03_emulation_sensitivity.ipynb

@@ -666,7 +666,8 @@
    }


Line #2. em.evaluate(gp)
I liked having best_model (instead of manually specifying "GaussianProcess"), as it showcases that AE automatically selects the best model.


cisprague

Nice job overall!

For after the MVP:
One potential area for improvement is performance: there may be opportunities to replace some for loops with vectorized operations using NumPy or PyTorch, which could speed things up a bit.

It looks like the current use of a dictionary helps keep axis names associated with the data, which is great for clarity. But, we might consider whether using a single tensor for the data, alongside a separate list of axis name strings, could make things more efficient, especially if this allows for more straightforward vectorized operations.

cisprague · 2025-04-29T08:41:05Z

autoemulate/history_matching.py

+            rank: Rank for history matching
+        """
+        self.simulator = simulator
+        self.observations = observations


What would this look like? {'var0': (0.0, 0.0), ..., 'varn': (0.0, 0.0)}? Maybe it would be more efficient to have the keys as a separate list of strings, corresponding to the axes of a tensor? If we have the observations as a single tensor, then vectorized operations might give us a speed-up.
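
To make the suggestion concrete, here is a small NumPy sketch of the names-plus-array layout. It assumes each tuple in the dictionary is (mean, variance), which this thread does not confirm, and it uses the usual simplified implausibility measure from history matching (absolute difference over the square root of summed variances, with no model discrepancy term). The variable names and numbers are illustrative only.

```python
import numpy as np

# Dictionary layout as asked about in the review; the (mean, variance)
# meaning of each tuple is an assumption.
observations = {"pressure": (1.0, 0.04), "volume": (5.0, 0.25)}

# Alternative layout: one list of names plus one array per statistic.
names = list(observations)
obs = np.array([observations[name] for name in names])  # shape (n_outputs, 2)
obs_mean, obs_var = obs[:, 0], obs[:, 1]


def implausibility(pred_mean, pred_var, obs_mean, obs_var):
    """Implausibility |z - E[f(x)]| / sqrt(total variance), per sample and output.

    pred_mean and pred_var have shape (n_samples, n_outputs); the
    observation arrays broadcast across the sample axis.
    """
    return np.abs(obs_mean - pred_mean) / np.sqrt(obs_var + pred_var)


# Two emulator predictions for the two outputs (made-up numbers).
pred_mean = np.array([[1.0, 5.0], [2.0, 5.0]])
pred_var = np.array([[0.0, 0.0], [0.05, 0.0]])
scores = implausibility(pred_mean, pred_var, obs_mean, obs_var)
```

Column `j` of `scores` corresponds to `names[j]`, so the axis labels stay available without a per-variable Python loop.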

…:)))

… more features added, also removed show NROY button for the plots that is not relevant for

radka-j

Great stuff!!

We agreed on 2 minor notebook changes:

add dashboard image to notebook
remove history matching from quick start tutorial

…or all plots, modified 01_start and removed history matching, as it's an advanced feature and not needed in the start tutorial; updated history_matching itself to handle None outputs and failed attempts, as these were interrupting the history matching waves; added more information to the tqdm output to show how many NROY points and failed attempts there are

marjanfamili added 15 commits March 31, 2025 11:52

some ideas

085b1b9

working on simulator

233f8bc

function working but needs work

2c6b693

moving things around to work towards having tutorial

8536248

moved simulator to new file and had to copy a file from modularcirc, …

b86fb13

…I will return to this

Merge branch 'Preprocessing' into history_matching_refine_model

6b9ad6a

working historymatching and plotting

f69c60f

added plotting functions to history_matching_visualisation.py

c75dc03

dashboard , vis and precommit

03e35d0

dashboard , vis and precommit

0233333

base class temp

28393db

make it compatible with new simulator

7221f87

added workflow fig , pushed some of the methods from the simulator to…

bb9e0a2

… the base class of the simulator

Merge branch 'main' into history_matching_refine_model

6f00120

changed simulator to match base simulator, changed name of tutorial file

740888f

marjanfamili added 3 commits April 24, 2025 14:18

updated history matching test to use mock simulator from base

f587a31

update the mock simulator in the simulator base test

d87ba45

fixed broken test for base simulator

7357aa6

marjanfamili added 2 commits April 24, 2025 15:24

adding to the documentation of the demo

3515146

more documentation added

edb229a

marjanfamili requested review from cisprague, radka-j, sgreenbury and ContiPaolo April 24, 2025 14:58

removed evaluate after refit, created issue to raise this

3d7b9fd

marjanfamili requested a review from edwardchalstrey1 April 24, 2025 15:21