Internalize ert_templates into storage #10617

Merged: 1 commit merged on Apr 23, 2025

Conversation

Contributor

@xjules xjules commented Apr 11, 2025

Issue
This is necessary pre-work related to #10180.

All templates will be part of storage, so that a restart, for instance, relies
only on templates from storage. Additionally, this removes the templates
parameter from create_run_path; templates must instead be passed directly when
calling create_experiment.
Templates are stored in the experiment.mount_point / templates folder.
Example: [["templates/seed_template_0.txt", "seed.txt"]]

This requires a storage migration for run_templates.
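
As a rough illustration of the layout described above, the sketch below copies each template into a `templates` folder under a mount point and records index-suffixed relative paths. The helper name `store_templates` and the directory names in the usage part are hypothetical and not taken from this PR; only the `templates/<stem>_<index><suffix>` naming follows the example above.

```python
import shutil
import tempfile
from pathlib import Path


def store_templates(
    mount_point: Path, templates: list[tuple[str, str]]
) -> list[list[str]]:
    """Copy templates into <mount_point>/templates (hypothetical helper).

    Returns [relative_stored_path, target_file] pairs.
    """
    templates_dir = mount_point / "templates"
    templates_dir.mkdir(parents=True, exist_ok=True)
    stored = []
    for index, (src, dst) in enumerate(templates):
        src_path = Path(src)
        # Append the list index to the stem so stored names stay unique.
        name = f"{src_path.stem}_{index}{src_path.suffix}"
        shutil.copyfile(src_path, templates_dir / name)
        stored.append([f"templates/{name}", dst])
    return stored


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "seed_template.txt"
    source.write_text("SEED:<IENS>\n")
    experiment_dir = root / "experiment"  # stands in for experiment.mount_point
    experiment_dir.mkdir()
    stored_templates = store_templates(experiment_dir, [(str(source), "seed.txt")])
    print(stored_templates)
```

Because the stored paths are relative to the mount point, a restart in this sketch would only need the storage directory, not the original template locations.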

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure unit tests pass locally after every commit (git rebase -i main --exec 'just rapid-tests')

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Add backport label to latest release (format: 'backport release-branch-name')

@xjules xjules self-assigned this Apr 11, 2025
@xjules xjules added this to SCOUT Apr 11, 2025
@xjules xjules moved this to In Progress in SCOUT Apr 11, 2025
codspeed-hq bot commented Apr 11, 2025

CodSpeed Performance Report

Merging #10617 will not alter performance

Comparing xjules:storage_template (83152de) with main (2e9a852)

Summary

✅ 25 untouched benchmarks

@xjules xjules marked this pull request as ready for review April 13, 2025 21:11
@xjules xjules changed the title from "Internalize ert_tepmlates into storage" to "Internalize ert_templates into storage" Apr 13, 2025
@xjules xjules added the release-notes:misc and improvement labels Apr 15, 2025
@@ -248,6 +265,19 @@ def parameter_info(self) -> dict[str, Any]:
            info = json.load(f)
        return info

    @cached_property
Contributor

Do we want this as cached_property? The others are just property

Contributor Author

Not all of them are; see parameter_configuration, for instance. We don't want to access the files every time.

Collaborator

We probably don't want this as a cached property now, as we are keeping the file content in memory?
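
To make the trade-off in this thread concrete, here is a minimal sketch contrasting `property` with `functools.cached_property` for a value read from a file. The class `ExperimentSketch` and its attribute names are hypothetical stand-ins, not the real experiment class.

```python
import tempfile
from functools import cached_property
from pathlib import Path


class ExperimentSketch:
    """Hypothetical stand-in; not the real experiment class."""

    def __init__(self, template_file: Path) -> None:
        self._template_file = template_file
        self.reads = 0  # counts how often the file is actually read

    def _read(self) -> str:
        self.reads += 1
        return self._template_file.read_text()

    @property
    def content_uncached(self) -> str:
        # Re-reads the file on every access; nothing is kept on the instance.
        return self._read()

    @cached_property
    def content_cached(self) -> str:
        # Reads once; the string is then stored in the instance __dict__.
        return self._read()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "seed_template.txt"
    path.write_text("SEED:<IENS>\n")
    exp = ExperimentSketch(path)
    exp.content_uncached
    exp.content_uncached
    reads_uncached = exp.reads
    exp.content_cached
    exp.content_cached
    reads_cached = exp.reads - reads_uncached
    cached_in_instance = "content_cached" in vars(exp)
```

In this sketch the cached variant avoids repeated file access, at the cost of holding the file content for the lifetime of the object, which is the memory concern raised above.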

Contributor

@jonathan-eq jonathan-eq left a comment

Some nitpicking 👍

@xjules xjules force-pushed the storage_template branch from d36987e to c5ffebf Compare April 15, 2025 12:00
@xjules xjules moved this from In Progress to Ready for Review in SCOUT Apr 15, 2025
@xjules xjules force-pushed the storage_template branch from c8b8c3b to c1da5e1 Compare April 15, 2025 21:05
Contributor Author

xjules commented Apr 16, 2025

The Semeio failure relates to something else, most likely the SmootherUpdate API change.

Collaborator

@oyvindeide oyvindeide left a comment

Looks good! Some minor comments, and one corner case.

from ..plugins.workflow_fixtures import (
    create_workflow_fixtures_from_hooked,
)
from ..plugins.workflow_fixtures import create_workflow_fixtures_from_hooked
Collaborator

I don't mind this change, but it should be in a different commit, as it is not related to the main change.

Contributor Author

@xjules xjules Apr 22, 2025

Will create a new commit for these imports...
I have removed the changes to make the PR more compact.

    EverestStorage,
    OptimalResult,
)
from everest.everest_storage import EverestStorage, OptimalResult
Collaborator

Same comment as above: fine to do, but it should be its own commit.

    EverestCacheHitEvent,
    EverestStatusEvent,
)
from .event import EverestBatchResultEvent, EverestCacheHitEvent, EverestStatusEvent
Collaborator

Same as above

Comment on lines 294 to 298
result_type=(
    "FunctionResult"
    if isinstance(r, FunctionResults)
    else "GradientResult"
),
Collaborator

Same comment

Comment on lines 690 to 694
perturbations=(
    evaluator_context.perturbations.tolist()
    if evaluator_context.perturbations is not None
    else [-1] * len(model_realizations)
),
Collaborator

Same comment

    ESSettings,
    UpdateSettings,
)
from ert.config import ConfigValidationError, ErtConfig, ESSettings, UpdateSettings
Collaborator

Same comment

for src, dst in templates:
    incoming_template_file_path = Path(src)
    template_file_path = Path(
        templates_path / incoming_template_file_path.name
Collaborator

This will fail when different templates share the same file name, for example in snake oil:

RUN_TEMPLATE templates/seed_template.txt seed.txt
RUN_TEMPLATE seed_template.txt seed_2.txt

This is valid input, but the second template will overwrite the first.

A bit of a corner case, but something to consider? We could hash the input file name, for example.
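
The corner case can be reproduced in isolation. The sketch below shows how keeping only the file name collapses the two snake oil entries into one destination, and how prefixing a digest of the full source path keeps them apart. Both helper functions are hypothetical, and the hashing variant is only one way of reading the suggestion above, not what the PR implements.

```python
import hashlib
from pathlib import Path

templates = [
    ("templates/seed_template.txt", "seed.txt"),
    ("seed_template.txt", "seed_2.txt"),
]


def basename_only(src: str) -> str:
    # Keeps only the file name, so sources in different folders collide.
    return Path(src).name


def hashed_name(src: str) -> str:
    # Prefix a short digest of the full source path (illustrative choice).
    digest = hashlib.sha256(src.encode("utf-8")).hexdigest()[:8]
    return f"{digest}_{Path(src).name}"


colliding = {basename_only(src) for src, _ in templates}
distinct = {hashed_name(src) for src, _ in templates}
print(len(colliding), len(distinct))
```

With base names only, both entries map to `seed_template.txt`, so the second copy would replace the first.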

Contributor Author

Should be fixed now. Each template is stored under a relative path, with its index in the list appended to the file name:

 [["templates/seed_template_0.txt", "seed.txt"]]

@xjules xjules force-pushed the storage_template branch 2 times, most recently from 6db164f to b714e1a Compare April 23, 2025 00:17
@xjules xjules requested a review from oyvindeide April 23, 2025 10:44
@xjules xjules force-pushed the storage_template branch 2 times, most recently from c0102f2 to 1723f64 Compare April 23, 2025 12:58
Collaborator

@oyvindeide oyvindeide left a comment

LGTM! Some memory questions though

for (
    source_file_content,
    target_file,
) in ensemble.experiment.templates_configuration:
Collaborator

We are loading all the template files at the same time now, but that should be OK memory-wise, I guess?

Contributor Author

It will be loaded on the fly now.
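
One way to read "loaded on the fly" is a generator that reads each template only when the loop reaches it. The sketch below assumes that reading; `iter_templates` and its arguments are hypothetical names, and the actual implementation in the PR may differ.

```python
import tempfile
from collections.abc import Iterator
from pathlib import Path


def iter_templates(
    mount_point: Path, entries: list[tuple[str, str]]
) -> Iterator[tuple[str, str]]:
    """Yield (content, target_file) one template at a time (hypothetical)."""
    for relative_path, target_file in entries:
        # Only the template currently being processed is read into memory.
        yield (mount_point / relative_path).read_text(), target_file


with tempfile.TemporaryDirectory() as tmp:
    mount_point = Path(tmp)
    (mount_point / "templates").mkdir()
    (mount_point / "templates" / "seed_template_0.txt").write_text("SEED:1\n")
    entries = [("templates/seed_template_0.txt", "seed.txt")]
    rendered = list(iter_templates(mount_point, entries))
```

A consumer that iterates over the generator directly, instead of calling `list` on it as done here for checking, holds one template's content at a time.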

@github-project-automation github-project-automation bot moved this from Ready for Review to Reviewed in SCOUT Apr 23, 2025
@xjules xjules force-pushed the storage_template branch from 1723f64 to 83152de Compare April 23, 2025 14:14
@xjules xjules merged commit 469cae2 into equinor:main Apr 23, 2025
27 checks passed
@github-project-automation github-project-automation bot moved this from Reviewed to Done in SCOUT Apr 23, 2025
Labels
improvement: Something nice to have, that will make life easier for developers or users or both.
release-notes:misc: Automatically categorise as miscellaneous change in release notes
Projects
Status: Done
3 participants