(4) Feature/SOF-7846 update: Adjust Band Gap NB#276
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
📝 WalkthroughWalkthroughThe Band Gap workflow notebook undergoes comprehensive restructuring, transitioning from a process-driven approach with hardcoded constants to a usage-focused guide with parameterized configuration. The implementation migrates material, workflow, and compute handling from local/direct methods to API-based alternatives, while modernizing configuration providers for k-grid, cutoffs, and model management. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Ruff (0.15.5)other/materials_designer/workflows/band_gap.ipynbUnexpected end of JSON input Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| @@ -5,27 +5,36 @@ | |||
| "id": "0", | |||
There was a problem hiding this comment.
Line #9. # functional="pbe" # e.g., "pz" (LDA), "pbe" (GGA)
We can add these as parameters above
Reply via ReviewNB
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
other/materials_designer/workflows/band_gap.ipynb (2)
453-468:⚠️ Potential issue | 🟠 MajorThis step does not save the configured workflow.
get_or_create_workflow()inutils/api.py:155-174persistsworkflow.to_dict_without_special_keys(), so the collection item saved here drops the model and unit-level context set in Sections 4.4-4.5. The job later uses the in-memoryworkflow, which means the printed workflow ID is not the fully configured workflow the notebook just previewed.💡 Suggested fix
saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID) saved_workflow = Workflow.create(saved_workflow_response) -print(f"Workflow ID: {saved_workflow.id}") +print( + "Workflow template saved to collection. " + "The job below still uses the fully configured in-memory workflow." +) +print(f"Workflow template ID: {saved_workflow.id}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/band_gap.ipynb` around lines 453 - 468, The saved collection item currently drops model/unit-level context because get_or_create_workflow (utils/api.py) persists workflow.to_dict_without_special_keys; fix by ensuring the full configured workflow is persisted: either change the call here to pass a full-serialization (e.g., workflow.to_dict() or workflow.to_dict_with_special_keys if available) into get_or_create_workflow, or update get_or_create_workflow to accept an optional flag to persist special keys and use that flag when called from this notebook; reference saved_workflow_response, saved_workflow, get_or_create_workflow, and workflow.to_dict_without_special_keys when making the change.
487-518:⚠️ Potential issue | 🟠 MajorValidate cluster selection before building
Compute.This path has two hard failures:
clusters[0]on Line 511 blows up when the account has no clusters, andnext(..., None)accepts a typo or ambiguous partial match and defers the error until later. Fail fast with an explicit match check.💡 Suggested fix
clusters = client.clusters.list() print(f"Available clusters: {[c['hostname'] for c in clusters]}") + +if not clusters: + raise RuntimeError("No clusters are available for the selected account.") if CLUSTER_NAME: - cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None) + matches = [c for c in clusters if CLUSTER_NAME in c["hostname"]] + if not matches: + raise ValueError( + f"No cluster matched '{CLUSTER_NAME}'. Available: {[c['hostname'] for c in clusters]}" + ) + if len(matches) > 1: + raise ValueError( + f"CLUSTER_NAME '{CLUSTER_NAME}' is ambiguous. Matches: {[c['hostname'] for c in matches]}" + ) + cluster = matches[0] else: cluster = clusters[0]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/band_gap.ipynb` around lines 487 - 518, The code currently assumes clusters is non-empty and uses next(..., None) which can silently accept partial/typo matches; update the cluster selection before constructing Compute to validate: first check that clusters is not empty and raise/print a clear error if it is; if CLUSTER_NAME is set, find an exact hostname match (e.g., compare c["hostname"] == CLUSTER_NAME) and if none is found raise/print an explicit error listing available cluster hostnames; only then set cluster (or default to clusters[0] when CLUSTER_NAME is not provided and clusters exists) and proceed to construct Compute with the validated cluster variable.
🧹 Nitpick comments (1)
other/materials_designer/workflows/band_gap.ipynb (1)
399-404: Avoid pinning the band-gap branch to a fixed subworkflow index.Line 399 assumes
add_relaxation()always inserts exactly one subworkflow at the front. That makes the later SCF/NSCF edits template-order dependent; resolving the target subworkflow and relaxation unit by contents is more robust.💡 Suggested refactor
-bg_subworkflow = workflow.subworkflows[1 if ADD_RELAXATION else 0] +bg_subworkflow = next( + ( + swf + for swf in workflow.subworkflows + if swf.get_unit_by_name(name="pw_scf") or swf.get_unit_by_name(name="pw_scf_hse") + ), + None, +) +if bg_subworkflow is None: + raise RuntimeError("Could not locate the band-gap subworkflow.") if RELAXATION_KGRID is not None and ADD_RELAXATION: unit = workflow.subworkflows[0].get_unit_by_name(name_regex="relax") + if unit is None: + raise RuntimeError("Could not locate the relaxation unit.") unit.add_context(PointsGridDataProvider(dimensions=RELAXATION_KGRID, isEdited=True).yield_data()) workflow.subworkflows[0].set_unit(unit)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@other/materials_designer/workflows/band_gap.ipynb` around lines 399 - 404, The code currently assumes the relaxation subworkflow is at a fixed index (used when setting bg_subworkflow and when fetching workflow.subworkflows[0]), which is brittle; instead locate the relaxation subworkflow and unit by content: search workflow.subworkflows for the subworkflow whose units contain a unit matching get_unit_by_name(name_regex="relax") (or whose name/property indicates relaxation), assign that found subworkflow to a variable (instead of using index 0 or 1), then call unit.add_context(PointsGridDataProvider(dimensions=RELAXATION_KGRID, isEdited=True).yield_data()) and workflow.subworkflows.set_unit(...) on that located subworkflow/unit; keep the ADD_RELAXATION and RELAXATION_KGRID guards but replace index-based accesses to use this content-based lookup (referencing bg_subworkflow, workflow.subworkflows, get_unit_by_name, add_context, PointsGridDataProvider, ADD_RELAXATION, RELAXATION_KGRID).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@other/materials_designer/workflows/band_gap.ipynb`:
- Around line 228-230: The code assumes client.projects.list({"isDefault": True,
"owner._id": ACCOUNT_ID}) returns a non-empty list and directly uses projects[0]
(and assigns project_id), which can raise IndexError; update the logic around
the projects variable (the client.projects.list call and usage of projects[0] /
project_id) to first check whether projects is non-empty and, if empty, surface
a clear recovery message (e.g., instruct the user to create a default project or
select an existing project) or programmatically create/fallback to a project,
then proceed to set project_id and print; ensure all references to projects[0]
are guarded so the notebook fails gracefully with actionable instructions
instead of an IndexError.
---
Outside diff comments:
In `@other/materials_designer/workflows/band_gap.ipynb`:
- Around line 453-468: The saved collection item currently drops
model/unit-level context because get_or_create_workflow (utils/api.py) persists
workflow.to_dict_without_special_keys; fix by ensuring the full configured
workflow is persisted: either change the call here to pass a full-serialization
(e.g., workflow.to_dict() or workflow.to_dict_with_special_keys if available)
into get_or_create_workflow, or update get_or_create_workflow to accept an
optional flag to persist special keys and use that flag when called from this
notebook; reference saved_workflow_response, saved_workflow,
get_or_create_workflow, and workflow.to_dict_without_special_keys when making
the change.
- Around line 487-518: The code currently assumes clusters is non-empty and uses
next(..., None) which can silently accept partial/typo matches; update the
cluster selection before constructing Compute to validate: first check that
clusters is not empty and raise/print a clear error if it is; if CLUSTER_NAME is
set, find an exact hostname match (e.g., compare c["hostname"] == CLUSTER_NAME)
and if none is found raise/print an explicit error listing available cluster
hostnames; only then set cluster (or default to clusters[0] when CLUSTER_NAME is
not provided and clusters exists) and proceed to construct Compute with the
validated cluster variable.
---
Nitpick comments:
In `@other/materials_designer/workflows/band_gap.ipynb`:
- Around line 399-404: The code currently assumes the relaxation subworkflow is
at a fixed index (used when setting bg_subworkflow and when fetching
workflow.subworkflows[0]), which is brittle; instead locate the relaxation
subworkflow and unit by content: search workflow.subworkflows for the
subworkflow whose units contain a unit matching
get_unit_by_name(name_regex="relax") (or whose name/property indicates
relaxation), assign that found subworkflow to a variable (instead of using index
0 or 1), then call
unit.add_context(PointsGridDataProvider(dimensions=RELAXATION_KGRID,
isEdited=True).yield_data()) and workflow.subworkflows.set_unit(...) on that
located subworkflow/unit; keep the ADD_RELAXATION and RELAXATION_KGRID guards
but replace index-based accesses to use this content-based lookup (referencing
bg_subworkflow, workflow.subworkflows, get_unit_by_name, add_context,
PointsGridDataProvider, ADD_RELAXATION, RELAXATION_KGRID).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 205ea292-0283-4bd5-921b-1ac8cb8512fe
📒 Files selected for processing (1)
other/materials_designer/workflows/band_gap.ipynb
| "projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n", | ||
| "project_id = projects[0][\"_id\"]\n", | ||
| "print(f\"\\u2705 Using project: {projects[0]['name']} ({project_id})\")" |
There was a problem hiding this comment.
Guard the default-project lookup.
projects[0] on Line 229 assumes every selected account already has a default project. If that query comes back empty, the notebook dies with IndexError instead of telling the user how to recover.
💡 Suggested fix
projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
-project_id = projects[0]["_id"]
-print(f"✅ Using project: {projects[0]['name']} ({project_id})")
+if not projects:
+ raise RuntimeError(
+ f"No default project found for account '{selected_account.name}'. "
+ "Create one first or select a different account."
+ )
+
+project_id = projects[0]["_id"]
+print(f"✅ Using project: {projects[0]['name']} ({project_id})")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@other/materials_designer/workflows/band_gap.ipynb` around lines 228 - 230,
The code assumes client.projects.list({"isDefault": True, "owner._id":
ACCOUNT_ID}) returns a non-empty list and directly uses projects[0] (and assigns
project_id), which can raise IndexError; update the logic around the projects
variable (the client.projects.list call and usage of projects[0] / project_id)
to first check whether projects is non-empty and, if empty, surface a clear
recovery message (e.g., instruct the user to create a default project or select
an existing project) or programmatically create/fallback to a project, then
proceed to set project_id and print; ensure all references to projects[0] are
guarded so the notebook fails gracefully with actionable instructions instead of
an IndexError.
Summary by CodeRabbit
Documentation
Improvements