Skip to content

(4) Feature/SOF-7846 update: Adjust Band Gap NB#276

Merged
VsevolodX merged 5 commits intomainfrom
feature/SOF-7846
Mar 13, 2026
Merged

(4) Feature/SOF-7846 update: Adjust Band Gap NB#276
VsevolodX merged 5 commits intomainfrom
feature/SOF-7846

Conversation

@VsevolodX
Copy link
Member

@VsevolodX VsevolodX commented Mar 12, 2026

Summary by CodeRabbit

  • Documentation

    • Restructured band gap workflow notebook with improved step-by-step guide and usage instructions.
    • Enhanced configuration parameters for easier customization of calculations.
  • Improvements

    • Updated visualization and result retrieval for band gap data.
    • Added enhanced logging output for cluster and queue configuration details.
    • Migrated to standard Python 3 environment support.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@VsevolodX VsevolodX changed the title update: Add Band Gap NB update: Adjust Band Gap NB Mar 12, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 12, 2026

📝 Walkthrough

Walkthrough

The Band Gap workflow notebook undergoes comprehensive restructuring, transitioning from a process-driven approach with hardcoded constants to a usage-focused guide with parameterized configuration. The implementation migrates material, workflow, and compute handling from local/direct methods to API-based alternatives, while modernizing configuration providers for k-grid, cutoffs, and model management.

Changes

Cohort / File(s) Summary
Band Gap Workflow Notebook Restructuring
other/materials_designer/workflows/band_gap.ipynb
Comprehensive refactor introducing parameterized constants (CALCULATION_TYPE, APPLICATION_NAME, ADD_RELAXATION, WORKFLOW_SEARCH_TERM, etc.). Migrates material/workflow creation from local Standata to API-driven get_or_create_material/get_or_create_workflow. Updates compute configuration from direct constants to Compute objects with cluster/queue/ppn parameters. Replaces k-grid/cutoff handling with context providers (PlanewaveCutoffsContextProvider, PointsGridDataProvider) applied across relaxation, SCF, NSCF subworkflows. Reworks model configuration via ModelTreeStandata with pseudopotential method specification. Removes QueueName import; updates result retrieval with new property paths and visualization signatures. Refreshes notebook metadata and adds structured Usage/Summary section.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • timurbazhirov

Poem

🐰 Hops with delight through API pathways clear,
Parameters dance where constants once were here,
Band gaps now flow from cloud and context wise,
A workflow reborn beneath cleaner skies!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title refers to a specific issue (SOF-7846) and indicates adjustment to the Band Gap notebook, which aligns with the primary change (restructuring the band_gap.ipynb notebook).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/SOF-7846
📝 Coding Plan
  • Generate coding plan for human review comments

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Ruff (0.15.5)
other/materials_designer/workflows/band_gap.ipynb

Unexpected end of JSON input


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@VsevolodX VsevolodX changed the title update: Adjust Band Gap NB Feature/SOF-7846 update: Adjust Band Gap NB Mar 12, 2026
@VsevolodX VsevolodX changed the title Feature/SOF-7846 update: Adjust Band Gap NB (4) Feature/SOF-7846 update: Adjust Band Gap NB Mar 13, 2026
@@ -5,27 +5,36 @@
"id": "0",
Copy link
Member

@timurbazhirov timurbazhirov Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #9.    #     functional="pbe" # e.g., "pz" (LDA), "pbe" (GGA)

We can add these as parameters above


Reply via ReviewNB

Base automatically changed from feature/SOF-7847 to main March 13, 2026 21:26
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
other/materials_designer/workflows/band_gap.ipynb (2)

453-468: ⚠️ Potential issue | 🟠 Major

This step does not save the configured workflow.

get_or_create_workflow() in utils/api.py:155-174 persists workflow.to_dict_without_special_keys(), so the collection item saved here drops the model and unit-level context set in Sections 4.4-4.5. The job later uses the in-memory workflow, which means the printed workflow ID is not the fully configured workflow the notebook just previewed.

💡 Suggested fix
 saved_workflow_response = get_or_create_workflow(client, workflow, ACCOUNT_ID)
 saved_workflow = Workflow.create(saved_workflow_response)
-print(f"Workflow ID: {saved_workflow.id}")
+print(
+    "Workflow template saved to collection. "
+    "The job below still uses the fully configured in-memory workflow."
+)
+print(f"Workflow template ID: {saved_workflow.id}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/band_gap.ipynb` around lines 453 - 468,
The saved collection item currently drops model/unit-level context because
get_or_create_workflow (utils/api.py) persists
workflow.to_dict_without_special_keys; fix by ensuring the full configured
workflow is persisted: either change the call here to pass a full-serialization
(e.g., workflow.to_dict() or workflow.to_dict_with_special_keys if available)
into get_or_create_workflow, or update get_or_create_workflow to accept an
optional flag to persist special keys and use that flag when called from this
notebook; reference saved_workflow_response, saved_workflow,
get_or_create_workflow, and workflow.to_dict_without_special_keys when making
the change.

487-518: ⚠️ Potential issue | 🟠 Major

Validate cluster selection before building Compute.

This path has two hard failures: clusters[0] on Line 511 blows up when the account has no clusters, and next(..., None) accepts a typo or ambiguous partial match and defers the error until later. Fail fast with an explicit match check.

💡 Suggested fix
 clusters = client.clusters.list()
 print(f"Available clusters: {[c['hostname'] for c in clusters]}")
+
+if not clusters:
+    raise RuntimeError("No clusters are available for the selected account.")
 
 if CLUSTER_NAME:
-    cluster = next((c for c in clusters if CLUSTER_NAME in c["hostname"]), None)
+    matches = [c for c in clusters if CLUSTER_NAME in c["hostname"]]
+    if not matches:
+        raise ValueError(
+            f"No cluster matched '{CLUSTER_NAME}'. Available: {[c['hostname'] for c in clusters]}"
+        )
+    if len(matches) > 1:
+        raise ValueError(
+            f"CLUSTER_NAME '{CLUSTER_NAME}' is ambiguous. Matches: {[c['hostname'] for c in matches]}"
+        )
+    cluster = matches[0]
 else:
     cluster = clusters[0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/band_gap.ipynb` around lines 487 - 518,
The code currently assumes clusters is non-empty and uses next(..., None) which
can silently accept partial/typo matches; update the cluster selection before
constructing Compute to validate: first check that clusters is not empty and
raise/print a clear error if it is; if CLUSTER_NAME is set, find an exact
hostname match (e.g., compare c["hostname"] == CLUSTER_NAME) and if none is
found raise/print an explicit error listing available cluster hostnames; only
then set cluster (or default to clusters[0] when CLUSTER_NAME is not provided
and clusters exists) and proceed to construct Compute with the validated cluster
variable.
🧹 Nitpick comments (1)
other/materials_designer/workflows/band_gap.ipynb (1)

399-404: Avoid pinning the band-gap branch to a fixed subworkflow index.

Line 399 assumes add_relaxation() always inserts exactly one subworkflow at the front. That makes the later SCF/NSCF edits template-order dependent; resolving the target subworkflow and relaxation unit by contents is more robust.

💡 Suggested refactor
-bg_subworkflow = workflow.subworkflows[1 if ADD_RELAXATION else 0]
+bg_subworkflow = next(
+    (
+        swf
+        for swf in workflow.subworkflows
+        if swf.get_unit_by_name(name="pw_scf") or swf.get_unit_by_name(name="pw_scf_hse")
+    ),
+    None,
+)
+if bg_subworkflow is None:
+    raise RuntimeError("Could not locate the band-gap subworkflow.")
 
 if RELAXATION_KGRID is not None and ADD_RELAXATION:
     unit = workflow.subworkflows[0].get_unit_by_name(name_regex="relax")
+    if unit is None:
+        raise RuntimeError("Could not locate the relaxation unit.")
     unit.add_context(PointsGridDataProvider(dimensions=RELAXATION_KGRID, isEdited=True).yield_data())
     workflow.subworkflows[0].set_unit(unit)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/band_gap.ipynb` around lines 399 - 404,
The code currently assumes the relaxation subworkflow is at a fixed index (used
when setting bg_subworkflow and when fetching workflow.subworkflows[0]), which
is brittle; instead locate the relaxation subworkflow and unit by content:
search workflow.subworkflows for the subworkflow whose units contain a unit
matching get_unit_by_name(name_regex="relax") (or whose name/property indicates
relaxation), assign that found subworkflow to a variable (instead of using index
0 or 1), then call
unit.add_context(PointsGridDataProvider(dimensions=RELAXATION_KGRID,
isEdited=True).yield_data()) and workflow.subworkflows.set_unit(...) on that
located subworkflow/unit; keep the ADD_RELAXATION and RELAXATION_KGRID guards
but replace index-based accesses to use this content-based lookup (referencing
bg_subworkflow, workflow.subworkflows, get_unit_by_name, add_context,
PointsGridDataProvider, ADD_RELAXATION, RELAXATION_KGRID).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@other/materials_designer/workflows/band_gap.ipynb`:
- Around line 228-230: The code assumes client.projects.list({"isDefault": True,
"owner._id": ACCOUNT_ID}) returns a non-empty list and directly uses projects[0]
(and assigns project_id), which can raise IndexError; update the logic around
the projects variable (the client.projects.list call and usage of projects[0] /
project_id) to first check whether projects is non-empty and, if empty, surface
a clear recovery message (e.g., instruct the user to create a default project or
select an existing project) or programmatically create/fallback to a project,
then proceed to set project_id and print; ensure all references to projects[0]
are guarded so the notebook fails gracefully with actionable instructions
instead of an IndexError.

---

Outside diff comments:
In `@other/materials_designer/workflows/band_gap.ipynb`:
- Around line 453-468: The saved collection item currently drops
model/unit-level context because get_or_create_workflow (utils/api.py) persists
workflow.to_dict_without_special_keys; fix by ensuring the full configured
workflow is persisted: either change the call here to pass a full-serialization
(e.g., workflow.to_dict() or workflow.to_dict_with_special_keys if available)
into get_or_create_workflow, or update get_or_create_workflow to accept an
optional flag to persist special keys and use that flag when called from this
notebook; reference saved_workflow_response, saved_workflow,
get_or_create_workflow, and workflow.to_dict_without_special_keys when making
the change.
- Around line 487-518: The code currently assumes clusters is non-empty and uses
next(..., None) which can silently accept partial/typo matches; update the
cluster selection before constructing Compute to validate: first check that
clusters is not empty and raise/print a clear error if it is; if CLUSTER_NAME is
set, find an exact hostname match (e.g., compare c["hostname"] == CLUSTER_NAME)
and if none is found raise/print an explicit error listing available cluster
hostnames; only then set cluster (or default to clusters[0] when CLUSTER_NAME is
not provided and clusters exists) and proceed to construct Compute with the
validated cluster variable.

---

Nitpick comments:
In `@other/materials_designer/workflows/band_gap.ipynb`:
- Around line 399-404: The code currently assumes the relaxation subworkflow is
at a fixed index (used when setting bg_subworkflow and when fetching
workflow.subworkflows[0]), which is brittle; instead locate the relaxation
subworkflow and unit by content: search workflow.subworkflows for the
subworkflow whose units contain a unit matching
get_unit_by_name(name_regex="relax") (or whose name/property indicates
relaxation), assign that found subworkflow to a variable (instead of using index
0 or 1), then call
unit.add_context(PointsGridDataProvider(dimensions=RELAXATION_KGRID,
isEdited=True).yield_data()) and workflow.subworkflows.set_unit(...) on that
located subworkflow/unit; keep the ADD_RELAXATION and RELAXATION_KGRID guards
but replace index-based accesses to use this content-based lookup (referencing
bg_subworkflow, workflow.subworkflows, get_unit_by_name, add_context,
PointsGridDataProvider, ADD_RELAXATION, RELAXATION_KGRID).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 205ea292-0283-4bd5-921b-1ac8cb8512fe

📥 Commits

Reviewing files that changed from the base of the PR and between da107a6 and c9edd50.

📒 Files selected for processing (1)
  • other/materials_designer/workflows/band_gap.ipynb

Comment on lines +228 to +230
"projects = client.projects.list({\"isDefault\": True, \"owner._id\": ACCOUNT_ID})\n",
"project_id = projects[0][\"_id\"]\n",
"print(f\"\\u2705 Using project: {projects[0]['name']} ({project_id})\")"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard the default-project lookup.

projects[0] on Line 229 assumes every selected account already has a default project. If that query comes back empty, the notebook dies with IndexError instead of telling the user how to recover.

💡 Suggested fix
 projects = client.projects.list({"isDefault": True, "owner._id": ACCOUNT_ID})
-project_id = projects[0]["_id"]
-print(f"✅ Using project: {projects[0]['name']} ({project_id})")
+if not projects:
+    raise RuntimeError(
+        f"No default project found for account '{selected_account.name}'. "
+        "Create one first or select a different account."
+    )
+
+project_id = projects[0]["_id"]
+print(f"✅ Using project: {projects[0]['name']} ({project_id})")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@other/materials_designer/workflows/band_gap.ipynb` around lines 228 - 230,
The code assumes client.projects.list({"isDefault": True, "owner._id":
ACCOUNT_ID}) returns a non-empty list and directly uses projects[0] (and assigns
project_id), which can raise IndexError; update the logic around the projects
variable (the client.projects.list call and usage of projects[0] / project_id)
to first check whether projects is non-empty and, if empty, surface a clear
recovery message (e.g., instruct the user to create a default project or select
an existing project) or programmatically create/fallback to a project, then
proceed to set project_id and print; ensure all references to projects[0] are
guarded so the notebook fails gracefully with actionable instructions instead of
an IndexError.

@VsevolodX VsevolodX merged commit ba36044 into main Mar 13, 2026
5 checks passed
@VsevolodX VsevolodX deleted the feature/SOF-7846 branch March 13, 2026 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants