@joehart2001
Collaborator

Pre-review checklist for PR author

PR author must check the checkboxes below when creating the PR.

Summary

Phonon analysis and app, without the calculation for the moment.

Linked issue

Resolves #222

Progress

  • Calculations
  • Analysis
  • Application
  • Documentation

Testing

mattersim and mace-omat-0

New decorators/callbacks

decorators:

  • cell_to_scatter: table cell -> scatter plot

callbacks:

  • For model-specific assets, e.g. phonon dispersions. With weas, structures are model-independent, so we need these new callbacks to propagate model-specific info.
    - scatter_and_assets_from_table
    - model_asset_from_scatter

I think there's potential to combine these with other callbacks, so let's discuss? E.g. make table -> scatter and scatter -> structure more general (e.g. scatter -> any asset). But maybe we want to keep them separate so it's simpler for users.

@joehart2001 added the "new benchmark" label (Proposals and suggestions for new benchmarks) on Dec 17, 2025
cumulative_dist, i = 0.0, 0
connections = [True] + connections

for seg_dist, connected in zip(distances, connections, strict=False):
Collaborator

With strict=False this will stop when either of the two lists ends, even if the other is longer. Is this what you want?

return Div("Click on a metric to view the structure.")


def scatter_and_assets_from_table(
Collaborator

Why can't this be handled by something like plot_from_table_cell?

Collaborator Author

It also keeps the model data so that we can access model-specific assets from the scatter plot. (With weas, we don't take model-specific assets, as the structures are all the same, right?)

return content, meta, active_cell


def model_asset_from_scatter(
Collaborator

Where does the model-specific part come in?

As above, I also wonder if asset is the most useful description? Is this actually loading from /assets?

Collaborator Author

The model context is carried via scatter_metadata: when the user clicks a table cell, scatter_and_assets_from_table stores the active model, metric, and the list of points.

asset is just a general way of saying "secondary view component", e.g. scatter -> structure, png, or other. I was trying to make it general. (I wouldn't use it for structures here, as they are model-independent, but if we had specific relaxed structures that were model-dependent, we could use this for those too.)
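
A rough sketch of the idea described above, in plain Python without any Dash wiring. The helper names, dictionary keys, and the asset filename are hypothetical and are not taken from the PR; only the notion of storing the active model, metric, and points in the scatter metadata comes from the comment.

```python
def build_scatter_metadata(model, metric, points):
    """Bundle the active model, metric and point dicts (hypothetical layout)."""
    return {"model": model, "metric": metric, "points": list(points)}


def resolve_asset(scatter_metadata, point_index, asset_key="asset"):
    """Return (model, asset) for the clicked point, or (model, None) if absent."""
    point = scatter_metadata["points"][point_index]
    return scatter_metadata["model"], point.get(asset_key)


# Stored when a table cell is clicked (values here are illustrative only).
meta = build_scatter_metadata(
    model="mace-omat-0",
    metric="BZ MAE",
    points=[{"ref": 1.0, "pred": 1.1, "asset": "dispersion_0.png"}],
)

# Later, a click on scatter point 0 can recover the model-specific asset.
clicked = resolve_asset(meta, 0)
print(clicked)
```

The second step never needs to ask which model is active, because that context travels with the metadata written by the first step.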

return plot_parity_decorator


def cell_to_scatter(
Collaborator

Would it make sense to make this an option for the existing scatter decorator, rather than a separate one? I think basically all the logic is the same, it's just whether we combine the traces, right? We may even want to do individual + combined increasingly, e.g. when scaling starts to get messy
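
One way to picture the suggestion, as a sketch only: a single trace builder with a flag that switches between one trace per model and one combined trace. The function name, the `combine` parameter, and the `ref`/`pred` keys are assumptions for illustration and do not reflect the existing decorator's signature.

```python
def build_traces(points_by_model, combine=False):
    """Return trace dicts: one per model, or a single merged trace if combine=True."""
    if combine:
        xs, ys = [], []
        for pts in points_by_model.values():
            xs.extend(p["ref"] for p in pts)
            ys.extend(p["pred"] for p in pts)
        return [{"name": "all models", "x": xs, "y": ys}]
    return [
        {"name": model, "x": [p["ref"] for p in pts], "y": [p["pred"] for p in pts]}
        for model, pts in points_by_model.items()
    ]


# Illustrative data only.
data = {
    "model-a": [{"ref": 1.0, "pred": 1.2}],
    "model-b": [{"ref": 2.0, "pred": 1.9}, {"ref": 3.0, "pred": 3.1}],
}
separate = build_traces(data)
merged = build_traces(data, combine=True)
print(len(separate), len(merged))
```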

Collaborator

Why do we build these as part of the app, rather than saving them to json and loading them as we normally do?

Collaborator Author

To save on analysis time.

Parameters
----------
points
Sequence of metadata dictionaries containing reference/prediction values.
Collaborator

This is a bit confusing because it makes it sound like data rather than metadata, but I think it's because we save various things (including but not limited to the x/y values) of each point in these dicts? Probably worth making the connection to "points" in this docstring too, since out of context it's quite confusing

@joehart2001
Collaborator Author

Issue to fix: when rows are reordered, the interactive cells show the original model in that position
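
A small sketch of one possible cause, assuming the callback looks up the model by row position in the original data rather than in the data as displayed. This is a guess at the mechanism, not a diagnosis confirmed in this thread; the row layout is hypothetical.

```python
# Original (unsorted) table rows; "id" identifies the model.
rows = [
    {"id": "model-a", "score": 0.12},
    {"id": "model-b", "score": 0.05},
]

# The user sorts the table by score, so the displayed order changes.
displayed_rows = sorted(rows, key=lambda r: r["score"])

clicked_position = 0  # the user clicks the first displayed row

# Fragile: indexing the original order with the displayed position.
by_position = rows[clicked_position]["id"]

# More robust: read the identifier from the row as displayed.
by_row_id = displayed_rows[clicked_position]["id"]

print(by_position, by_row_id)
```

In Dash's DataTable, `active_cell` also carries a `row_id` when the rows include an `id` field, which may be a convenient way to avoid positional lookups.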

@joehart2001
Collaborator Author

So I've gotten to the bottom of the band mismatches.

  • In the calculation, when the primitive cell isn't present in the ref data, I use primitive cell "auto", which works in most cases, but sometimes the auto algorithm predicts a different number of atoms (so multiple cells).
  • This means there are more atoms, and since the number of branches = 3N (N = number of atoms), we get a mismatch in the number of branches for the BZ MAE (but not in the k-points, as these are always based on the ref).
  • For mp-0a, 38 samples hit this error and so weren't included in the BZ MAE calculation, but I don't think this affects any other metrics.
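
The branch-count mismatch can be sketched with NumPy as follows. The `band_mae` helper and the array shapes are illustrative assumptions, not the PR's analysis code; the only fact taken from the comment is that a cell with N atoms gives 3N branches while the k-points follow the reference.

```python
import numpy as np


def band_mae(ref_freqs, pred_freqs):
    """Return the MAE over all k-points and branches, or None if shapes differ."""
    ref = np.asarray(ref_freqs)
    pred = np.asarray(pred_freqs)
    if ref.shape != pred.shape:
        # Different number of branches (3N), so the bands cannot be compared.
        return None
    return float(np.mean(np.abs(ref - pred)))


n_kpoints = 4
ref = np.zeros((n_kpoints, 3 * 2))       # 2-atom primitive cell: 6 branches
pred_same = np.ones((n_kpoints, 3 * 2))  # same cell: comparable
pred_big = np.ones((n_kpoints, 3 * 4))   # 4-atom cell: 12 branches

print(band_mae(ref, pred_same))
print(band_mae(ref, pred_big))
```

Samples for which a helper like this returns `None` correspond to the ones that would be left out of the BZ MAE.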

__pycache__/
ml_peg/app/data/
ml_peg/app/data/*
!ml_peg/app/data/onboarding/
Collaborator

I know it's a tiny change, but can we do this separately e.g. its own PR? It's unrelated to phonons

@joehart2001
Collaborator Author

A fix for the band mismatches that works for some test samples is, instead of using "auto", to calculate the primitive matrix directly:

                unitcell = phonons_pred.unitcell
                primitive_cell = phonons_pred.primitive
                primitive_matrix = np.linalg.inv(np.array(unitcell.get_cell())) @ np.array(primitive_cell.get_cell())

We probably need to rerun some phonons, but the larger cells we used before don't affect the max or min frequencies etc., mainly the BZ MAE.

Development

Successfully merging this pull request may close these issues.

phonons analysis and app

3 participants