Error gap between train and valid sets #116

evenfarther · 2026-03-17T23:58:16Z

evenfarther
Mar 17, 2026

Dear UPET community,

I have recently become interested in UPET and have been studying various aspects of it.

I am currently training and validating a model "from scratch" using my own dataset. As shown in the attached image, I observed a significant and consistent gap between the train and valid sets for both force RMSE and force MAE. A similar trend was also observed in the energy RMSE.

Is this a commonly observed phenomenon in UPET? I would greatly appreciate any insights you could provide. For your reference, I have also attached the training options I used.

options.yaml

Additionally, I am testing the LAMMPS+KOKKOS implementation. Since Blackwell GPUs (RTX PRO 6000 MAX-Q) are not officially supported by METATOMIC yet, I had to build it from source myself. Strangely, I found that its inference is slower than the Allegro+KOKKOS+Cuequivariance combination with a similar number of parameters. If anyone has tested this, I would be grateful if you could share your experience with inference speeds across different MLIP frameworks.

Once again, thank you to the UPET team for providing such wonderful code!


cesaremalosso · 2026-03-18T09:20:00Z

cesaremalosso
Mar 18, 2026
Collaborator

Hi @evenfarther,
PET models tend to be very expressive, and the fact that the error on your training set is lower than on your validation set means that your model's accuracy is currently limited by data rather than by architecture. This suggests it could be significantly improved by adding more data to your dataset. That said, the discrepancy is not dramatic, but the absolute error magnitudes indicate there is still room for improvement. Have you considered fine-tuning one of our foundation models? You should be able to achieve much better results that way. Keep in mind that PET requires quite a few epochs (at least a couple of hundred in my experience, but it depends on the dataset) to learn equivariance when training from scratch.
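To make the size of the train/validation gap concrete, it can help to reduce it to a single ratio. The snippet below is only a minimal sketch with made-up toy numbers: the helper names (`rmse`, `mae`, `gap_ratio`) are hypothetical and not part of UPET or metatrain, and the force values are invented for illustration.

```python
import math


def rmse(pred, ref):
    """Root-mean-square error over flat sequences of force components."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def mae(pred, ref):
    """Mean absolute error over flat sequences of force components."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def gap_ratio(train_err, valid_err):
    """Validation error divided by training error.

    A value close to 1 means the two sets are fitted about equally well;
    a value well above 1 means the model fits the training data better
    than it generalizes.
    """
    return valid_err / train_err


# Toy force components in eV/Å (invented numbers, not from the thread).
train_ref = [0.10, -0.20, 0.30, 0.00]
train_pred = [0.11, -0.19, 0.29, 0.01]
valid_ref = [0.05, -0.40, 0.25, 0.10]
valid_pred = [0.08, -0.36, 0.22, 0.14]

train_rmse = rmse(train_pred, train_ref)
valid_rmse = rmse(valid_pred, valid_ref)
train_mae = mae(train_pred, train_ref)
valid_mae = mae(valid_pred, valid_ref)

print(f"force RMSE: train={train_rmse:.4f} valid={valid_rmse:.4f} "
      f"ratio={gap_ratio(train_rmse, valid_rmse):.2f}")
print(f"force MAE:  train={train_mae:.4f} valid={valid_mae:.4f} "
      f"ratio={gap_ratio(train_mae, valid_mae):.2f}")
```

Tracking this ratio as the dataset grows is one way to check whether adding data is actually closing the gap.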

Regarding the inference speeds, I noticed you have 4 GNN layers, which is quite a lot, while d_pet, d_head, d_feedforward and d_node are all set to 72, which is rather small. For better performance, I would suggest sticking to the default hyperparameters and, if accuracy is still insufficient, gradually increasing them.

1 reply

evenfarther Mar 18, 2026
Author

Hi @cesaremalosso, Thank you so much for your detailed response!

To give you a bit more context, my dataset consists of approximately 20,000 AIMD snapshots generated at 800 K. When I checked the overall energy and force distributions, the differences between the train and valid sets were actually quite minimal. However, I did notice some outliers in both sets where the atomic forces exceeded 12 eV/Å. I have since cleaned up the dataset by removing about 10 of these anomalous frames.
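For anyone wanting to reproduce this kind of cleanup, here is a rough sketch of the filtering step. It assumes the 12 eV/Å threshold applies to the per-atom force norm (it could equally be applied per component), and the frame layout, `max_force_norm` and `split_outliers` are hypothetical names used only for this illustration.

```python
import math

FORCE_CUTOFF = 12.0  # eV/Å; assumed to apply to the per-atom force norm


def max_force_norm(forces):
    """Largest Euclidean force norm among the atoms of one frame."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def split_outliers(frames, cutoff=FORCE_CUTOFF):
    """Split frames into (kept, removed) based on the largest atomic force."""
    kept, removed = [], []
    for frame in frames:
        if max_force_norm(frame["forces"]) > cutoff:
            removed.append(frame)
        else:
            kept.append(frame)
    return kept, removed


# Toy frames (invented values): each frame stores one (fx, fy, fz) per atom.
frames = [
    {"id": 0, "forces": [(0.1, -0.2, 0.3), (1.0, 0.5, -0.4)]},
    {"id": 1, "forces": [(13.0, 0.0, 0.0), (0.2, 0.1, 0.0)]},
    # No single component exceeds 12, but the norm (about 12.1) does.
    {"id": 2, "forces": [(7.0, 7.0, 7.0), (0.0, 0.0, 0.0)]},
]

kept, removed = split_outliers(frames)
print("kept:", [f["id"] for f in kept])
print("removed:", [f["id"] for f in removed])
```

Applying the same cutoff to both the train and valid sets keeps the two distributions comparable after cleaning.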

Following your advice regarding the architecture, I will try training from scratch again using the default hyperparameters, rather than forcing 4 GNN layers with smaller dimensionalities.

Furthermore, I completely agree with your suggestion about leveraging a foundation model. Since the level of theory for my dataset is PBE, I will also proceed with fine-tuning the PET-OMat series, which seems like a perfect fit to help close that error gap.

I really appreciate your guidance—it has given me a very clear direction on how to move forward. Thanks again!

frostedoyster · 2026-03-23T07:59:10Z

frostedoyster
Mar 23, 2026
Collaborator

Thanks @evenfarther for the question and @cesaremalosso for answering! I just wanted to add that it seems to me that the PET hyperparameters were heavily engineered to match the parameter count of some other model. This is, in my experience, not a good way of putting two models "on the same footing". As you might have noticed, different ways of distributing the same number of parameters within the architecture can result in massively different accuracy and/or speed numbers. In my opinion, a more robust way is to, e.g., roughly match the inference speed of the two models and then measure the accuracy/performance in some test/exercise.
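As an illustration of the speed-matching idea, a generic timing harness could look like the sketch below. It is not tied to UPET, Allegro or LAMMPS: `median_time_per_call`, `speed_matched` and the two stand-in workloads are hypothetical, and the 20% tolerance is an arbitrary choice. When timing GPU code, the device also has to be synchronized before reading the clock, which this CPU-only sketch does not show.

```python
import statistics
import time


def median_time_per_call(fn, n_warmup=3, n_repeat=10):
    """Median wall-clock time of fn() in seconds, after a few warm-up calls."""
    for _ in range(n_warmup):
        fn()
    samples = []
    for _ in range(n_repeat):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speed_matched(t_a, t_b, rel_tol=0.2):
    """True if the two timings agree to within rel_tol of the slower one."""
    return abs(t_a - t_b) <= rel_tol * max(t_a, t_b)


# Stand-ins for one force evaluation (or one MD step) of each model.
def model_a_step():
    sum(i * i for i in range(20_000))


def model_b_step():
    sum(i * i for i in range(22_000))


t_a = median_time_per_call(model_a_step)
t_b = median_time_per_call(model_b_step)
print(f"model A: {t_a * 1e3:.3f} ms, model B: {t_b * 1e3:.3f} ms, "
      f"matched: {speed_matched(t_a, t_b)}")
```

Once the two timings are roughly matched by adjusting hyperparameters, the accuracy comparison is made at equal computational cost rather than at equal parameter count.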

1 reply

evenfarther Mar 26, 2026
Author

Thanks for the awesome response and the great code! It's really cool getting this kind of practical advice. I completely agree with your take on matching inference speed rather than just hacking parameter counts. I'm really looking forward to diving into the code and experimenting with your suggestions!
Once again, thank you for your support!
