generated from FAIRmat-NFDI/nomad-plugin-template
Open
Description
AutoXRD is well suited to Temporal workflows because its jobs can be long-running. Training time grows with the number of epochs and the size of the training data, and inference time grows with the number of XRD patterns processed together.
To this end, the current training and analysis modules need to be restructured into functions that can later be used to define activities. These functions can be set up as:
- Data preprocessing for training data - CIFs are the input for training and need to be processed into spectra, with data augmentation applied. These functions will be based on the `spectrum_generation`, `solid_solns`, and `tabulate_cif` modules of the `XRD-AutoAnalyser` package.
- Data batching for training
- Running the training loop and saving the models
- Data preprocessing for inference data - XRD data entries from NOMAD are the input for inference and need to be processed into spectra that can be used for analysis. These functions extract the data from the entry, interpolate the pattern to match the input size for the model, and convert it into a pair distribution function (PDF) if using a PDF model.
- Data postprocessing after inference - The predicted phases are processed together with the original pattern for postprocessing. These functions will be based on the `SpectrumAnalyser.enumerate_routes` method of the `XRD-AutoAnalyser` package.
- Data batching for running inference
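
As a rough illustration of the inference-side steps, the sketch below resamples a measured pattern onto a fixed-size grid and splits a list of patterns into batches. It is only a sketch under assumptions: the function names `interpolate_pattern` and `batch_patterns`, the grid parameters, and the choice to clamp to the edge intensities outside the measured range are all hypothetical and not taken from AutoXRD or `XRD-AutoAnalyser`. The PDF conversion is not covered. The functions take and return plain lists, which should make them straightforward to wrap as activities later.

```python
from bisect import bisect_right
from typing import Iterator, List, Sequence


def interpolate_pattern(
    two_theta: Sequence[float],
    intensity: Sequence[float],
    n_points: int,
    start: float,
    stop: float,
) -> List[float]:
    """Linearly resample a pattern onto a uniform 2-theta grid.

    Hypothetical helper: `two_theta` is assumed to be sorted in
    increasing order. Grid points outside the measured range are
    clamped to the first or last measured intensity (an assumption).
    """
    if len(two_theta) != len(intensity) or len(two_theta) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    if n_points < 2:
        raise ValueError("n_points must be at least 2")

    step = (stop - start) / (n_points - 1)
    resampled = []
    for i in range(n_points):
        x = start + i * step
        if x <= two_theta[0]:
            resampled.append(intensity[0])
        elif x >= two_theta[-1]:
            resampled.append(intensity[-1])
        else:
            # Find the neighbours with two_theta[j - 1] <= x < two_theta[j].
            j = bisect_right(two_theta, x)
            x0, x1 = two_theta[j - 1], two_theta[j]
            y0, y1 = intensity[j - 1], intensity[j]
            resampled.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return resampled


def batch_patterns(
    patterns: Sequence[List[float]], batch_size: int
) -> Iterator[List[List[float]]]:
    """Yield consecutive batches; the last batch may be smaller."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for i in range(0, len(patterns), batch_size):
        yield list(patterns[i : i + batch_size])


if __name__ == "__main__":
    # Toy pattern with a single peak at 20 degrees.
    pattern = interpolate_pattern(
        [10.0, 20.0, 30.0], [0.0, 10.0, 0.0], n_points=5, start=10.0, stop=30.0
    )
    print(pattern)  # [0.0, 5.0, 10.0, 5.0, 0.0]
    for batch in batch_patterns([pattern] * 5, batch_size=2):
        print(len(batch))  # 2, 2, 1
```

In practice the grid size and angular range would have to match whatever the trained model expects, and a vectorised routine such as `numpy.interp` would likely replace the explicit loop.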
The current functions for training and analysing using NOMAD entries will need to be adapted and made part of schema_packages.jupyter