F-DATA

F-DATA: A Fugaku Workload Dataset for Job-centric Predictive Modelling in HPC Systems

This repository contains the scripts and documentation for F-DATA, which is available on Zenodo.

Instructions on how to load the data

The F-DATA files are stored in Parquet format (.parquet). They can be loaded as dataframes with pandas once pyarrow is installed (pip install pyarrow). A single file can be loaded as follows:

# Import the pandas library
import pandas as pd

# Read the 21_01.parquet file into a dataframe
df = pd.read_parquet("21_01.parquet")
# Show the first rows of the dataframe
df.head()

Repository structure

baseline_experiments.py: Script that runs the ML predictive modelling experiments on F-DATA.
generate_plots.py: Script that generates a series of plots.
requirements.txt: Python dependencies needed to run all the scripts in the repository.
docs: Folder containing documentation of the final dataset, such as the list and description of the job features.
plots: Folder containing the plots for the whole F-DATA, as well as for the individual splits available on Zenodo.
generation_scripts: Folder containing the scripts used to anonymize the data and generate the derived features.

Contact us

For any information on F-DATA, don't hesitate to contact us at francesco.antici98[at]gmail.com.

Cite us

Please cite the work as:

@article{antici2025fdata,
  title={F-DATA: A Fugaku Workload Dataset for Job-centric Predictive Modelling in HPC Systems},
  author={Antici, Francesco and Bartolini, Andrea and Domke, Jens and Kiziltan, Zeynep and Yamamoto, Keiji},
  journal={Scientific Data},
  volume={12},
  pages={1321},
  year={2025},
  publisher={Nature Publishing Group},
  doi={10.1038/s41597-025-05633-1}
}
