Data Preparation for PFLU

This project reuses the datasets from FedLU.
Download the datasets from the FedLU repository and place them in this data/ directory, following the structure shown below.

Dataset Structure

After downloading, your data directory should look like:

data/
├── FB15k-237/
│   ├── C3FL/
│   │   ├── 0/
│   │   ├── 1/
│   │   └── ...
│   ├── C5FL/
│   │   └── ...
│   └── ...
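
As a quick sanity check after downloading, the layout above can be inspected with a small script. This is only a sketch and not part of the repository: the function names `list_client_dirs` and `summarize_layout` are hypothetical, and it assumes that client folders are the numerically named sub-directories (0/, 1/, ...) under each split folder such as C3FL/.

```python
from pathlib import Path


def list_client_dirs(split_dir):
    """Return the numerically named sub-directories of split_dir, sorted by client id."""
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        return []
    clients = [p for p in split_dir.iterdir() if p.is_dir() and p.name.isdigit()]
    return sorted(clients, key=lambda p: int(p.name))


def summarize_layout(data_root="data"):
    """Map 'dataset/split' to the list of client ids found under it."""
    summary = {}
    root = Path(data_root)
    if not root.is_dir():
        return summary
    for dataset in sorted(p for p in root.iterdir() if p.is_dir()):
        for split in sorted(p for p in dataset.iterdir() if p.is_dir()):
            ids = [int(c.name) for c in list_client_dirs(split)]
            summary[f"{dataset.name}/{split.name}"] = ids
    return summary


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains data/.
    for key, clients in summarize_layout().items():
        print(f"{key}: {len(clients)} client folder(s) {clients}")
```

An empty result, or a split that lists no client folders, usually means the datasets were unpacked one level too deep or too shallow.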

Data Preprocessing

Before running federated training and unlearning, preprocess the datasets with the scripts in code/preprocess/. The main scripts are:

  • filter_related.py
    Filters the related set associated with the unlearning set, ensuring only relevant data is included for unlearning propagation.
  • relation_encode.py
    Encodes entities by their relations, generating relation mappings for each entity.
  • select_optimal_anchor.py
    Selects anchors for each client using a maximum coverage strategy.
  • unlearn_update_encode.py
    Updates relation and anchor mappings after unlearning to keep data and model representations consistent.
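
To make the ideas behind relation encoding and maximum-coverage anchor selection concrete, here is a minimal, generic sketch. It is not the code in relation_encode.py or select_optimal_anchor.py: the function names are hypothetical, triples are assumed to be (head, relation, tail) tuples, relation direction is ignored, and an anchor is assumed to "cover" itself and its direct neighbours. The selection uses the standard greedy heuristic for maximum coverage.

```python
from collections import defaultdict


def encode_relations(triples):
    """Map each entity to the sorted list of relations it takes part in."""
    rels = defaultdict(set)
    for head, rel, tail in triples:
        rels[head].add(rel)
        rels[tail].add(rel)
    return {entity: sorted(r) for entity, r in rels.items()}


def greedy_anchors(triples, k):
    """Greedy maximum coverage: pick up to k anchors covering the most entities.

    Assumption: an anchor covers itself and its direct neighbours.
    """
    neighbors = defaultdict(set)
    for head, _, tail in triples:
        neighbors[head].update((head, tail))
        neighbors[tail].update((head, tail))

    covered, anchors = set(), []
    for _ in range(k):
        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(neighbors),
                   key=lambda e: len(neighbors[e] - covered),
                   default=None)
        if best is None or not (neighbors[best] - covered):
            break  # nothing new can be covered
        anchors.append(best)
        covered |= neighbors[best]
    return anchors, covered


if __name__ == "__main__":
    toy = [("a", "r1", "b"), ("a", "r2", "c"), ("d", "r1", "e")]
    print(encode_relations(toy))
    print(greedy_anchors(toy, k=2))
```

The greedy choice is a common approximation for maximum coverage; the actual scripts may define coverage and tie-breaking differently, so check them before relying on this behaviour.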

Please refer to each script for detailed usage and parameter options.


Note:

  • The datasets are not included in this repository.
  • For more details on the dataset format and splits, please refer to FedLU.