Noisy-Graph-Active-Learning

Official implementation of GALClean+ (Graph Active Learning and Cleaning) from the paper "Active Learning for Graphs with Noisy Structures".

Environment

Tested with Python 3.10 and PyTorch 2.5:

conda create -n galclean python=3.10 pip
conda activate galclean
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install numpy scipy scikit-learn pandas matplotlib seaborn tqdm networkx pyyaml ogb
pip install torch_geometric==2.6.1
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.5.0+cu121.html
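
To confirm the environment matches the pins above, a small check like the following may help. It is only a sketch: the distribution names passed to `importlib.metadata` are assumed to match the pip names used in the install commands, and `check_versions` is a hypothetical helper, not part of this repo.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install commands above.
EXPECTED = {
    "torch": "2.5.1",
    "torchvision": "0.20.1",
    "torchaudio": "2.5.1",
    "torch_geometric": "2.6.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None
        # Wheels from the PyTorch index carry a local suffix such as "+cu121".
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, want, ok)
    return report


if __name__ == "__main__":
    for name, (have, want, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:16s} installed={have} expected={want} {status}")
```

A package reported as `installed=None` was not found in the active environment.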

Quick Start

Run GALClean+ on Cora with default settings:

python train.py --data Cora --noise_level 0.0 --rand_seed 0 --init_split 0

Each run writes its outputs to a per-run directory under outputs/:

outputs/<run_id>/metrics.json
outputs/<run_id>/selected_indices.json

metrics.json records the paper-aligned settings used for the run.
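
The per-run files can be collected with a few lines of Python. This is a minimal sketch that relies only on the outputs/<run_id>/metrics.json layout shown above; the schema of metrics.json is not documented in this README, so the example just lists top-level keys. `load_runs` is a hypothetical helper, not part of this repo.

```python
import json
from pathlib import Path


def load_runs(root="outputs"):
    """Map each run_id under `root` to its parsed metrics.json."""
    runs = {}
    for metrics_path in sorted(Path(root).glob("*/metrics.json")):
        with metrics_path.open() as f:
            runs[metrics_path.parent.name] = json.load(f)
    return runs


if __name__ == "__main__":
    for run_id, metrics in load_runs().items():
        # Schema is an unknown here, so only the top-level keys are shown.
        keys = sorted(metrics) if isinstance(metrics, dict) else type(metrics).__name__
        print(run_id, keys)
```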

Defaults (paper-aligned in this repo):

Planetoid split: public_fixed
Random noise sampling: any
Edge filtering: legacy

Splits: The Planetoid datasets (Cora, CiteSeer, PubMed) use the legacy fixed split (public_fixed). The Amazon and Coauthor datasets use random splits.

Budget: Default is 2 initial nodes/class + 8 selected nodes/class = 10 total/class.
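
As a worked example, the per-class budget translates into the following total label counts on the Planetoid datasets. The class counts are those of the standard benchmarks (Cora 7, CiteSeer 6, PubMed 3); `label_budget` is a hypothetical helper for illustration only.

```python
INIT_PER_CLASS = 2      # initial labeled nodes per class (default stated above)
SELECTED_PER_CLASS = 8  # actively selected nodes per class (default stated above)

# Class counts of the standard Planetoid benchmarks.
NUM_CLASSES = {"Cora": 7, "CiteSeer": 6, "PubMed": 3}


def label_budget(num_classes, init=INIT_PER_CLASS, selected=SELECTED_PER_CLASS):
    """Return (initial, selected, total) label counts for one dataset."""
    return (
        num_classes * init,
        num_classes * selected,
        num_classes * (init + selected),
    )


if __name__ == "__main__":
    for name, c in NUM_CLASSES.items():
        print(name, label_budget(c))
    # Cora (14, 56, 70)
    # CiteSeer (12, 48, 60)
    # PubMed (6, 24, 30)
```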

Experiments

Random Edge-Adding Attack:

python train.py --data Cora --noise_level 0.6
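
For intuition, a random edge-adding attack can be sketched as below. This is not the repo's implementation: it ASSUMES that noise_level is the ratio of added random edges to original edges and that endpoints are sampled uniformly from all node pairs. `add_random_edges` is a hypothetical function written for this illustration.

```python
import random


def add_random_edges(edges, num_nodes, noise_level, seed=0):
    """Add round(noise_level * |E|) random undirected edges not already present.

    `edges` is a set of (u, v) tuples with u < v.
    ASSUMPTION: noise_level is the ratio of added edges to original edges;
    the repo may define it differently.
    """
    rng = random.Random(seed)
    n_add = int(round(noise_level * len(edges)))
    max_new = num_nodes * (num_nodes - 1) // 2 - len(edges)
    if n_add > max_new:
        raise ValueError("not enough non-edges left to add")
    noisy = set(edges)
    target = len(edges) + n_add
    # Rejection sampling: cheap on sparse graphs, where most pairs are non-edges.
    while len(noisy) < target:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u != v:  # skip self-loops
            noisy.add((min(u, v), max(u, v)))
    return noisy


if __name__ == "__main__":
    ring = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)}
    noisy = add_random_edges(ring, num_nodes=10, noise_level=0.6)
    print(len(ring), "->", len(noisy))  # 5 -> 8
```

Under this assumption, a noise level of 0.6 on a graph with 5 edges adds 3 spurious edges and keeps every original edge.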

Unsupervised Adversarial Attack (CLGA):

python train.py --data Cora --attack clga --attack_percent 20

This repo ships the CLGA-attacked graphs used in the paper under data/attacks/clga/. They cover Cora, CiteSeer, PubMed, Amazon-Photo, and Coauthor-CS at attack percentages of 5, 10, 15, and 20.

Sweeps + plots

Random noise sweep (defaults to noise levels from 0.0 to 1.0 in steps of 0.2):

python scripts/run_sweep.py --mode random --datasets Cora CiteSeer PubMed --seeds 0 --splits 0
python scripts/collect_results.py
python scripts/plot_results.py --scale_percent
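
To make the size of the default random-noise grid concrete, the sketch below enumerates the equivalent train.py invocations using only the flags shown in Quick Start. It illustrates the grid and is not a description of how scripts/run_sweep.py works internally; `noise_levels` and `sweep_commands` are hypothetical helpers.

```python
from itertools import product


def noise_levels(start=0.0, stop=1.0, step=0.2):
    """Evenly spaced noise levels, rounded to avoid float drift such as 0.6000000000000001."""
    n = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n + 1)]


def sweep_commands(datasets, seeds=(0,), splits=(0,)):
    """One `python train.py ...` command line per grid point."""
    return [
        f"python train.py --data {d} --noise_level {nl} --rand_seed {s} --init_split {sp}"
        for d, nl, s, sp in product(datasets, noise_levels(), seeds, splits)
    ]


if __name__ == "__main__":
    cmds = sweep_commands(["Cora", "CiteSeer", "PubMed"])
    print(len(cmds))  # 18 = 3 datasets x 6 noise levels x 1 seed x 1 split
    print(cmds[0])
```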

CLGA sweep:

python scripts/run_sweep.py --mode clga --datasets Cora CiteSeer PubMed Amazon-Photo Coauthor-CS --seeds 0 --splits 0
python scripts/collect_results.py
python scripts/plot_results.py --scale_percent

Citation

@article{chi2024active,
  title={Active Learning for Graphs with Noisy Structures},
  author={Chi, Hongliang and Qi, Cong and Wang, Suhang and Ma, Yao},
  journal={arXiv preprint arXiv:2402.02321},
  year={2024}
}
