Instructions for reproducing the experiments reported in our paper [Directed Graph Auto-Encoders](https://arxiv.org/abs/2202.12449), published at AAAI 2022.
If our code is helpful for your research, please cite our work:
@inproceedings{gkolliasAAAI22,
author = {Georgios Kollias and
Vasileios Kalantzis and
Tsuyoshi Id\'e and
Aur\'elie Lozano and
Naoki Abe},
title = {Directed Graph Auto-Encoders},
booktitle = {Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI 2022)},
month = {February},
year = {2022}
}
The following Python packages are required in addition to the standard TensorFlow and PyTorch machine learning frameworks:
- torch-geometric: https://github.com/rusty1s/pytorch_geometric
- gravity-gae: https://github.com/deezer/gravity_graph_autoencoders
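As a quick sanity check that the environment is set up (a minimal sketch; we do not pin exact versions here), the core packages can be imported and their versions printed:

```python
# Quick environment check: confirm the core packages import and report versions.
import torch
import torch_geometric

print("torch:", torch.__version__)
print("torch_geometric:", torch_geometric.__version__)
print("CUDA available:", torch.cuda.is_available())
```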
Copy all scripts under code/scripts/ to the top-level code/ directory.
The citation experiments use the feature-based cora_ml and citeseer datasets under data/cora_ml/raw and data/citeseer/raw.
These datasets were originally used in "Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking" by Aleksandar Bojchevski and Stephan Günnemann:
https://github.com/abojchevski/graph2gauss
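For reference, here is a minimal sketch of loading one of these datasets directly. It assumes the CSR-based .npz layout used by the graph2gauss repository (key names such as adj_data and attr_indptr come from that repository), and the exact file name under data/cora_ml/raw/ may differ in your checkout:

```python
# Sketch of loading a graph2gauss-style .npz dataset (e.g. cora_ml).
# Assumes the CSR layout from https://github.com/abojchevski/graph2gauss.
import numpy as np
import scipy.sparse as sp

def load_npz_graph(path):
    with np.load(path, allow_pickle=True) as loader:
        loader = dict(loader)
        # Directed adjacency matrix in CSR form.
        adj = sp.csr_matrix(
            (loader['adj_data'], loader['adj_indices'], loader['adj_indptr']),
            shape=loader['adj_shape'])
        # Node attribute (feature) matrix in CSR form.
        attr = sp.csr_matrix(
            (loader['attr_data'], loader['attr_indices'], loader['attr_indptr']),
            shape=loader['attr_shape'])
    return adj, attr

adj, attr = load_npz_graph('data/cora_ml/raw/cora_ml.npz')  # file name is an assumption
print(adj.shape, attr.shape)
```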
Execute citation_grid_search.sh to generate JSON files with performance-metric results for every dataset/model combination and for every hyperparameter value in the corresponding search grid, as defined in the manuscript. Each configuration is run for 5 repetitions (different graph splits), training for 200 epochs per repetition. Example command:
python train.py --dataset=cora_ml --model=digae --alpha=0.0 --beta=0.2 --epochs=200 --nb_run=5 --logfile=digae_cora_ml_grid_search.json --learning_rate=0.005 --hidden=64 --dimension=32 --validate=True

The hyperparameters achieving the best mean AUC are selected for the final model runs; citation_run.sh collects the relevant commands. A sketch of this selection step appears below.
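The exact schema of the JSON log files is determined by train.py; the following is only a minimal sketch of the selection step, assuming each file holds a list of records with the hyperparameters and a mean_auc field:

```python
# Hypothetical sketch: pick the hyperparameter setting with the best mean AUC
# from the grid-search JSON logs. The record layout (keys such as 'alpha',
# 'beta', 'mean_auc') is an assumption, not the exact format written by train.py.
import glob
import json

best = None
for path in glob.glob('*_grid_search.json'):
    with open(path) as f:
        for record in json.load(f):
            if best is None or record['mean_auc'] > best['mean_auc']:
                best = record

print('Best configuration:', best)
```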
Execute citation_run.sh to generate JSON files with performance-metric results for every dataset/model combination at the selected hyperparameter values. Each configuration is run for 20 repetitions, training for 200 epochs per repetition. Example command:
python gravity_train.py --dataset=citeseer --model=gravity_gcn_ae --epochs=200 --nb_run=20 --logfile=run_features.json --learning_rate=0.005 --hidden=64 --dimension=32 --validate=False --lamb=0.1 --load_features=True

Execute citation_svd_run.sh for all datasets, for both the SVD and the Randomized SVD approach, and for k = 2, 4, 8, 16, 32, 64, 128, to generate JSON files with performance-metric results. Each configuration is run for 20 repetitions. Example command:
python train.py --dataset=cora_ml --model=dummy_pair --epochs=10 --nb_run=20 --validate=False --feature_vector_type=svd --feature_vector_size=32 --logfile=svd_cora_ml_runs.json
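As an illustration of the SVD-based feature vectors, here is a minimal sketch of deriving k-dimensional node features from the adjacency matrix with SciPy's sparse SVD and scikit-learn's randomized SVD. How train.py actually builds its --feature_vector_type=svd features is defined in the code, so treat this only as a hypothetical sketch:

```python
# Hypothetical illustration: build k-dimensional node features from the (directed)
# adjacency matrix via truncated SVD or Randomized SVD. This is not necessarily
# how train.py constructs its --feature_vector_type=svd features.
import scipy.sparse as sp
from scipy.sparse.linalg import svds
from sklearn.utils.extmath import randomized_svd

def svd_features(adj, k=32, randomized=False):
    adj = sp.csr_matrix(adj, dtype=float)
    if randomized:
        u, s, vt = randomized_svd(adj, n_components=k, random_state=0)
    else:
        u, s, vt = svds(adj, k=k)
    # Scale the left singular vectors by the singular values to form features.
    return u * s

# Toy usage on a random sparse directed graph.
adj = sp.random(100, 100, density=0.05, format='csr', random_state=0)
print(svd_features(adj, k=32).shape)  # (100, 32)
```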
The WebKB experiments use the feature-based texas, cornell, and wisconsin datasets under the corresponding folders in data/. In torch-geometric, they can be loaded through the torch_geometric.datasets.WebKB class.
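For example, the texas graph can be loaded as follows (the root directory shown is an assumption about the local layout; torch-geometric downloads and processes the raw files if they are missing):

```python
# Load one of the WebKB graphs (cornell, texas, wisconsin) with torch-geometric.
from torch_geometric.datasets import WebKB

dataset = WebKB(root='data', name='Texas')  # stored under data/texas/
data = dataset[0]
print(data)  # Data(x=[num_nodes, num_features], edge_index=[2, num_edges], y=[num_nodes], ...)
```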
Execute webkb_grid_search.sh to generate JSON files with performance-metric results for every dataset/model combination and for every hyperparameter value in the corresponding search grid, as defined in the manuscript. Each configuration is run for 5 repetitions (different graph splits), training for 200 epochs per repetition. Example command:
python train.py --dataset=texas --model=digae_single_layer --alpha=0.0 --beta=0.0 --epochs=200 --nb_run=5 --logfile=texas_grid_search.json --learning_rate=0.005 --hidden=32 --dimension=16 --validate=True

The hyperparameters achieving the best mean AUC are selected for the final model runs; webkb_run.sh collects the relevant commands.
Execute webkb_run.sh to generate JSON files with performance-metric results for every dataset/model combination at the selected hyperparameter values. Each configuration is run for 20 repetitions, training for 200 epochs per repetition. Example command:
python train.py --dataset=wisconsin --model=digae_single_layer --alpha=0.8 --beta=0.8 --epochs=200 --nb_run=20 --logfile=webkb_run_features.json --learning_rate=0.005 --hidden=64 --dimension=32 --validate=False --feature_vector_type=None
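The JSON log files produced by these final runs can then be aggregated into the mean and standard deviation reported in the paper. Below is a minimal sketch; the per-repetition record layout (keys auc and ap) is an assumption about the log format, not a documented schema of train.py / gravity_train.py:

```python
# Hypothetical sketch: report mean +/- standard deviation of AUC and AP over the
# repetitions stored in a final-run log file. The keys 'auc' and 'ap' are
# assumptions about what train.py / gravity_train.py write.
import json
import statistics

def summarize(logfile):
    with open(logfile) as f:
        runs = json.load(f)
    for metric in ("auc", "ap"):
        values = [run[metric] for run in runs]
        print(f"{metric}: {statistics.mean(values):.4f} +/- {statistics.stdev(values):.4f}")

summarize("webkb_run_features.json")
```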