This repository contains the code for "Shapley-Guided Utility Learning for Effective Graph Inference Data Valuation".
-
preprocess.py
- Preprocesses the graph dataset for GNN training and evaluation.
- Input: Raw graph data
- Output: Processed graph data ready for GNN consumption
-
train_gnn.py
- Trains the GNN model on the preprocessed data.
- Input: Processed graph data
- Output: Trained GNN model
-
valid_perm_sample.py
- Generates validation permutation samples for model evaluation.
- Input: Trained GNN model, validation data
- Output: Validation permutation samples
-
test_perm_sample.py
- Generates test permutation samples for final model evaluation.
- Input: Trained GNN model, test data
- Output: Test permutation samples
-
atc_confidence_estimation.py
- Estimates confidence using the Average Thresholded Confidence (ATC) method.
- Input: Validation permutation samples
- Output: ATC confidence estimates
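
As a rough illustration of the general ATC idea (not necessarily how this script implements it), the sketch below picks a confidence threshold on labeled validation data so that the share of samples below the threshold matches the validation error rate, then reports the share of unlabeled samples above it as the predicted accuracy. The function names and the plain-list input format are assumptions made for this example.

```python
def fit_atc_threshold(val_confidences, val_correct):
    """Pick t so the share of validation scores below t equals the error rate.

    val_confidences: max-softmax confidence per validation sample (assumed format).
    val_correct: booleans, True where the model's prediction was correct.
    """
    n = len(val_confidences)
    n_errors = sum(1 for ok in val_correct if not ok)
    ranked = sorted(val_confidences)
    if n_errors == 0:
        return ranked[0]          # no score falls strictly below the minimum
    if n_errors == n:
        return float("inf")       # every score falls below the threshold
    # Midpoint between the n_errors-th and (n_errors + 1)-th smallest scores.
    return (ranked[n_errors - 1] + ranked[n_errors]) / 2


def atc_predict_accuracy(target_confidences, threshold):
    """Predicted accuracy: fraction of target scores at or above the threshold."""
    above = sum(1 for s in target_confidences if s >= threshold)
    return above / len(target_confidences)


# Toy data: two of five validation predictions are wrong (40% error).
threshold = fit_atc_threshold(
    [0.9, 0.8, 0.6, 0.55, 0.95],
    [True, True, False, False, True],
)
predicted = atc_predict_accuracy([0.9, 0.65, 0.75, 0.5], threshold)
```

Tied confidence values are not handled specially here; a real implementation would need a rule for them.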
-
atc_ne_confidence_estimation.py
- Estimates confidence using the ATC with Negative Entropy (ATC-NE) method.
- Input: Validation permutation samples
- Output: ATC-NE confidence estimates
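
The negative-entropy score that gives ATC-NE its name can be sketched as follows. It replaces the max-softmax confidence with the sum of `p * log(p)` over the class probabilities, so a peaked distribution scores near 0 and a uniform one scores lowest. The function name and the list-of-probabilities input are assumptions for illustration.

```python
import math


def negative_entropy(probs):
    """Negative Shannon entropy of a probability vector (higher = more confident).

    Zero-probability entries contribute nothing, following the 0 * log(0) = 0
    convention.
    """
    return sum(p * math.log(p) for p in probs if p > 0.0)


confident = negative_entropy([0.98, 0.01, 0.01])
uncertain = negative_entropy([1 / 3, 1 / 3, 1 / 3])
```

The resulting scores can be thresholded in the same way as any other confidence score.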
-
training_statistics.py
- Computes and saves training set statistics for use in performance prediction.
- Input: Trained GNN model, training data
- Output: Training statistics
-
doc_performance_prediction.py
- Predicts model performance using the Difference of Confidences (DOC) method.
- Input: Training statistics, validation permutation samples
- Output: DOC performance predictions
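
In its simplest form, the DOC idea subtracts the drop in average confidence between a reference set and a new set from the reference accuracy. The sketch below shows that simple variant only; whether this script uses it directly or feeds the difference into a regression is not stated here, and the function and argument names are hypothetical.

```python
def doc_predict_accuracy(reference_accuracy, reference_confidences, target_confidences):
    """Predict accuracy as reference accuracy minus the average-confidence gap."""
    mean_reference = sum(reference_confidences) / len(reference_confidences)
    mean_target = sum(target_confidences) / len(target_confidences)
    difference_of_confidences = mean_reference - mean_target
    # Clamp so the prediction stays a valid accuracy in [0, 1].
    return min(1.0, max(0.0, reference_accuracy - difference_of_confidences))


# Average confidence falls from 0.9 to 0.7, so the prediction falls by about 0.2.
predicted_accuracy = doc_predict_accuracy(0.9, [0.95, 0.85], [0.8, 0.6])
```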
-
shapley_regression_pred_lasso.py
- Fits a LASSO-regularized regression for performance prediction from Shapley values (our proposed method).
- Input: Shapley values, performance metrics
- Output: Regression model for performance prediction
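
For readers unfamiliar with LASSO, the following is a minimal dependency-free sketch of L1-regularized least squares fitted by coordinate descent with soft-thresholding. It is not the repository's implementation, which may well rely on a library; the function names, the `alpha` default, and the toy data are assumptions. The objective minimized is `(1 / (2n)) * ||y - Xw - b||^2 + alpha * ||w||_1`.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(X, y, alpha=0.1, n_iters=200):
    """Fit LASSO by cyclic coordinate descent. Returns (weights, intercept)."""
    n, d = len(X), len(X[0])
    # Center features and targets so the intercept can be recovered afterwards.
    x_mean = [sum(row[j] for row in X) / n for j in range(d)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(d)] for row in X]
    yc = [v - y_mean for v in y]
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    w = [0.0] * d
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # constant feature carries no signal
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(Xc[i][k] * w[k] for k in range(d) if k != j)
                rho += Xc[i][j] * (yc[i] - partial)
            rho /= n
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(d))
    return w, intercept


# Toy data: the target depends only on the first feature (y = 2 * x0).
features = [[0, 0], [1, 1], [2, 1], [3, 0]]
targets = [0, 2, 4, 6]
weights, intercept = fit_lasso(features, targets, alpha=0.1)
```

On this toy data the L1 penalty sets the weight of the irrelevant second feature to exactly zero and shrinks the first weight slightly below 2, which is the sparsity behavior that motivates using LASSO for feature selection.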
-
shapley_estimation_drop_node.py
- Estimates Shapley values for nodes by dropping them from the graph.
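
A common way to estimate Shapley values is Monte Carlo permutation sampling: shuffle the players, add them one at a time, and average each player's marginal change in utility. The sketch below shows that generic estimator under stated assumptions. The `utility` callable, the function name, and the toy weights are hypothetical, and the drop-node procedure in this script may differ in its details.

```python
import random


def estimate_shapley(players, utility, n_permutations=200, seed=0):
    """Monte Carlo Shapley estimate via random permutations.

    players: hashable identifiers (for example, node ids).
    utility: callable mapping a frozenset of players to a number.
    """
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = utility(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = utility(frozenset(coalition))
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / n_permutations for p, total in totals.items()}


# Toy additive utility: each node contributes a fixed amount, so the estimate
# recovers those amounts regardless of the sampled permutations.
node_weights = {"a": 1, "b": 2, "c": 3}
shapley = estimate_shapley(
    list(node_weights),
    lambda kept: sum(node_weights[n] for n in kept),
    n_permutations=50,
)
```

In a graph setting the utility would typically be a performance measure of the model evaluated with only the kept nodes present, which makes each utility call expensive and is the reason sampling is used instead of enumerating all coalitions.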