This project focuses on preprocessing road network graphs and creating datasets for machine learning models. The src directory contains the implementations, and the data directory holds the preprocessed datasets.
```
├── .gitignore
├── environment.yml   # The requirements file for reproducing the environment
├── initial_setup.md  # Instructions for setting up the environment
├── data/             # Preprocessed datasets for various road networks
├── non_ml_index/     # Codebase for non-ML index evaluation
├── slurm-jobs/       # SLURM job scripts for running experiments on HPC clusters
├── results/          # Saved logs, plots, and model checkpoints
├── third_party/      # Third-party repositories or libraries
├── src/              # Source code for this project
├── LICENSE
└── README.md
```
Our code has been developed using PyTorch, PyTorch Geometric, and TensorFlow in Python 3.10. Please refer to `environment.yml` for the complete list of dependencies and `initial_setup.md` for detailed instructions on setting up the environment.
Create the conda environment:

```bash
# This will create the environment (named `myenv`)
conda env create -f environment.yml
```
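For orientation, a minimal `environment.yml` consistent with the stack named above might look like the following (a sketch only; the repository's actual file pins the complete dependency list):

```yaml
name: myenv
channels:
  - pytorch
  - pyg
  - conda-forge
dependencies:
  - python=3.10
  - pytorch
  - pyg          # PyTorch Geometric
  - tensorflow
```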
Git clone the repository:

```bash
# Clone the repository to a specific directory
git clone https://github.com/purduedb/shortest-distance-survey
# Change to the project directory
cd shortest-distance-survey
```
Activate the environment (follow instructions in `initial_setup.md`):

```bash
# Load the conda module if using an HPC cluster, else skip `module` commands
module load conda
# Activate the conda environment
conda activate myenv
```
Use sample preprocessed datasets:

`W_Jinan` or `Surat_subgraph` are available in the `data/` directory. Other workload-driven datasets are also available for download: `W_Shenzhen`, `W_Chengdu`, `W_Beijing`, `W_Shanghai`, `W_NewYork`, `W_Chicago`. Refer to `data/README.md` for more details on the datasets.
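Conceptually, a workload-driven dataset pairs query node pairs with ground-truth shortest-path distances computed on the road graph. A toy sketch of generating such (source, target, distance) training triples (the graph, node IDs, and layout here are illustrative; see `data/README.md` for the real formats):

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest distances on a weighted adjacency dict."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy road graph: node -> [(neighbor, edge_length)]
adj = {
    0: [(1, 2.0), (2, 5.0)],
    1: [(0, 2.0), (2, 1.0)],
    2: [(0, 5.0), (1, 1.0)],
}

# Training triples (source, target, distance) for every reachable pair
triples = [(s, t, d) for s in adj for t, d in dijkstra(adj, s).items() if s != t]
```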
Train and evaluate a model:

```bash
# Change to source directory
cd ~/src
# Run RNE model with Jinan dataset
python train.py --model_class rne --data_dir W_Jinan --query_dir real_workload_perturb_500k
```
NOTE: Additional parameters, e.g., `time_limit`, `learning_rate`, `seed`, etc. may also be specified; refer to the argparse section in `train.py` for the full list. Refer to `slurm-jobs/urban_expt.sh` for other model configurations.
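Such parameters would typically be declared via `argparse`; a hypothetical sketch mirroring the flags named in this README (the defaults below are illustrative placeholders, not the project's actual values):

```python
import argparse

def build_parser():
    # Flags taken from this README; defaults are placeholders
    parser = argparse.ArgumentParser(
        description="Train a shortest-distance model (illustrative sketch)."
    )
    parser.add_argument("--model_class", type=str, default="rne")
    parser.add_argument("--data_dir", type=str, required=True)
    parser.add_argument("--query_dir", type=str, default="real_workload_perturb_500k")
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--seed", type=int, default=0)
    parser.add_argument("--time_limit", type=int, default=None,
                        help="Wall-clock budget in seconds")
    return parser

args = build_parser().parse_args(
    ["--model_class", "rne", "--data_dir", "W_Jinan", "--seed", "42"]
)
```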
Optionally, train multiple models using SLURM scripts:

```bash
# Change to directory containing scripts
cd ~/slurm-jobs

## Run training script (modify configuration in the script as needed)
# (Optional) Dry run
bash urban_expt.sh
# (Optional) Execute locally
bash urban_expt.sh --execute
# Execute through SLURM
bash urban_expt.sh --execute --slurm
```
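The dry-run / `--execute` / `--slurm` dispatch used by such scripts can be sketched as follows (hypothetical logic; `urban_expt.sh`'s actual implementation may differ):

```python
def build_command(cli_args,
                  train_cmd="python train.py --model_class rne --data_dir W_Jinan"):
    """Return (command, dry_run) following the dry-run / --execute / --slurm pattern.

    Without --execute, nothing runs: the command is only printed (dry run).
    With --slurm, the training command is wrapped for submission via sbatch.
    """
    execute = "--execute" in cli_args
    slurm = "--slurm" in cli_args
    cmd = f'sbatch --wrap="{train_cmd}"' if slurm else train_cmd
    return cmd, not execute

# Dry run by default; submits through SLURM only with both flags
cmd, dry = build_command(["--execute", "--slurm"])
```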
...
...
If you find this code useful, please cite the following:
...
Gautam Choudhary (PhD Student, Purdue University)
Email: gchoudha@purdue.edu
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.