Parallel K-Means implementation in Julia, with data preprocessing done in Matlab. The code should run as soon as you clone the repo. It is meant to be run on Tempest using the batch scripts, but the individual scripts in serial/ and parallel/ can also be run locally. The benchmarks use the parameters outlined in the batch scripts in benchmarking/. Computational efforts were performed on the Tempest High Performance Computing System, operated and supported by University Information Technology Research Cyberinfrastructure at Montana State University.
FILE STRUCTURE:
batch: Contains sbatch files for running jobs on Tempest HPC.
benchmarking: Contains the benchmarking code for the serial and parallel versions of the K-Means algorithm. It is kept separate because the code had to be restructured slightly to use the tools from BenchmarkTools.jl (although I should have structured the original code more like the benchmarking versions).
data: Contains the datasets and the supporting information produced during preprocessing. Almost all of the files are generated by the preprocessData.m script.
figures: Contains the Matlab code for generating the figures along with the actual figures that get saved.
parallel: Julia scripts for the parallel implementation of K-Means.
preprocessing: Matlab scripts that turn the original Matlab datacubes into shuffled matrices of N samples, where N depends on the size of the dataset.
results: Contains results for both the parallel and serial implementations. The parallel results are stored in folders labeled "tT", where T is the number of threads used to run the Julia scripts.
serial: Julia scripts for the serial implementation of K-Means.
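
For readers who want a feel for what the serial/ and parallel/ scripts do, here is a minimal sketch of a multithreaded K-Means loop in Julia. It is not the code from this repo: the function names (assign_labels!, update_centroids!, kmeans_sketch), the features-by-samples matrix layout, and the choice to use the first k samples as initial centroids are all assumptions made for illustration. Only the assignment step is threaded here, since each sample can be labeled independently.

```julia
using Base.Threads   # standard library: provides @threads

# Assign each column of X (features x samples) to its nearest centroid in C.
# Hypothetical helper, not taken from the repo.
function assign_labels!(labels::Vector{Int}, X::Matrix{Float64}, C::Matrix{Float64})
    @threads for j in axes(X, 2)        # samples are independent, so no locking is needed
        best_k, best_d = 1, Inf
        for k in axes(C, 2)
            d = 0.0
            for i in axes(X, 1)
                d += (X[i, j] - C[i, k])^2   # squared Euclidean distance
            end
            if d < best_d
                best_k, best_d = k, d
            end
        end
        labels[j] = best_k
    end
    return labels
end

# Recompute each centroid as the mean of its assigned samples (kept serial here).
function update_centroids!(C::Matrix{Float64}, X::Matrix{Float64}, labels::Vector{Int})
    sums = zeros(size(C))
    counts = zeros(Int, size(C, 2))
    for j in axes(X, 2)
        k = labels[j]
        counts[k] += 1
        for i in axes(X, 1)
            sums[i, k] += X[i, j]
        end
    end
    for k in axes(C, 2)
        counts[k] == 0 && continue       # leave the centroid of an empty cluster unchanged
        C[:, k] .= sums[:, k] ./ counts[k]
    end
    return C
end

# Run a fixed number of iterations; initial centroids are the first k samples (an assumption).
function kmeans_sketch(X::Matrix{Float64}, k::Int; iters::Int = 10)
    C = X[:, 1:k]                        # slicing copies, so X is not modified
    labels = zeros(Int, size(X, 2))
    for _ in 1:iters
        assign_labels!(labels, X, C)
        update_centroids!(C, X, labels)
    end
    return labels, C
end

# Toy data: three samples near the origin and three near (10, 10).
X = hcat([0.0 0.1 0.2; 0.0 0.1 0.0], [10.0 10.1 10.2; 10.0 10.1 10.0])
labels, C = kmeans_sketch(X, 2)
println("threads = ", nthreads(), ", labels = ", labels)
```

The thread count is fixed when Julia starts, for example with julia --threads 4 or the JULIA_NUM_THREADS environment variable. With a single thread the same code runs serially, which makes it easy to compare the two modes on one machine.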