🌍🔍 Topic Modelling Research in the Digital Circular Electrochemical Economy (DCEE) Project

Journal arXiv Dataset DOI CC BY-NC 4.0 Python 3.8+ Apache 2.0

This is the Topic Modelling research repository for the Digital Circular Electrochemical Economy (DCEE) project at Heriot-Watt University. The research is funded by the Digital Circular Electrochemical Economy project (EP/V042432/1) and the UK Research and Innovation (UKRI) Interdisciplinary Centre for Circular Chemical Economy (EP/V011863/1 and EP/V011863/2). For this project, we have united a cross-disciplinary team of leading researchers from three UK universities: Imperial College London, Loughborough University, and Heriot-Watt University.

📊 Data and Results

The main dataset for this project is publicly available via Heriot-Watt University's open access repository; its DOI is given in the dataset entry of the Citation section.

The datasets and experimental results are made publicly available following the 🔐 EPSRC Data Storage Policy and 📜 GDPR Regulations.

🏆 Publication

The paper has been published in the JCR Q1 Elsevier journal Energy and AI 🎊

The preprint is available on arXiv 🔥

⚙️ How to Use

Creating a 🐍 Python 3.8 Environment

To ensure compatibility with the code, it is recommended to create a Python 3.8 virtual environment. Follow these steps:

Option 1: Using virtualenv
  1. Install Python 3.8 and virtualenv if you haven't already.
  2. Create a virtual environment:
    virtualenv -p python3.8 venv
  3. Activate the virtual environment:
    • On Windows:
      venv\Scripts\activate
    • On Unix or MacOS:
      source venv/bin/activate
  4. Install the required packages:
    pip install -r requirements.txt
Option 2: Using conda
  1. Install Anaconda or Miniconda if you haven't already.
  2. Create a conda environment with Python 3.8:
    conda create --name dcee python=3.8
  3. Activate the conda environment:
    conda activate dcee
  4. Install the required packages:
    pip install -r requirements.txt
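
After either option, a quick sanity check can confirm that the active interpreter is the recommended Python 3.8 and that it runs inside an isolated environment. The snippet below is a minimal sketch and not part of the repository; the function names `is_recommended_python` and `in_isolated_environment` are hypothetical, and the environment detection relies on the usual conventions (`sys.base_prefix` for venv and recent virtualenv, `sys.real_prefix` for older virtualenv, `CONDA_PREFIX` for conda).

```python
import os
import sys

# Version recommended by this README (an exact match on major.minor).
RECOMMENDED = (3, 8)


def is_recommended_python(version_info=None):
    """Return True if the (major, minor) version equals RECOMMENDED."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) == RECOMMENDED


def in_isolated_environment():
    """Best-effort check for an active virtualenv/venv or conda environment."""
    # venv and virtualenv >= 20 make sys.prefix differ from sys.base_prefix.
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    in_legacy_virtualenv = hasattr(sys, "real_prefix")
    # conda exports CONDA_PREFIX when an environment is activated.
    in_conda = bool(os.environ.get("CONDA_PREFIX"))
    return in_venv or in_legacy_virtualenv or in_conda


if __name__ == "__main__":
    print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
    print("Recommended version (3.8):", is_recommended_python())
    print("Isolated environment:", in_isolated_environment())
```

Running this with the environment activated should report the 3.8 interpreter; a different version suggests the environment was not activated or was created with another Python.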

🚀 Running the Scripts

The repository contains scripts for different models (BERTopic, CorEx, LDA) and for the preprocessing steps. You can find them in the scripts directory. Each subdirectory contains Jupyter notebooks (.ipynb) and Python scripts (.py) for single-objective optimisation; the BERTopic subdirectory covers both single-objective and multi-objective optimisation.

To run a specific script, navigate to its directory and execute the script. For example:

cd scripts/bertopic
python bert_grid_guardian.py
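
To see which scripts and notebooks are available before picking one, the directory can be listed programmatically. The following is a minimal sketch, assuming it is run from the repository root and that the model subdirectories sit directly under scripts; only scripts/bertopic and bert_grid_guardian.py are named in this README, so any other directory or file names it prints depend on the actual repository contents. The helper name `find_scripts` is hypothetical.

```python
from pathlib import Path


def find_scripts(root="scripts"):
    """Map each subdirectory of `root` to its sorted .py and .ipynb file names.

    Subdirectories without any matching files are omitted, and a missing
    `root` yields an empty mapping.
    """
    root_path = Path(root)
    found = {}
    if not root_path.is_dir():
        return found
    for sub in sorted(p for p in root_path.iterdir() if p.is_dir()):
        files = sorted(
            f.name
            for f in sub.iterdir()
            if f.is_file() and f.suffix in {".py", ".ipynb"}
        )
        if files:
            found[sub.name] = files
    return found


if __name__ == "__main__":
    # Print one block per model directory, e.g. "bertopic" followed by its files.
    for model, files in find_scripts().items():
        print(model)
        for name in files:
            print("   " + name)
```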

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Dataset License: The dataset is released under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

📬 Contact

For any questions or further information, please contact the project team at 🌐 Digital Circular Electrochemical Economy (DCEE) Project and 🏛️ National Interdisciplinary Centre for the Circular Chemical Economy.


Citation

If you use our code or refer to our publication, please cite the following BibTeX entry:

@article{song2024exploring,
  title={Exploring public attention in the circular economy through topic modelling with twin hyperparameter optimisation},
  author={Song, Junhao and Yuan, Yingfang and Chang, Kaiwen and Xu, Bing and Xuan, Jin and Pang, Wei},
  journal={Energy and AI},
  pages={100433},
  year={2024},
  publisher={Elsevier}
}

If you use our dataset, please cite the following BibTeX entry:

@dataset{song2025public,
  author    = {Song, Junhao and Yuan, Yingfang and Chang, Kaiwen and Xu, Bing and Xuan, Jin and Pang, Wei},
  title     = {Public Attention Text Dataset on Circular Economy for Topic Modelling},
  year      = {2025},
  publisher = {Heriot-Watt University},
  doi       = {10.17861/85bf3f9d-dc42-4b5c-8e29-47ddd0f0f687},
  url       = {https://doi.org/10.17861/85bf3f9d-dc42-4b5c-8e29-47ddd0f0f687},
  note      = {EAI2024Data(.zip)}
}