The current model is built on Retrieval-Augmented Generation (RAG): optional retrieval techniques such as TF-IDF, SVM, and an Ensemble retriever fetch relevant documents for a large language model (LLM) such as Vicuna, enhancing its generation capabilities.
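To make the retrieval step concrete, here is a minimal, self-contained sketch of TF-IDF ranking in the spirit of the project's TF-IDF retriever. The function and variable names are hypothetical illustrations, not the project's actual API:

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and strip trailing punctuation from whitespace tokens."""
    return [t.strip(".,?!").lower() for t in text.split()]

def tfidf_rank(query, documents):
    """Return `documents` sorted by TF-IDF cosine similarity to `query`."""
    tokenized = [tokenize(d) for d in documents]
    n = len(tokenized)
    df = Counter()  # document frequency per term
    for toks in tokenized:
        df.update(set(toks))

    def idf(term):
        # smoothed inverse document frequency
        return math.log((1 + n) / (1 + df[term])) + 1.0

    def vec(toks):
        tf = Counter(toks)
        return {t: (c / len(toks)) * idf(t) for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(tokenize(query))
    order = sorted(range(n), key=lambda i: cosine(q, vec(tokenized[i])), reverse=True)
    return [documents[i] for i in order]

docs = [
    "Vicuna is an open large language model.",
    "The NFDI search engine indexes research resources.",
    "Retrieval augmented generation grounds LLM answers in documents.",
]
print(tfidf_rank("retrieval augmented generation", docs)[0])
```

In the actual pipeline, the top-ranked documents would be passed to the LLM as grounding context rather than printed.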
To run this project, follow these instructions:

```shell
git clone https://github.com/semantic-systems/nfdi-search-engine-chatbot.git
cd nfdi-search-engine-chatbot
conda create -n nfdi_search_engine_chatbot python=3.9
conda activate nfdi_search_engine_chatbot
pip install -r requirements.txt
cp .env-example .env
```

Modify the `.env` file and add your keys and variables there. Then run the app:

```shell
streamlit run app.py
```

You can now view your Streamlit app in your browser.
Contribution

- Clone the repository to your local machine:

```shell
git clone https://github.com/semantic-systems/nfdi-search-engine-chatbot.git
cd nfdi-search-engine-chatbot
```

- Create a virtual environment with Python 3.9, activate it, install the required dependencies, and install the pre-commit configuration:

```shell
conda create -n nfdi_search_engine_chatbot python=3.9
conda activate nfdi_search_engine_chatbot
pip install -r requirements.txt
pre-commit install
```

- Create a branch and commit your changes:
```shell
git switch -c <name-your-branch>
# do your changes
git add .
git commit -m "your commit msg"
git push
```

- Open a merge request to `main` for review.
Using Docker
- Create a `.env` file similar to `.env-example` and add the `VICUNA_KEY` and `VICUNA_URL` there.
- Build and run using the Dockerfile:

```shell
docker build -t nfdisearchchatbot .
docker run -d -p 6000:6000 nfdisearchchatbot
```

- Test whether everything is set up and works: http://0.0.0.0:6000/ping
Using docker-compose
- Create a `.env` file similar to `.env-example` and add the `VICUNA_KEY` and `VICUNA_URL` there.
- Run the following command:

```shell
docker-compose up
```

- Test whether everything is set up and works: http://0.0.0.0:6000/ping
Request URL: http://0.0.0.0:6000/chat

Request Body:

```json
{
  "question": "You are talking about who?",
  "chat-history": [],
  "search-results": [
    {
    }
  ]
}
```

Response:

```json
{
  "question": "You are talking about who?",
  "chat-history": [{"input": "You are talking about who?", "output": "......."}],
  "search-results": [
    {
    }
  ]
}
```

If you use this work, please cite:

@inproceedings{babaei2024scholarly,
title={Scholarly Question Answering Using Large Language Models in the NFDI4DataScience Gateway},
author={Babaei Giglou, Hamed and Taffa, Tilahun Abedissa and Abdullah, Rana and Usmanova, Aida and Usbeck, Ricardo and D’Souza, Jennifer and Auer, S{\"o}ren},
booktitle={International Workshop on Natural Scientific Language Processing and Research Knowledge Graphs},
pages={3--18},
year={2024},
organization={Springer}
}
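Returning to the /chat endpoint documented above, the request/response flow can be exercised from Python. This is a hedged sketch: the helper names `build_chat_payload` and `ask` are hypothetical, and `ask` requires the service from the Docker or docker-compose setup to be running:

```python
import json
from urllib import request as urlrequest

CHAT_URL = "http://0.0.0.0:6000/chat"  # endpoint documented above

def build_chat_payload(question, chat_history=None, search_results=None):
    """Assemble a request body in the shape the /chat endpoint expects."""
    return {
        "question": question,
        "chat-history": chat_history or [],
        "search-results": search_results or [{}],
    }

def ask(question, chat_history=None, search_results=None, timeout=30):
    """POST the payload to the running service and return the parsed reply."""
    body = json.dumps(
        build_chat_payload(question, chat_history, search_results)
    ).encode()
    req = urlrequest.Request(
        CHAT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urlrequest.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())

# Build (but do not send) a payload matching the documented request body.
payload = build_chat_payload("You are talking about who?")
print(json.dumps(payload))
```

Note that the response echoes the growing `chat-history`, so a client can feed each reply's history back into the next call to hold a multi-turn conversation.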
