Document search and retrieval with deep learning (part of ML institute programme)
- Ensure Python is installed on the system. If you need a specific version, e.g. 3.9, run:

  ```
  add-apt-repository ppa:deadsnakes/ppa
  apt-get update
  apt-get install python3.9
  ```

- Install PDM with `curl -sSL https://pdm-project.org/install-pdm.py | python3 -` (you may need to add it to your PATH afterwards to be able to run the `pdm` command; see the output of the installation script for details)
- Point PDM to a Python interpreter; for example, if you installed Python 3.9 in step 1, run `pdm use python3.9`. PDM will automatically create a virtual environment in the `.venv` folder
- Run `source .venv/bin/activate` to activate the virtual environment
- Run `pdm install` to install dependencies
- Run `pdm run load` to preprocess the dataset into a CSV (a sketch of this step follows this list)
- Run `pdm run train` to train the model in minimode
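The load script itself isn't reproduced in this README, but conceptually it boils down to reading the raw dataset and writing a flat CSV for the training step. The following is a minimal sketch of that idea only; the file paths, column handling, and use of pandas are assumptions rather than the project's actual implementation.

```python
# Minimal sketch of a "load"-style preprocessing step (illustrative only).
# The input and output paths below are assumptions.
import pandas as pd

RAW_PATH = "data/raw/documents.jsonl"    # assumed raw dataset location
OUT_PATH = "data/dataset.generated.csv"  # assumed CSV consumed by `pdm run train`


def load() -> None:
    # Read the raw records (one JSON object per line in this sketch).
    df = pd.read_json(RAW_PATH, lines=True)

    # Write the preprocessed rows to a single CSV for the training step.
    df.to_csv(OUT_PATH, index=False)


if __name__ == "__main__":
    load()
```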
- Run `./ssh.sh`, providing the IP and port when prompted, to open VS Code on the GPU machine remotely
- Follow the steps from setup to install PDM and Python on the GPU machine
- Run `FULLRUN=1 pdm run load` to preprocess the dataset into a CSV
- Run `FULLRUN=1 pdm run train` to train the model in full mode with all the data (a sketch of the `FULLRUN` toggle follows this list)
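How the `FULLRUN` flag is read isn't shown in this README; the sketch below illustrates one plausible way an environment toggle like this could switch between minimode and a full run. The row limit and epoch count are made-up values, not the project's actual configuration.

```python
# Hypothetical FULLRUN toggle: minimode uses a small slice of the data for
# quick local iteration, full mode uses everything. Values are illustrative.
import os

FULLRUN = os.environ.get("FULLRUN") == "1"

MAX_ROWS = None if FULLRUN else 1_000   # rows of the dataset to keep (assumed)
NUM_EPOCHS = 10 if FULLRUN else 1       # training epochs (assumed)

print(f"full run: {FULLRUN}, max rows: {MAX_ROWS}, epochs: {NUM_EPOCHS}")
```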
- Run `docker compose -f docker-compose.dev.yml up` to spin up the Chroma (vector database) instance
- Run `pdm run cache` to run a script that stores the encoded vectors for each document in Chroma (see the sketch after this list)
- Run `pdm run serve` to launch the web server. It should open on http://localhost:8080
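The caching script is not shown here, but the idea is to encode every preprocessed document and insert the resulting vectors into the Chroma instance started above. The sketch below uses the `chromadb` Python client against Chroma's default port; the CSV path, collection name, text column, and the stand-in embedding function are assumptions — the real script would use the project's trained document encoder and projector.

```python
# Illustrative caching step: embed each document and store it in Chroma.
# The embedding function here is a stand-in for the project's trained encoder.
import hashlib

import chromadb
import pandas as pd


def fake_embed(text: str) -> list[float]:
    # Deterministic placeholder embedding so this sketch runs on its own;
    # the real project would produce vectors with its document encoder + projector.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:16]]


def cache_documents() -> None:
    docs = pd.read_csv("data/dataset.generated.csv")            # assumed output of `pdm run load`
    client = chromadb.HttpClient(host="localhost", port=8000)   # Chroma's default HTTP port
    collection = client.get_or_create_collection("documents")   # assumed collection name

    collection.add(
        ids=[str(i) for i in docs.index],
        documents=docs["text"].tolist(),                        # assumed text column name
        embeddings=[fake_embed(t) for t in docs["text"]],
    )


if __name__ == "__main__":
    cache_documents()
```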
By default, inference is run using model weights downloaded from wandb (see `src/util/artifacts.py`). Override these by setting environment variables; for example, to override the weights for the projector during caching you could run `DOC_PROJECTOR_WEIGHTS_PATH=data/epoch-weights/doc-weights_epoch-30.generated.pt pdm run cache`.
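The sketch below shows one way such an override could be wired up: prefer the environment variable if it is set, otherwise download the weights artifact from wandb. The artifact name, file name, and team/project identifiers are placeholders; the real resolution logic lives in `src/util/artifacts.py`.

```python
# Hypothetical weights resolution: env-var override first, wandb artifact otherwise.
# Artifact and entity/project names are placeholders, not the project's real ones.
import os

import torch
import wandb


def resolve_doc_projector_weights() -> str:
    # e.g. DOC_PROJECTOR_WEIGHTS_PATH=data/epoch-weights/doc-weights_epoch-30.generated.pt
    override = os.environ.get("DOC_PROJECTOR_WEIGHTS_PATH")
    if override:
        return override

    # Fall back to the latest weights artifact on wandb (names assumed).
    artifact = wandb.Api().artifact("my-team/doc-search/doc-projector-weights:latest")
    return os.path.join(artifact.download(), "doc-weights.pt")


state_dict = torch.load(resolve_doc_projector_weights(), map_location="cpu")
```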
- Run `pdm run build` to build the server Docker image and push it to Docker Hub
- Open `inventory.ini` and update it to reflect the IP and port of the server you want to deploy to
- Run `pdm run ansible` to run the `playbook.yml` file, which should SSH to the remote server and launch a Chroma instance and the server

Note that the server will take a long time to start up because it needs to run the document caching logic from `pdm run cache` to insert the document vectors into Chroma before it can handle requests. You can check its progress by SSH-ing to the server and running `sudo docker compose -f /root/mlx/docker-compose.yml logs server`.
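For context on what handling a request involves once the cache is populated: a search is essentially "embed the query, then ask Chroma for the nearest document vectors". The sketch below shows that flow with the `chromadb` client; the collection name, number of results, and the stand-in query encoder are assumptions, not the server's actual code.

```python
# Illustrative query path once the document vectors are cached in Chroma.
# The query encoder is a stand-in; the real server uses its trained query encoder.
import hashlib

import chromadb


def fake_embed(text: str) -> list[float]:
    # Placeholder query embedding so the sketch is self-contained.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:16]]


def search(query: str, n_results: int = 5) -> list[str]:
    client = chromadb.HttpClient(host="localhost", port=8000)  # assumed Chroma address
    collection = client.get_or_create_collection("documents")  # assumed collection name
    results = collection.query(query_embeddings=[fake_embed(query)], n_results=n_results)
    return results["documents"][0]                             # top matching document texts


if __name__ == "__main__":
    print(search("example query"))
```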