Repro steps:
1. Create the Docker containers via the quick start script, or manually copy `docker-compose.yml`.
2. Edit the `.env` file to include `DISABLE_MODEL_SERVER=True` (or delete the `inference_model_server` service from the Docker Compose file).
3. Start the containers with `docker compose up -d` (`docker compose start` does not accept `-d`).

Expected behavior: The Onyx API and web server start correctly without `inference_model_server`.

Actual behavior: The Onyx API crashes at `search_nlp_models.py`, line 1069.

Suggested fix: Modify `search_nlp_models.py` to respect the `DISABLE_MODEL_SERVER` flag, so that the app runs with the model server off.
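
A minimal sketch of what the suggested fix could look like, not the actual Onyx code. The helper names `model_server_disabled` and `warm_up_model_server`, the `ping` callable, and the set of accepted truthy values are all hypothetical; only the `DISABLE_MODEL_SERVER` variable name comes from this report. The idea is to check the flag before any code path that contacts the model server:

```python
import os
from typing import Callable, Mapping, Optional

# Assumption: these spellings count as "enabled"; the report only shows "True".
_TRUTHY = {"1", "true", "yes"}


def model_server_disabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when DISABLE_MODEL_SERVER is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("DISABLE_MODEL_SERVER", "").strip().lower() in _TRUTHY


def warm_up_model_server(
    ping: Callable[[], None],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Hypothetical startup hook: contact the model server unless it is disabled.

    `ping` stands in for whatever request the real code sends.
    Returns True if the server was contacted, False if the call was skipped.
    """
    if model_server_disabled(env):
        # Skip the request so startup does not fail on an unreachable server.
        return False
    ping()
    return True
```

With a guard like this, the startup path would return early instead of attempting a connection to a service that was never started.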