fix: use exec in entrypoint.sh to handle signals correctly via uvicorn #94

Merged
alvarobartt merged 1 commit into main from fix/entrypoint-graceful-shutdown on Oct 28, 2024

Conversation

@co42 co42 (Contributor) commented Oct 23, 2024

Right now, when a pod running this container is deleted, a SIGTERM is sent to entrypoint.sh but is not forwarded to uvicorn.
There is no graceful shutdown, and the server keeps running until a SIGKILL stops it abruptly.
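
For context on the fix: `exec` makes the shell replace itself with the `uvicorn` process, so `uvicorn` becomes the container's PID 1 and receives SIGTERM directly, allowing a graceful shutdown. A minimal sketch of an entrypoint along these lines; the module path, host, and port are illustrative placeholders, not necessarily what this toolkit actually uses:

```bash
#!/bin/bash
# Without exec, the shell stays as the container's PID 1 and does not
# forward SIGTERM to uvicorn, so the server keeps running until SIGKILL.
# With exec, uvicorn replaces the shell process, receives SIGTERM
# directly, and can shut down gracefully.
# NOTE: the module path, host, and port below are illustrative placeholders.
exec uvicorn webservice_starlette:app --host 0.0.0.0 --port 5000
```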

@co42 co42 marked this pull request as draft October 23, 2024 14:26
@co42 co42 marked this pull request as ready for review October 23, 2024 14:53
@alvarobartt alvarobartt added the bug Something isn't working label Oct 28, 2024
@alvarobartt alvarobartt merged commit 902c268 into main Oct 28, 2024
6 checks passed
@alvarobartt alvarobartt deleted the fix/entrypoint-graceful-shutdown branch October 28, 2024 07:52
alvarobartt added a commit to huggingface/Google-Cloud-Containers that referenced this pull request Oct 28, 2024
alvarobartt added a commit to huggingface/Google-Cloud-Containers that referenced this pull request Oct 30, 2024
* Add `pytorch/inference/gpu/2.3.1/transformers/4.46.0/py311` (WIP)

- Include missing `requirements.txt` installation in `entrypoint.sh`
(required to install custom dependencies with custom models)
- Fix Python 3.11 installation as it was not properly installed and
Python 3.10 was used instead
- Use `uv` to install the dependencies as it's way faster than default
`pip`
- Also, `uv` is able to successfully install `kenlm`, which is a
`transformers` dependency that `pip` is not able to install when
building the `Dockerfile`
- Tested with some of the latest models that those bumped dependencies
support, such as Gemma2, Llama3.2, StableDiffusion 3.5, and more

* Remove `uv` and don't upgrade `setuptools`

Just fixing the Python 3.11 and `pip` installation already solves the
installation issue affecting `kenlm`, so there is no need to add `uv`
for the moment, even though it would be a nice addition

* Add `pytorch/inference/cpu/2.3.1/transformers/4.46.0/py311`

* Update `pip install` syntax when installing from URL

* Add `exec` to `uvicorn` in `entrypoint.sh`

Kudos to @co42 for the catch at
huggingface/huggingface-inference-toolkit#94

* Remove extra line-break in `Dockerfile`

* Update `HF_INFERENCE_TOOLKIT_VERSION` to 0.5.1

See the latest `huggingface-inference-toolkit` release at
https://github.com/huggingface/huggingface-inference-toolkit/releases/tag/0.5.1

* Bump `transformers` to 4.46.1 in `huggingface-inference-toolkit`

`transformers` 4.46.0 was yanked because Python 3.8 support was
unintentionally dropped; 4.46.1 also fixes some issues affecting both
`torch.fx` and `onnx`

Co-authored-by: Philipp Schmid <[email protected]>

---------

Co-authored-by: Philipp Schmid <[email protected]>
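
One of the commits above updates the `pip install` syntax when installing from a URL. For context, a hedged sketch of the common PEP 508 direct-reference form, pinning to the 0.5.1 release tag mentioned above; the package name, extras, and exact URL form used in the image are assumptions, not taken from the actual Dockerfile:

```bash
# Illustrative only: pin the toolkit to the 0.5.1 tag via a PEP 508
# direct reference; the actual image may use different extras or a
# different URL form.
pip install "huggingface-inference-toolkit @ git+https://github.com/huggingface/huggingface-inference-toolkit.git@0.5.1"
```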