api inference mini fork #109
Open
oOraph
wants to merge
26
commits into
main
from
dev/api-inference-mini-fork
Commits (26):
6d08e97 customize default num inference steps (oOraph)
7e01334 default content type env var (oOraph)
2d39740 default accept env var (oOraph)
d41f536 content type case ignore (oOraph)
b0f1b2d Diffusers, txt2img (and img2img when supported), make sure guidance s… (oOraph)
77e870a api inference compat response (oOraph)
fc71ab9 fix: content-type and accept parsing (oOraph)
60745f3 Multi task support + /pipeline/<task> support for api-inference backw… (oOraph)
ba52d1e substitute /pipeline/sentence-embeddings to /pipeline/feature-extract… (oOraph)
33d23f3 application/octet-stream support in content type deserialization (oOraph)
5bbf5a9 fix(api inference): compat for text-classification token-classification (oOraph)
422c7b2 fix: token classification api-inference-compat (oOraph)
ae367fd add timm dependency (for object detection) (oOraph)
1565769 fix(api-inference): feature-extraction, flatten array, discard the ba… (oOraph)
3c75bcb minor: make quality (oOraph)
f6e1f85 install hf_xet (oOraph)
d14b5c7 fix: avoid returning none as a serializer (oOraph)
603ce84 fix: de/serializer is not optional, do not support content type which… (oOraph)
77d9b12 feat(memory): reduce memory footprint on idle service (oOraph)
bb1eded Dockerfile refacto: split requirements and source code layers (oOraph)
088cad0 fix: minor, idle unload distinguish sleep time and timeout (oOraph)
2eda42a fix: image segmentation on hf inference (oOraph)
0bdb7c2 feat(hf-inference): disable custom handler (oOraph)
3daa1ad minor: dockerfile (oOraph)
bb2a6c3 quality check (oOraph)
a781375 fix tests (oOraph)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
kenlm@ git+https://github.com/kpu/kenlm@ba83eafdce6553addd885ed3da461bb0d60f8df7 | ||
transformers[audio,sentencepiece,sklearn,vision]==4.51.3 | ||
huggingface_hub[hf_transfer,hf_xet]==0.31.1 | ||
Pillow | ||
librosa | ||
pyctcdecode>=0.3.0 | ||
phonemizer | ||
ffmpeg | ||
starlette | ||
uvicorn | ||
gunicorn | ||
pandas | ||
orjson | ||
einops | ||
timm | ||
sentence_transformers==4.0.2 | ||
diffusers==0.33.1 | ||
accelerate==1.6.0 | ||
torch==2.5.1 | ||
torchvision | ||
torchaudio | ||
peft==0.15.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,6 @@ | ||
import os | ||
|
||
|
||
def strtobool(val: str) -> bool: | ||
"""Convert a string representation of truth to True or False booleans. | ||
True values are 'y', 'yes', 't', 'true', 'on', and '1'; false values | ||
|
@@ -20,3 +23,11 @@ def strtobool(val: str) -> bool: | |
raise ValueError( | ||
f"Invalid truth value, it should be a string but {val} was provided instead." | ||
) | ||
|
||
|
||
def api_inference_compat(): | ||
Review comment: with this env var we intend to handle the small response differences between the API inference widgets on the Hub and on the Endpoints UI. TODO: we should probably unify both widgets instead. |
||
return strtobool(os.getenv("API_INFERENCE_COMPAT", "false")) | ||
|
||
|
||
def ignore_custom_handler(): | ||
return strtobool(os.getenv("IGNORE_CUSTOM_HANDLER", "false")) |
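
The two helpers above follow the same pattern: read an environment variable, default to `"false"`, and parse it as a boolean. A minimal, self-contained sketch of that pattern is given below. The names `strtobool_sketch` and `env_flag` are hypothetical and not part of this PR; the set of false values and the case-insensitive matching are assumptions, since the diff elides that part of `strtobool`.

```python
import os

# True values come from the docstring shown in the diff.
_TRUE_VALUES = {"y", "yes", "t", "true", "on", "1"}
# Assumption: the false values are elided in the diff; this set mirrors
# the usual distutils-style convention.
_FALSE_VALUES = {"n", "no", "f", "false", "off", "0"}


def strtobool_sketch(val: str) -> bool:
    """Hypothetical stand-in for the strtobool helper in the diff."""
    if not isinstance(val, str):
        raise ValueError(
            f"Invalid truth value, it should be a string but {val} was provided instead."
        )
    # Assumption: matching ignores case and surrounding whitespace.
    lowered = val.strip().lower()
    if lowered in _TRUE_VALUES:
        return True
    if lowered in _FALSE_VALUES:
        return False
    raise ValueError(f"Invalid truth value: {val!r}")


def env_flag(name: str, default: str = "false") -> bool:
    """Read a boolean feature flag from the environment (hypothetical helper)."""
    return strtobool_sketch(os.getenv(name, default))


if __name__ == "__main__":
    os.environ["API_INFERENCE_COMPAT"] = "true"
    print(env_flag("API_INFERENCE_COMPAT"))   # flag explicitly enabled
    os.environ.pop("IGNORE_CUSTOM_HANDLER", None)
    print(env_flag("IGNORE_CUSTOM_HANDLER"))  # unset, falls back to "false"
```

Defaulting to `"false"` keeps both behaviours opt-in, so a deployment that sets neither variable behaves as before.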
Review comment: useful for SD 3.5 Turbo. We want the guidance scale to default to 0 (i.e. when the user does not specify it) because the number of inference steps is too low; with that default, the generated images are OK.
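
The defaulting behaviour described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: the function name `apply_generation_defaults` and its arguments are hypothetical, and how the service actually obtains its defaults (for example from environment variables) is not shown in this excerpt. Only the keyword names `num_inference_steps` and `guidance_scale` are standard diffusers pipeline arguments.

```python
from typing import Any, Dict, Optional


def apply_generation_defaults(
    user_params: Dict[str, Any],
    default_steps: Optional[int] = None,
    default_guidance_scale: Optional[float] = None,
) -> Dict[str, Any]:
    """Fill in service-level defaults without overriding user-supplied values.

    Hypothetical helper: the defaults would be configured per deployment,
    e.g. a low step count and a guidance scale of 0 for a turbo model.
    """
    params = dict(user_params)  # copy so the caller's dict is not mutated
    if default_steps is not None:
        params.setdefault("num_inference_steps", default_steps)
    if default_guidance_scale is not None:
        params.setdefault("guidance_scale", default_guidance_scale)
    return params


if __name__ == "__main__":
    # No user parameters: both service defaults are applied.
    print(apply_generation_defaults({}, default_steps=4, default_guidance_scale=0.0))
    # The user sets guidance_scale explicitly: their value is kept.
    print(apply_generation_defaults({"guidance_scale": 3.5}, 4, 0.0))
```

Using `setdefault` means an explicit user value always wins, which matches the "when not specified by user" condition in the comment.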