Pull requests: huggingface/text-generation-inference
Pull requests list
[DOCS] Add Google Cloud TGI integration via dedicated DLCs
#2612
opened Oct 5, 2024 by
alvarobartt
1 of 5 tasks
feat: propagate max_concurrent_requests to queue state entries instead of hardcoded value in v2 and v3 backends
#2578
opened Sep 26, 2024 by
Venkat2811
feat: enable pytorch xpu support for non-attention models
#2561
opened Sep 24, 2024 by
dvrogozh
do not set sliding_window if SUPPORTS_WINDOWING is false
#2554
opened Sep 24, 2024 by
sywangyi
5 tasks
CI for add gptq and awq int4 support in intel platform
#2494
opened Sep 5, 2024 by
ErikKaum
fix: skip cuda graphs that will oom and improve free memory logging
#2450
opened Aug 22, 2024 by
drbh
add gptq and awq int4 support in intel platform
#2444
opened Aug 22, 2024 by
sywangyi
5 tasks
[TENSORRT-LLM] - Implement new looper thread based backend
#2357
opened Aug 2, 2024 by
mfuntowicz
Draft