Integration with Hugging Face endpoints #1082
(Premise: I'm a total n00b about AI, and about Cheshire 😸 I'm just onboarding.)

Dear Cheshire hackers, I'm trying to integrate Cheshire with literally any non-proprietary AI (i.e. avoiding ChatGPT), so I tried to connect Cheshire to Hugging Face, since it seems an interesting "neutral/agnostic AI proxy". As far as I understand it's possible; for example, the Hugging Face "Playground" supports multiple models.

My Hugging Face token works with cURL, and the response shows it is authorized to "Make calls to Inference Providers".

**Question:** How do I plug Cheshire into Hugging Face? What is the Endpoint Url?

**What I tried:** I selected "HuggingFace Endpoint" with an Endpoint Url, but in the console I get an error. I'm probably using a wrong endpoint.

Thanks for any small help 🙏 Sorry for this totally stupid question.

P.S. Official documentation (?):

- https://cheshirecat.ai/custom-large-language-model/
- https://huggingface.co/docs/hub/api

I'm a bit surprised to have found nothing about this topic:

- https://github.com/cheshire-cat-ai/core/discussions?discussions_q=HuggingFace
- https://github.com/cheshire-cat-ai/core/discussions?discussions_q=Hugging+Face

Thanks and sorry for being so n00b 🙏 🐱
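For context, Hugging Face's Inference Providers are reachable through an OpenAI-compatible chat-completions route (per the HF docs; the model id below is just an illustrative placeholder). A minimal sketch of the request any client would need to send:

```python
import json

# Placeholder token; in practice this comes from your Hugging Face account settings.
HF_TOKEN = "hf_xxx"

# OpenAI-compatible router for Inference Providers (assumption based on the HF docs).
url = "https://router.huggingface.co/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}

# Illustrative model id; use one your enabled providers actually serve.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello from the Cheshire Cat!"}],
}

body = json.dumps(payload)
# To actually send it: requests.post(url, headers=headers, data=body)
print(body)
```

A successful response here confirms the token and the route, but it does not mean Cheshire's "HuggingFace Endpoint" adapter expects the same URL, since that adapter targets a different service.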


Hi @valerio-bozzolan ,
Hugging Face "Endpoints" are a dedicated paid service people use in production.
We had an adapter for the public API, but for some reason (I remember low availability and too much variance in model inputs/outputs) we ditched it.
I'm not sure at the moment what works with HF and what doesn't, and I'm not investing time in it. Most people running local models use Ollama or vLLM, or one of the many other tools you can reach via the OpenAI-compatible adapter.
Still, you can write your own LLM adapter (see the plugins already published for Groq or TogetherAI).
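As a sketch of that OpenAI-compatible route: pointing the Cat at a local Ollama server amounts to filling the OpenAI-compatible LLM settings with values like these (the field names are illustrative, check the actual admin form of your version; the `/v1` path and port 11434 are Ollama's defaults):

```json
{
  "base_url": "http://localhost:11434/v1",
  "model_name": "llama3.1",
  "api_key": "ollama"
}
```

Ollama ignores the API key, but many OpenAI-compatible clients require a non-empty value, hence the dummy `"ollama"` string.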
Peace and thank you for playing with the cat ;)
Welcome