
Conversation

@gisetia (Member) commented Dec 1, 2025

Expose an OpenAI-compatible /chat/completions endpoint (via FastAPI) so LibreChat can call our agent as a custom endpoint. This keeps the same request/response shape while letting us fully track and control the agent behavior in our stack.
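For orientation, a minimal sketch of the kind of endpoint described above. `ChatCompletionRequest` is the request model named in this PR; the response fields follow the OpenAI chat-completions shape. This is not the PR's actual handler, just an illustration of the request/response contract LibreChat expects:

```python
import time
import uuid

from fastapi import FastAPI
from pydantic import BaseModel, ConfigDict

app = FastAPI()


class ChatCompletionRequest(BaseModel):
    # Keep any extra OpenAI-style parameters LibreChat sends instead of rejecting them.
    model_config = ConfigDict(extra="allow")
    model: str
    messages: list[dict]
    stream: bool = False


@app.post("/chat/completions")
async def chat_completions(req: ChatCompletionRequest):
    answer = "..."  # placeholder for the agent call
    # Wrap the answer in an OpenAI-compatible response body so LibreChat can consume it.
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": answer},
                "finish_reason": "stop",
            }
        ],
    }
```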

@forus (Contributor) commented Dec 1, 2025

@gisetia brilliant idea!

@forus (Contributor) commented Dec 1, 2025

@gisetia Is there any LibreChat documentation that explains how custom endpoints interface with the system? When you add a custom endpoint, does it replace the default one, or does it appear as an additional option that users can select?

@gisetia (Member, Author) commented Dec 1, 2025

@forus

> @gisetia Is there any LibreChat documentation that explains how custom endpoints interface with the system? When you add a custom endpoint, does it replace the default one, or does it appear as an additional option that users can select?

[screenshot: LibreChat model selector showing the custom endpoint among the available providers]

It appears in the top left corner as a provider from which you can choose a model.

@gisetia gisetia requested a review from inodb December 9, 2025 10:50
@gisetia gisetia marked this pull request as ready for review December 9, 2025 10:51
@gisetia gisetia requested review from forus, inodb and pieterlukasse and removed request for forus, inodb and pieterlukasse December 9, 2025 14:07
README.md Outdated
Both `detailed` and `brief` fields are optional. You may include either, both, or none, depending on the level of step detail you wish to provide.
- **LLM Outputs**: Markdown files containing SQL queries, named as `<question_number>.md`.

### Output
Contributor:
I am not sure what's happened here that this text is shown with dark red background.

```python
@app.post("/chat/completions")
async def chat_completions(req: ChatCompletionRequest):
    print("-- -- [api] incoming request", req.model_dump(exclude_none=True))
    if getattr(req, "model_extra", None):
```
Contributor:

what's this model_extra field for?
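(For context, not an answer from the PR: in Pydantic v2, when a model is configured with `extra="allow"`, any fields not declared on the model are retained and exposed via `model_extra`, so the check above presumably detects OpenAI parameters the handler does not explicitly support. A small illustration, assuming that configuration:)

```python
from pydantic import BaseModel, ConfigDict


class ChatCompletionRequest(BaseModel):
    model_config = ConfigDict(extra="allow")  # keep undeclared fields instead of rejecting them
    model: str
    messages: list[dict]


req = ChatCompletionRequest(model="db-agent", messages=[], temperature=0.2)
print(req.model_extra)  # {'temperature': 0.2} -- undeclared fields end up here
```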


#### Connecting to LibreChat

Connect the DB Agent to LibreChat as an OpenAI-compatible custom endpoint by adding this to `librechat.yaml`:
Contributor:
why is it called DB Agent? Isn't this repo for the end solution, the cBioPortal agent?

```yaml
default: ["<DB-agent-name>"]
titleConvo: true
titleModel: "<DB-agent-name>"
modelDisplayLabel: "<DB-agent-name>"
```
Contributor:
Are all of these fields mandatory? Which ones? What do they mean? Maybe add a brief description below.
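(For reference, a hedged sketch of the custom-endpoint block these fields sit in. Field names follow LibreChat's `librechat.yaml` custom-endpoint schema; the URL, key, and agent name are placeholders, and the README should confirm the exact values for this project:)

```yaml
endpoints:
  custom:
    - name: "<DB-agent-name>"           # label shown in the provider dropdown
      apiKey: "not-used"                # required by the schema even if the agent ignores it
      baseURL: "http://localhost:8000"  # where the FastAPI /chat/completions endpoint is served
      models:
        default: ["<DB-agent-name>"]    # model names offered for this endpoint
        fetch: false
      titleConvo: true                  # let this endpoint generate conversation titles
      titleModel: "<DB-agent-name>"     # model used for title generation
      modelDisplayLabel: "<DB-agent-name>"  # label shown on assistant messages
```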

```python
async def chat_completions(req: ChatCompletionRequest):
    print("-- -- [api] incoming request", req.model_dump(exclude_none=True))
    if getattr(req, "model_extra", None):
        print("-- -- [api] extra params ignored", req.model_extra)
```
Contributor:

This reads like an error/warning; it could be printed to stderr instead of stdout by adding `file=sys.stderr`.
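A minimal sketch of the suggested change, assuming the handler keeps the same print-based logging:

```python
import sys

if getattr(req, "model_extra", None):
    # Route the warning to stderr so it is not mixed into normal stdout output.
    print("-- -- [api] extra params ignored", req.model_extra, file=sys.stderr)
```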

```python
    await _asyncio_sleep(0.01)


async def _asyncio_sleep(delay: float):
```
Contributor:
why?

yield "data: [DONE]\n\n"


async def _small_sleep():
Contributor:
hm

```python
    if getattr(req, "model_extra", None):
        print("-- -- [api] extra params ignored", req.model_extra)

    question = _extract_user_question(req.messages)
```
Contributor:

Why do we extract only the last user question and not take advantage of the whole conversation history in this agent?
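(For illustration only: one way the handler could forward the whole history instead of reducing it to the last user turn. This assumes the messages arrive as OpenAI-style dicts with `role`/`content` keys, and `ask_messages` is a hypothetical method, not something the LLM client currently exposes:)

```python
# Hypothetical alternative, not in this PR: pass the full OpenAI-style history
# to the client instead of only the last user question.
history = [
    {"role": m["role"], "content": m["content"]}
    for m in req.messages
    if m.get("role") in ("system", "user", "assistant")
]
answer = await llm_client.ask_messages(history)  # hypothetical method name
```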

```python
    except ValueError as exc:
        raise HTTPException(status_code=400, detail=str(exc))

    answer = await llm_client.ask_question(question)
```
Contributor:

Are we losing actual streaming here by waiting for the complete string?

@gisetia (Member, Author) commented Dec 10, 2025:

The LLM client does not support streaming currently

```python
    if sql_markdown:
        answer = f"{answer}\n\n---\n{sql_markdown}"

    chunks = _chunk_answer(answer)
```
Contributor:

Why do we simulate chunking instead of using actual chunks coming from the LLM client?
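(For context, the pattern under discussion looks roughly like this: the complete answer is split into artificial chunks and replayed as OpenAI-style SSE events with a tiny sleep in between, rather than relaying real chunks from the LLM client, which per the author does not stream yet. A hedged sketch, not the PR's exact code; `_chunk_answer` is the helper referenced above:)

```python
import asyncio
import json
import time
import uuid


async def _stream_simulated(answer: str, model: str):
    """Replay a fully materialized answer as OpenAI-style SSE chunks."""
    completion_id = f"chatcmpl-{uuid.uuid4().hex}"
    created = int(time.time())
    for piece in _chunk_answer(answer):  # split the finished answer into small pieces
        event = {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": model,
            "choices": [{"index": 0, "delta": {"content": piece}, "finish_reason": None}],
        }
        yield f"data: {json.dumps(event)}\n\n"
        await asyncio.sleep(0.01)  # small pause so the client renders progressively
    yield "data: [DONE]\n\n"
```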

@gisetia gisetia reopened this Dec 11, 2025