Add FastAPI OpenAI-compatible endpoint for ask command #9
base: main
Conversation
@gisetia brilliant idea!

@gisetia Is there any LibreChat documentation that explains how custom endpoints interface with the system? When you add a custom endpoint, does it replace the default one, or does it appear as an additional option that users can select?
It appears in the top left corner as a provider from which you can choose a model. |
README.md (Outdated)
Both `detailed` and `brief` fields are optional. You may include either, both, or none, depending on the level of step detail you wish to provide.
- **LLM Outputs**: Markdown files containing SQL queries, named as `<question_number>.md`.

### Output
I am not sure what happened here that this text is shown with a dark red background.
src/cbioportal_mcp_qa/api.py (Outdated)
@app.post("/chat/completions")
async def chat_completions(req: ChatCompletionRequest):
    print("-- -- [api] incoming request", req.model_dump(exclude_none=True))
    if getattr(req, "model_extra", None):
what's this model_extra field for?
#### Connecting to LibreChat

Connect the DB Agent to LibreChat as an OpenAI-compatible custom endpoint by adding this to librechat.yaml:
Why is it called DB Agent? Isn't this repo for the end solution, the cBioPortal agent?
default: ["<DB-agent-name>"]
titleConvo: true
titleModel: "<DB-agent-name>"
modelDisplayLabel: "<DB-agent-name>"
are all fields mandatory? which are? what do they mean? maybe add a brief description below.
src/cbioportal_mcp_qa/api.py (Outdated)
async def chat_completions(req: ChatCompletionRequest):
    print("-- -- [api] incoming request", req.model_dump(exclude_none=True))
    if getattr(req, "model_extra", None):
        print("-- -- [api] extra params ignored", req.model_extra)
This sounds like an error/warning; it could be printed to stderr instead of stdout by adding `file=sys.stderr`.
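For illustration, the suggested change would look roughly like this (hypothetical helper name; the `req` handling mirrors the quoted diff):

```python
import sys


def warn_ignored_extras(extras: dict) -> None:
    """Hypothetical helper: report ignored request parameters on stderr,
    keeping warning-style output out of stdout."""
    print("-- -- [api] extra params ignored", extras, file=sys.stderr)


# In the handler, roughly:
# if getattr(req, "model_extra", None):
#     warn_ignored_extras(req.model_extra)
```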
await _asyncio_sleep(0.01)

async def _asyncio_sleep(delay: float):
why?
yield "data: [DONE]\n\n"

async def _small_sleep():
hm
src/cbioportal_mcp_qa/api.py (Outdated)
if getattr(req, "model_extra", None):
    print("-- -- [api] extra params ignored", req.model_extra)

question = _extract_user_question(req.messages)
Why do we extract only the last user question and not take advantage of the whole conversation history in this agent?
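As a sketch of the alternative being asked about (hypothetical helper; message objects with `role`/`content` fields are assumed from the diff), the handler could fold the full history into the prompt:

```python
def build_prompt_from_history(messages) -> str:
    """Hypothetical sketch: concatenate the whole conversation
    instead of extracting only the last user message."""
    lines = []
    for msg in messages:
        role = getattr(msg, "role", "user")
        content = getattr(msg, "content", "") or ""
        lines.append(f"{role}: {content}")
    return "\n".join(lines)
```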
except ValueError as exc:
    raise HTTPException(status_code=400, detail=str(exc))

answer = await llm_client.ask_question(question)
Are we losing actual streaming here by waiting for the complete string?
The LLM client does not support streaming currently
if sql_markdown:
    answer = f"{answer}\n\n---\n{sql_markdown}"

chunks = _chunk_answer(answer)
Why do we simulate chunking instead of using the actual chunks coming from the LLM client?
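For context, judging from the quoted helpers (`_chunk_answer`, `_small_sleep`, the `data: [DONE]` terminator), the simulated streaming presumably looks something like this sketch; the names, chunk size, and delay here are assumptions, not the actual PR code:

```python
import asyncio
import json
import time


def _chunk_answer(answer: str, size: int = 40) -> list[str]:
    """Split a finished answer into small pieces to imitate streamed deltas."""
    return [answer[i:i + size] for i in range(0, len(answer), size)]


async def stream_simulated_chunks(answer: str, model: str):
    """Yield OpenAI-style SSE chunks for a pre-computed answer string."""
    created = int(time.time())
    for piece in _chunk_answer(answer):
        chunk = {
            "id": "chatcmpl-sim",
            "object": "chat.completion.chunk",
            "created": created,
            "model": model,
            "choices": [
                {"index": 0, "delta": {"content": piece}, "finish_reason": None}
            ],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
        await asyncio.sleep(0.01)  # small pause so clients render progressively
    yield "data: [DONE]\n\n"
```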

Expose an OpenAI-compatible /chat/completions endpoint (via FastAPI) so LibreChat can call our agent as a custom endpoint. This keeps the same request/response shape while letting us fully track and control the agent behavior in our stack.
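For reviewers who want the shape at a glance, here is a minimal, hypothetical sketch of such an endpoint (non-streaming case), assuming a pydantic `ChatCompletionRequest` like the one in the diff; the placeholder answer stands in for the actual `llm_client.ask_question` call:

```python
import time

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, ConfigDict

app = FastAPI()


class ChatMessage(BaseModel):
    role: str
    content: str


class ChatCompletionRequest(BaseModel):
    # Unknown OpenAI parameters are kept and exposed via req.model_extra.
    model_config = ConfigDict(extra="allow")

    model: str
    messages: list[ChatMessage]
    stream: bool = False


@app.post("/chat/completions")
async def chat_completions(req: ChatCompletionRequest):
    if not req.messages:
        raise HTTPException(status_code=400, detail="messages must not be empty")
    # The PR extracts the user's question from the message list; here we just
    # take the last message for illustration.
    question = req.messages[-1].content
    answer = f"(placeholder answer to: {question})"  # real code awaits the LLM client here
    return {
        "id": "chatcmpl-sim",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": answer},
                "finish_reason": "stop",
            }
        ],
    }
```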