Add OpenAI-compatible custom endpoints as fourth LLM option#34

Open
SimplePotat wants to merge 7 commits into
ronron-gh:mainfrom
SimplePotat:main

@SimplePotat

@SimplePotat SimplePotat commented May 21, 2026

Adds a new LLM type (LLM_TYPE_CUSTOM_OPENAI = 4) that lets the ChatGPT client target any OpenAI-compatible HTTP endpoint instead of just api.openai.com. Default behavior for existing configs is unchanged.

Notes:

  • "Robot::initLLM" falls through "LLM_TYPE_CUSTOM_OPENAI" to the existing "ChatGPT" client
  • "ChatGPT::https_post_json" is renamed to "ChatGPT::post_json" since it now handles both schemes. Downstream forks subclassing "ChatGPT" or calling it directly will need to update the name.
  • Fully backward compatible. Configs that don't set customEndpoint use the existing OpenAI flow unchanged.
  • Two new optional keys under "llm:" in "SC_ExConfig.yaml":
llm:
  type: 4
  customEndpoint: "http://192.168.X.XXX:8080/v1/chat/completions"
  # or, for https:
  # customEndpoint: "https://my-llm.example.com/v1/chat/completions"
  # customRootCAFile: "/customRootCA.pem"   # required for https endpoints
  • customRootCAFile is a path on the SD card to a PEM-formatted root CA.
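
The fall-through described in the notes above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: the `ChatGPTClient` struct, the `initLLM` free function, its signature, and the value of `LLM_TYPE_CHATGPT` are all hypothetical stand-ins. Only `LLM_TYPE_CUSTOM_OPENAI = 4` and the `customEndpoint` key come from the description.

```cpp
#include <string>

// Only LLM_TYPE_CUSTOM_OPENAI = 4 is stated in the PR description;
// the value used for LLM_TYPE_CHATGPT here is an assumption.
enum LlmType {
    LLM_TYPE_CHATGPT = 0,
    LLM_TYPE_CUSTOM_OPENAI = 4,
};

// Hypothetical stand-in for the real ChatGPT client: it only records
// which URL requests would be posted to.
struct ChatGPTClient {
    std::string endpoint;
};

const char* const kDefaultEndpoint =
    "https://api.openai.com/v1/chat/completions";

// Returns true when `type` maps onto the ChatGPT client and fills `out`.
// An empty customEndpoint keeps the stock OpenAI URL.
bool initLLM(int type, const std::string& customEndpoint, ChatGPTClient& out) {
    switch (type) {
        case LLM_TYPE_CUSTOM_OPENAI:  // falls through to the ChatGPT case
        case LLM_TYPE_CHATGPT:
            out.endpoint =
                customEndpoint.empty() ? kDefaultEndpoint : customEndpoint;
            return true;
        default:
            return false;  // other LLM types are handled elsewhere
    }
}
```

Sharing one case body is what keeps the change small: the custom type reuses the existing client, and only the target URL differs.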

Security Decisions:

  • https:// with no CA loaded = request refused. If the user sets a custom https endpoint but no custom CA is loaded, the request fails at send time and an error message is displayed on Stack Chan with a sad face. A silent fallback would either send the configured API key to the wrong host or downgrade an https intent to an unverified connection.
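
The refusal rule above amounts to a small decision on the endpoint's scheme. The sketch below illustrates that rule for custom endpoints only. It is an assumption-laden illustration rather than the PR's `post_json` code: the `Transport` enum, `chooseTransport`, and the `customCaLoaded` flag are hypothetical names.

```cpp
#include <string>

// Hypothetical outcome of the scheme check for a custom endpoint.
enum class Transport { PlainHttp, VerifiedHttps, Refused };

// True when `s` begins with `prefix`.
static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Decides how a request to a custom endpoint would be sent.
// https without a loaded CA is refused rather than silently downgraded.
Transport chooseTransport(const std::string& endpoint, bool customCaLoaded) {
    if (startsWith(endpoint, "http://")) {
        return Transport::PlainHttp;
    }
    if (startsWith(endpoint, "https://")) {
        return customCaLoaded ? Transport::VerifiedHttps : Transport::Refused;
    }
    return Transport::Refused;  // unknown scheme: never guess
}
```

A caller would show the error on screen whenever the result is `Transport::Refused`, so a misconfiguration is visible instead of being papered over.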

@SimplePotat
Author

※ This is an AI translation. Please excuse any unnatural phrasing.


Adds a new LLM type (LLM_TYPE_CUSTOM_OPENAI = 4) that lets the ChatGPT client target any OpenAI-compatible HTTP endpoint, not just api.openai.com. Default behavior for existing configs is unchanged.

Notes:

  • Robot::initLLM falls through to the existing ChatGPT client for LLM_TYPE_CUSTOM_OPENAI.
  • ChatGPT::https_post_json is renamed to ChatGPT::post_json because it now handles both schemes (http/https). Downstream forks that subclass ChatGPT or call it directly will need to update the name.
  • Full backward compatibility is maintained. Configs that do not set customEndpoint use the existing OpenAI flow as-is.
  • Two new optional keys are added under llm: in SC_ExConfig.yaml:
llm:
  type: 4
  customEndpoint: "http://192.168.X.XXX:8080/v1/chat/completions"
  # or, for https, configure as follows:
  # customEndpoint: "https://my-llm.example.com/v1/chat/completions"
  # customRootCAFile: "/customRootCA.pem"   # required for https endpoints
  • customRootCAFile is a path on the SD card to a PEM-formatted root CA certificate.

Security design decisions:

  • An https:// request with no CA loaded is refused. If the user sets a custom https endpoint but no custom CA is loaded, the request fails at send time and an error message is shown on Stack Chan's screen with a sad face. A silent fallback would either send the configured API key to the wrong host or downgrade an https intent to an unverified connection, which is why it behaves this way.

@ronron-gh
Owner

Thank you for your PR. We are currently withholding a decision on whether to merge it for the following reasons. We appreciate your consideration.

  • I am unable to provide an "arbitrary OpenAI-compatible API," and therefore cannot test or maintain it on my end. (Are you envisioning something like a local LLM or Azure OpenAI?)

  • I am unsure whether there is a need for this feature among general AI_StackChan_Ex users (I don't want to add unnecessary features and complicate things). There is absolutely no problem with you making a forked repository public for a select group of users.

  • If you release it to general users, please update the README so that it explains what the features are and how to use them.

@SimplePotat
Author

SimplePotat commented May 23, 2026

Thanks for taking the time to review the PR and thank you for your wonderful project! I'll leave the ultimate decision on whether it's worth merging or keeping in a separate fork up to you.

  • As far as OpenAI-compatible endpoints are concerned, this covers remote endpoints such as OpenRouter, Claude via the official Anthropic compatibility layer, Groq, Cerebras, Mistral, DeepSeek and other smaller providers, as well as local solutions such as llama.cpp, Ollama, LM Studio and vLLM.

  • As far as simplicity is concerned, perhaps rolling back the https plumbing and supporting only local LLM solutions over regular unencrypted http would better suit your design philosophy, given that it would remove all the jargon surrounding root cert configuration?

  • Regardless of whether it's merged or remains forked, I'll be shifting commentary surrounding the feature to a proper README update like you advised.

@ronron-gh
Owner

Thank you for considering this. I understand. There seems to be a need for such an OpenAI-compatible endpoint, so I will consider merging it.
If being able to test it with a remote endpoint is useful and it doesn't affect other functions, I think it's fine to keep HTTPS.
Thank you also for considering updating the README.
I'm also interested and would like to try it myself if possible. Are there any of the options you listed that are easy to try?

@SimplePotat
Author

SimplePotat commented May 24, 2026

Probably the easiest way to give it a try for yourself would be installing Ollama on your PC, launching it with "ollama serve gemma4:e2b" and then entering http://:11434/v1/chat/completions as the custom endpoint url in the config. I wouldn't recommend this setup as a daily driver, but Ollama is the quickest backend to set up. Gemma 4 E2B at Q4 isn't very smart at all, but it will run on any hardware and makes for a relatively small download.

For HTTPS, my current implementation is a little simplistic in hindsight. I feel like I should probably add support for pinning multiple root certs before merging it, since some of the providers I listed require multiple certificates and thus won't work with the current code. If you want to try it out right now, though, you can use Google's non-realtime API at https://generativelanguage.googleapis.com/v1beta/openai/chat/completions and plug in a copy of the root cert you already use for the realtime API.
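
One way multiple root certs could be supplied is a single file holding several PEM blocks back to back. The sketch below shows how such a bundle could be split into individual certificates. It is an illustration under that assumption only and does not describe how this PR implements multi-certificate support. `splitPemBundle` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits a text bundle into its individual PEM certificate blocks.
// Text outside BEGIN/END markers is skipped; a block with no END
// marker is treated as truncated and ignored.
std::vector<std::string> splitPemBundle(const std::string& bundle) {
    static const std::string kBegin = "-----BEGIN CERTIFICATE-----";
    static const std::string kEnd = "-----END CERTIFICATE-----";

    std::vector<std::string> certs;
    std::size_t pos = 0;
    while (true) {
        const std::size_t begin = bundle.find(kBegin, pos);
        if (begin == std::string::npos) break;

        std::size_t end = bundle.find(kEnd, begin + kBegin.size());
        if (end == std::string::npos) break;  // truncated final block

        end += kEnd.size();
        certs.push_back(bundle.substr(begin, end - begin));
        pos = end;
    }
    return certs;
}
```

Counting the returned blocks at startup would also give a cheap sanity check that the file on the SD card actually contains at least one certificate.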

If you want to set up a local model to use on the regular rather than just to test the PR I'd go with llama.cpp as the backend and my model recommendation would depend on your hardware specs.

@ronron-gh
Owner

Thank you for your guidance. I've now set up an environment to run gemma4:e2b on ollama, so I'd like to try running the code you provided in the pull request.

@ronron-gh
Owner

I tried it, but I had to modify the const String prompt as shown below. Even though it's OpenAI compatible, shouldn't adjustments be necessary for each model?

const String json_ChatString =
- "{\"model\": \"gpt-4o\","
+ "{\"model\": \"gemma4:e2b\","
  "\"messages\": [{\"role\": \"system\", \"content\": \"\"},"     // role set by the user
                  "{\"role\": \"system\", \"content\": \"\"},"    // role for the system
                  "{\"role\": \"system\", \"content\": \"User Info: \"}],"  // summary of long-term memory
+  "\"stream\": false,"
  "\"functions\": [],"
  "\"function_call\":\"auto\""
"}";

@SimplePotat
Author

SimplePotat commented May 29, 2026

Oh good catch and silly mistake on my part! I had it in my head that there was an option to change the model from the config file because of another local branch I'd been testing with. I'll add a new config option for it when I get home.

Shall I make it apply to the standard OpenAI plumbing as an optional override too? It'd make it a lot more flexible and future proof given 4o will inevitably be sunset sooner or later.

@ronron-gh
Owner

Thank you. Indeed, it would be great if we could override the default 4o setting with an option in the configuration file.
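
The override discussed here could resolve the model name along these lines. This is a minimal sketch, assuming a single optional model string read from the config; `resolveModel` and its parameters are hypothetical, and the only value taken from the thread is the current hard-coded default of gpt-4o.

```cpp
#include <string>

// The model name currently hard-coded in the request template.
const char* const kDefaultModel = "gpt-4o";

// Picks the model name to put in the request.
//  - A non-empty configured model always wins (optional override).
//  - With the stock OpenAI flow and no override, the default is kept.
//  - With a custom endpoint and no model configured, there is no sane
//    default, so the function reports failure and leaves `out` untouched.
bool resolveModel(bool customEndpointSelected,
                  const std::string& configuredModel,
                  std::string& out) {
    if (!configuredModel.empty()) {
        out = configuredModel;
        return true;
    }
    if (customEndpointSelected) {
        return false;  // caller should print an error
    }
    out = kDefaultModel;
    return true;
}
```

Failing on an empty model for custom endpoints avoids sending "gpt-4o" to a backend that has never heard of it, which is the problem hit in the test above.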

…if filled and sets the model to be used with the custom endpoint option, otherwise ignored. Prints error if left empty when custom endpoint is selected.
@SimplePotat
Author

Sorry for the holdup, busy couple of days. I've pushed the config update and I'll start work on updating the readme and making the security certificate selection a bit more robust shortly.

@SimplePotat
Author

Alright, got provisional multi-certificate support for HTTPS endpoints in there now too. I'll update the documentation to reflect all the commits here tomorrow (may take a little bit as I want to write it out manually rather than using AI like I did for code comments).

…re, can be undone once support is implemented.