LLM findings
This is a list of things we have discovered about specific models and setups during this project. It is likely that our use of Langchain, and our system prompt, are factors in some of these discoveries.
- One ongoing challenge is finding models that both reliably call the correct tools and return links in the correct format.
- o4-mini was our default choice for a long time. Following the switch to LiteLLM it became significantly slower, to the point that it was causing problems for users. This seems to have been a gradual decline after the LiteLLM switch rather than an immediate one.
- We looked at switching from Langchain to the Vercel AI SDK. This would have simplified the streaming code, but we found it didn't work when calling MCP tools with the OpenAI reasoning models. It may be worth revisiting now that we're looking at models from other vendors.
- Claude Sonnet 3.7 is our current default, as it provides good accuracy combined with speed. Claude Sonnet 4.5, however, does not call tools.
- We had Claude 3 Haiku as a 'fast mode' option for a while. It worked well with tool calls but didn't provide links.
- Gemini Flash 2.5 also seems to be a good performer.
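One way to catch the link-formatting failures noted above is to sanity-check model responses before showing them to users. The sketch below is a hypothetical helper, not code from this project: the function name and regex are illustrative, and assume links are expected in Markdown `[text](url)` form.

```typescript
// Matches a well-formed Markdown link with an http(s) URL,
// e.g. "[the docs](https://example.com)". Illustrative only;
// a real check might also validate the URL against known sources.
const MARKDOWN_LINK = /\[[^\]]+\]\(https?:\/\/[^\s)]+\)/;

// Returns true if the response contains at least one well-formed
// Markdown link -- the failure mode seen with Claude 3 Haiku was
// responses with no links at all, or links in the wrong format.
function hasWellFormedLink(response: string): boolean {
  return MARKDOWN_LINK.test(response);
}

console.log(hasWellFormedLink("See [the docs](https://example.com) for details.")); // true
console.log(hasWellFormedLink("See https://example.com for details."));            // false (bare URL)
```

A check like this could feed an evaluation harness when comparing candidate default models, rather than relying on spot-checking responses by hand.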