Releases: deths74r/llaml

v0.1.5 — Gemini multi-tool-call streaming fix

16 Apr 05:22

Bug fix: Gemini tool calls lost their arguments when multiple were emitted in one response

Found while dogfooding LMI (deths74r/lmi) on 2026-04-16. A gemini-pro-latest session made 15 tool calls in one response; all surfaced with empty `input = {}`, and every tool rejected its call.

Root cause

1. Streaming decoder hardcoded `index = 0`. `gemini_codec.ml:260-266` built `tool_call_deltas` with every delta sharing index 0. Consumers that key an accumulator by `td.index` (e.g. LMI's `llaml_provider.ml`) had N calls collapse into one slot:

  • id/name overwritten by last writer
  • All deltas appended to the SAME args buffer → `{"path":"a"}{"path":"b"}{"path":"c"}`
  • JSON parse failed → fallback `` `Assoc [] `` → empty input

2. Opaque `id` field ignored. The code assumed "Gemini does not assign unique ids" and hardcoded `id = name`. That assumption no longer holds: Gemini 2.5+ and 3.x-preview responses include per-call `"id": "xvg3g7sk"` fields.
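
The collapse described in item 1 can be reproduced with a small index-keyed accumulator. This is a minimal sketch using only the standard library. The record types, field names, and the `accumulate` and `delta` helpers are hypothetical stand-ins, not llaml's or LMI's actual definitions.

```ocaml
(* Hypothetical delta shape; not llaml's actual type. *)
type tool_call_delta = {
  index : int;
  id : string option;
  name : string option;
  arguments : string;
}

(* One accumulator slot per tool call, as an index-keyed consumer might keep. *)
type slot = {
  mutable slot_id : string;
  mutable slot_name : string;
  args : Buffer.t;
}

(* Fold deltas into slots keyed by [index]; returns slots sorted by index. *)
let accumulate (deltas : tool_call_delta list) : (int * slot) list =
  let tbl : (int, slot) Hashtbl.t = Hashtbl.create 8 in
  List.iter
    (fun d ->
      let s =
        match Hashtbl.find_opt tbl d.index with
        | Some s -> s
        | None ->
          let s = { slot_id = ""; slot_name = ""; args = Buffer.create 64 } in
          Hashtbl.add tbl d.index s;
          s
      in
      (* Later deltas overwrite id/name and append to the same args buffer. *)
      Option.iter (fun id -> s.slot_id <- id) d.id;
      Option.iter (fun n -> s.slot_name <- n) d.name;
      Buffer.add_string s.args d.arguments)
    deltas;
  Hashtbl.fold (fun i s acc -> (i, s) :: acc) tbl []
  |> List.sort (fun (a, _) (b, _) -> compare a b)

(* Hypothetical helper: a complete call carried in a single delta. *)
let delta index name arguments =
  { index; id = Some name; name = Some name; arguments }

(* Pre-fix behaviour: both calls share index 0 and land in one slot. *)
let collapsed =
  accumulate
    [ delta 0 "read_file" {|{"path":"a"}|};
      delta 0 "list_dir" {|{"path":"b"}|} ]

(* Post-fix behaviour: distinct indices keep the calls apart. *)
let separate =
  accumulate
    [ delta 0 "read_file" {|{"path":"a"}|};
      delta 1 "list_dir" {|{"path":"b"}|} ]
```

Here `collapsed` holds a single slot named `list_dir` whose buffer reads `{"path":"a"}{"path":"b"}`, which is not valid JSON, while `separate` holds two slots with one well-formed object each.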

Fix

  • `List.mapi` over `Tool_use` items so each gets a unique ascending index.
  • Read the `id` field when present, fall back to name for older models.
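
The two changes above might look roughly like the following. This is a sketch under assumptions, not the code in `gemini_codec.ml`: the `tool_use` and `tool_call_delta` records and the `to_deltas` function are hypothetical names chosen for illustration.

```ocaml
(* Hypothetical decoded tool-use item; [call_id] is None for older models. *)
type tool_use = {
  call_id : string option;
  name : string;
  args_json : string;
}

(* Hypothetical streaming delta emitted to consumers. *)
type tool_call_delta = {
  index : int;
  id : string;
  fn_name : string;
  arguments : string;
}

(* List.mapi gives each item its position as a unique ascending index;
   the id falls back to the function name when the response carries none. *)
let to_deltas (items : tool_use list) : tool_call_delta list =
  List.mapi
    (fun index (t : tool_use) ->
      { index;
        id = Option.value t.call_id ~default:t.name;
        fn_name = t.name;
        arguments = t.args_json })
    items

let deltas =
  to_deltas
    [ { call_id = Some "xvg3g7sk"; name = "read_file"; args_json = {|{"path":"a"}|} };
      { call_id = None; name = "list_dir"; args_json = {|{"path":"b"}|} } ]
```

In this example the first delta gets index 0 and id `xvg3g7sk`, and the second gets index 1 and falls back to `list_dir` as its id.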

Tests

44 → 47 (3 new regression tests).

Impact

This broke every multi-tool-call turn under Gemini streaming, so any llaml consumer (LMI and other downstream projects) was affected. No config change is needed to pick up the fix; just bump the pin.

llaml v0.1.0

12 Apr 15:43

llaml v0.1.0 — Canonical-type LLM client for OCaml

First tagged release. One set of canonical types across 13 LLM providers.

Features

  • 4 native wire protocols: OpenAI, Anthropic, Gemini, AWS Bedrock
  • 9 OpenAI-compatible profiles: Together, Fireworks, Groq, DeepSeek, xAI, Cerebras, Ollama, OpenRouter, Mistral
  • Streaming: SSE + AWS binary event-stream as canonical chunk deltas
  • Tool calling: unified across all providers
  • Reasoning effort: `reasoning` field maps to Gemini `thinkingConfig`, Anthropic extended thinking, OpenAI `reasoning_effort`
  • Prompt caching: Anthropic `cache_control` on messages
  • Request knobs: `top_k`, `seed`, `response_format` (`Fmt_text` / `Fmt_json_object` / `Fmt_json_schema`), `safety_settings`
  • Router: model groups with `Simple_shuffle` / `Lowest_latency` / `Lowest_tpm_rpm` strategies, cooldowns, fallback chains
  • AWS SigV4: full request signing for Bedrock (no SDK dependency)
  • Plug-and-play: `Llaml_eio.make` hides the functor, `Llaml_eio.auto ~model` resolves provider + auth from a model name
  • Dep-light core: yojson + uri + digestif + base64 + str + unix. Eio/cohttp/TLS confined to the `llaml_eio` sub-library.
  • 44 unit tests, builds clean on OCaml 5.2

Quick start

Eio_main.run @@ fun env ->
Eio.Switch.run @@ fun sw ->
match Llaml_eio.auto ~env ~sw ~model:"claude-sonnet-4-5" () with
| Error msg -> prerr_endline msg
| Ok client ->
  match client.complete (Llaml.Types.request ~model:"claude-sonnet-4-5" ~messages:[...] ()) with
  | Ok resp -> (* ... *)
  | Error e -> Format.eprintf "%a@." Llaml.Types.pp_error e

Ships as the provider layer for LMI v0.1.0.