[bot] Add instrumentation support for kalosm local/cloud AI toolkit #55

@braintrust-bot

Summary

The Braintrust Rust SDK has no instrumentation support for kalosm, a Rust-native AI toolkit backed by HuggingFace. kalosm provides high-level ChatModel, TextCompletionModel, and Embedder execution traits for both local (Llama, Bert) and cloud-compatible (OpenAI-compatible, Anthropic-compatible) backends. Because kalosm is Rust-specific, no other Braintrust SDK instruments it, but Braintrust does document HuggingFace as a supported TypeScript integration.

What is missing

kalosm::language defines clear execution-oriented traits that could be instrumented to automatically capture spans:

  • ChatModel / ChatModelExt: chat() async method for multi-turn conversation, with streaming via CreateChatSession
  • TextCompletionModel / TextCompletionModelExt: stream_text() for streaming text generation
  • StructuredChatModel / StructuredTextCompletionModel: structured/typed output generation
  • Embedder / EmbedderExt: embed() for converting text to embedding vectors

A wrapper or span helper around these traits would enable automatic tracing of kalosm-based Rust applications, capturing inputs, outputs, latency, time-to-first-token, and (where available) token counts for any model backend kalosm supports.
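
To make the capture requirements concrete, here is a minimal sketch of the kind of pass-through wrapper described above. It is not kalosm or Braintrust SDK code: `GenerationMetrics` and `Instrumented` are hypothetical names, and a plain std `Iterator<Item = String>` stands in for kalosm's async token stream so the example stays dependency-free. It only illustrates recording output, chunk count, time-to-first-token, and total latency while forwarding tokens unchanged.

```rust
use std::time::{Duration, Instant};

/// Hypothetical record of what a span might capture for one generation call.
#[derive(Debug, Default)]
pub struct GenerationMetrics {
    pub output: String,
    pub chunks: usize,
    pub time_to_first_token: Option<Duration>,
    pub total: Option<Duration>,
}

/// Hypothetical wrapper: forwards tokens unchanged and records metrics on the way.
/// A std iterator is used here as a synchronous stand-in for an async stream.
pub struct Instrumented<I> {
    inner: I,
    started: Instant,
    metrics: GenerationMetrics,
}

impl<I: Iterator<Item = String>> Instrumented<I> {
    pub fn new(inner: I) -> Self {
        Self {
            inner,
            started: Instant::now(),
            metrics: GenerationMetrics::default(),
        }
    }

    /// Consumes the wrapper and returns what was recorded.
    /// If the stream was dropped early, total latency is measured up to this call.
    pub fn finish(mut self) -> GenerationMetrics {
        if self.metrics.total.is_none() {
            self.metrics.total = Some(self.started.elapsed());
        }
        self.metrics
    }
}

impl<I: Iterator<Item = String>> Iterator for Instrumented<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        match self.inner.next() {
            Some(token) => {
                // The first token that arrives fixes time-to-first-token.
                if self.metrics.time_to_first_token.is_none() {
                    self.metrics.time_to_first_token = Some(self.started.elapsed());
                }
                self.metrics.chunks += 1;
                self.metrics.output.push_str(&token);
                Some(token)
            }
            None => {
                // End of stream: record total latency once.
                if self.metrics.total.is_none() {
                    self.metrics.total = Some(self.started.elapsed());
                }
                None
            }
        }
    }
}
```

A caller would wrap the token source with `Instrumented::new(...)`, consume it as usual, then call `finish()` and log the resulting metrics to a span. A real integration would presumably implement the same logic over the stream type kalosm actually returns, and token counts would only be filled in where the backend reports them.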

The SDK already has building blocks that could support this:

  • wrap_stream_with_span() in src/stream.rs for streaming instrumentation
  • SpanBuilder / SpanHandle in src/span.rs for manual span creation

But there is no integration that connects these to kalosm's execution traits.

Library significance

  • 2.2k GitHub stars in the floneum monorepo (44 open issues, active development)
  • v0.4.0 stable release
  • Supports local model backends: Llama (text generation), Bert (embeddings), Whisper (audio), Stable Diffusion (image)
  • Supports cloud-compatible endpoints: OpenAI-compatible and Anthropic-compatible APIs via the same interface
  • Includes RAG workflows via vector stores (SurrealDB integration), document chunking, and semantic search
  • Official docs: https://docs.rs/kalosm/latest/kalosm/

Braintrust docs status

not_found — kalosm is not listed in Braintrust's SDK integrations, AI providers page, or trace LLM calls documentation. Braintrust does list Hugging Face as a supported TypeScript integration (https://www.braintrust.dev/docs/instrument/trace-llm-calls), and kalosm is the primary HuggingFace-backed local inference framework for Rust.

Upstream sources

Local files inspected

  • src/extractors.rs: only extract_openai_usage() and extract_anthropic_usage() exist; no kalosm-related code
  • src/stream.rs: wrap_stream_with_span() is generic over Stream<Item = Result<Value, E>>; kalosm streams yield tokens directly rather than JSON chunks, requiring a different adapter
  • src/lib.rs: public API exports; no kalosm references
  • Cargo.toml: no kalosm dependency
  • Full codebase grep for kalosm, ChatModel, TextCompletionModel, Embedder: zero results
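
The adapter gap noted for src/stream.rs could look roughly like the sketch below. It rests on several assumptions: `tokens_to_chunks` and `escape_json` are hypothetical helpers, the `{"delta": ...}` chunk shape is invented for illustration (the shape wrap_stream_with_span() expects was not verified), a std iterator stands in for the async stream, and a JSON-encoded `String` stands in for the `Value` type so the example needs no external crates.

```rust
use std::convert::Infallible;

/// Escapes a token so it can be embedded in a JSON string literal.
fn escape_json(token: &str) -> String {
    let mut out = String::with_capacity(token.len());
    for c in token.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical adapter: turns plain tokens into `Result` items that each hold
/// one JSON chunk as text. The "delta" key is an assumed, illustrative shape.
pub fn tokens_to_chunks<I>(tokens: I) -> impl Iterator<Item = Result<String, Infallible>>
where
    I: Iterator<Item = String>,
{
    tokens.map(|t| Ok(format!("{{\"delta\":\"{}\"}}", escape_json(&t))))
}
```

In an actual integration the item type would presumably be the SDK's `Value` rather than a string, and the error type would be whatever the kalosm backend can return. The point of the sketch is only that a thin mapping layer is needed between token streams and chunk-oriented span helpers.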
