ChatMistralAI incompatible token output format for LangSmith traceability

### Submission checklist

- [x] This is a bug, not a usage question.
- [x] I added a clear and descriptive title that summarizes this issue.
- [x] I used the GitHub search to find a similar question and didn't find it.
- [x] I am sure that this is a bug in LangChain rather than my code.
- [x] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
- [x] This is not related to the langchain-community package.
- [x] I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

### Package (Required)

- [ ] langchain
- [ ] langchain-openai
- [ ] langchain-anthropic
- [ ] langchain-classic
- [ ] langchain-core
- [ ] langchain-model-profiles
- [ ] langchain-tests
- [ ] langchain-text-splitters
- [ ] langchain-chroma
- [ ] langchain-deepseek
- [ ] langchain-exa
- [ ] langchain-fireworks
- [ ] langchain-groq
- [ ] langchain-huggingface
- [x] langchain-mistralai
- [ ] langchain-nomic
- [ ] langchain-ollama
- [ ] langchain-openrouter
- [ ] langchain-perplexity
- [ ] langchain-qdrant
- [ ] langchain-xai
- [ ] Other / not sure / general

### Related Issues / PRs

_No response_

### Reproduction Steps / Example Code (Python)

```python
import asyncio
import json
import os
import random
import time
from decimal import Decimal

from dotenv import load_dotenv
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_mistralai.chat_models import ChatMistralAI
from langsmith import Client, traceable

load_dotenv()

SYSTEM_PROMPT = """You are a knowledgeable travel planning assistant. Your role is to help users plan detailed itineraries for trips around the world. You understand geography, local customs, transportation options, seasonal weather patterns, and popular attractions. You can suggest accommodations ranging from budget hostels to luxury resorts, recommend local cuisine and restaurants, and help optimize travel routes for efficiency and enjoyment. You always consider the traveler's budget, interests, physical abilities, and time constraints when making recommendations. You provide practical tips about visa requirements, currency exchange, and local etiquette. You prioritize authentic local experiences over tourist traps whenever possible."""

USER_MESSAGES = [
    """I'm planning a two-week trip to Japan in April. I'll be arriving in Tokyo and want to experience both the major cities and some rural areas. My budget is moderate — I'm happy to stay in a mix of traditional ryokans and business hotels. I'm particularly interested in cherry blossom viewing, traditional temples, and local food culture. I also enjoy hiking and would love to include at least one or two nature-focused days. Can you help me outline a rough itinerary that balances urban exploration with countryside experiences?""",
    """Thanks for the suggestions. I've decided to spend the first four days in Tokyo, then head to Hakone for a night before going to Kyoto. For Tokyo, I want to cover Shibuya, Shinjuku, Asakusa, and Akihabara at minimum. I'm also curious about the Tsukiji outer market for breakfast — is it still worth visiting or has everything moved to Toyosu? I heard the tuna auction is at the new location now. Also, for transportation between cities, should I get a 14-day Japan Rail Pass or would individual shinkansen tickets be more cost-effective for my specific route? I'll be traveling with one large suitcase and a backpack.""",
    """Great advice on the rail pass — I'll go with the 14-day option. Now for the Kyoto portion, I'm thinking of spending five days there using it as a base for day trips. I definitely want to visit Fushimi Inari early in the morning to avoid crowds, and I'd like to see Arashiyama bamboo grove as well. I've also heard that Nara is an easy day trip from Kyoto with the famous deer park and Todai-ji temple. For one of the days, I'm considering a day trip to Hiroshima and Miyajima Island — is that feasible in a single day from Kyoto? And what about the food scene in Kyoto — any must-try dishes that are specific to the Kansai region?""",
    """Perfect, the Hiroshima day trip sounds doable. For the final stretch of the trip, I'm torn between spending the last three days in Osaka or splitting them between Osaka and a more off-the-beaten-path destination like Kanazawa or Takayama. I love street food so Osaka's Dotonbori appeals to me, but I also want to avoid ending the trip in just another big city. Kanazawa's Kenroku-en garden and the samurai district sound fascinating, and Takayama's old town and morning markets seem charming. What would you recommend given that I'll already have had plenty of urban time in Tokyo and Kyoto? Also, are there any festivals or special events happening in late April in any of these areas?""",
]

CACHE_KEY = f"test-mistral-cache-demo-{random.randint(0, 1000000)}"
MODEL = "mistral-small-latest"


@traceable(name="mistral-cache-test", run_type="chain")
async def run_conversation():
    model = ChatMistralAI(
        name=MODEL,
        api_key=os.getenv("MISTRAL_API_KEY"),
        model=MODEL,
        temperature=0.1,
    )

    messages = [SystemMessage(content=SYSTEM_PROMPT)]
    all_responses = []

    for i, user_msg in enumerate(USER_MESSAGES):
        messages.append(HumanMessage(content=user_msg))

        print(f"\n{'='*80}")
        print(f"MESSAGE {i+1}")
        print(f"{'='*80}")

        response = await model.ainvoke(messages, prompt_cache_key=CACHE_KEY)

        print(
            json.dumps(
                {
                    "content": response.content[:200] + "..."
                    if len(response.content) > 200
                    else response.content,
                    "usage_metadata": response.usage_metadata,
                    "response_metadata": response.response_metadata,
                    "type": response.type,
                    "id": response.id,
                },
                indent=2,
                default=str,
            )
        )

        all_responses.append(response)
        messages.append(response)

    return all_responses


PRICE_INPUT = 0.15 / 1_000_000  # $0.15 per 1M input tokens
PRICE_CACHED_INPUT = 0.015 / 1_000_000  # $0.015 per 1M cached input tokens
PRICE_OUTPUT = 0.6 / 1_000_000  # $0.60 per 1M output tokens


async def main():
    responses = await run_conversation()

    # --- Compute actual cost from Mistral response metadata ---
    actual_total_input = 0
    actual_total_cached = 0
    actual_total_output = 0

    print(f"\n\n{'='*80}")
    print("ACTUAL TOKEN USAGE (from Mistral response_metadata)")
    print(f"{'='*80}")

    for i, resp in enumerate(responses):
        token_usage = resp.response_metadata.get("token_usage", {})
        prompt_tokens = token_usage.get("prompt_tokens", 0)
        completion_tokens = token_usage.get("completion_tokens", 0)
        cached_tokens = token_usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
        non_cached_input = prompt_tokens - cached_tokens

        actual_total_input += non_cached_input
        actual_total_cached += cached_tokens
        actual_total_output += completion_tokens

        print(
            f"\n  Run {i+1}: input={prompt_tokens} (cached={cached_tokens}, non-cached={non_cached_input}), output={completion_tokens}"
        )

    actual_input_cost = actual_total_input * PRICE_INPUT
    actual_cached_cost = actual_total_cached * PRICE_CACHED_INPUT
    actual_output_cost = actual_total_output * PRICE_OUTPUT
    actual_total_cost = actual_input_cost + actual_cached_cost + actual_output_cost

    print(
        f"\n  Totals: non-cached input={actual_total_input}, cached input={actual_total_cached}, output={actual_total_output}"
    )
    print(
        f"  Cost:   input=${actual_input_cost:.6f} + cached=${actual_cached_cost:.6f} + output=${actual_output_cost:.6f} = ${actual_total_cost:.6f}"
    )

    # --- Fetch Langsmith costs ---
    print("\n\nWaiting 20s for traces to flush to Langsmith...")
    await asyncio.sleep(20)

    client = Client()
    project_name = os.getenv("LANGSMITH_PROJECT", "default")

    runs = list(
        client.list_runs(
            project_name=project_name,
            filter='eq(name, "mistral-cache-test")',
            limit=1,
        )
    )

    if not runs:
        print("ERROR: Could not find the parent trace in Langsmith")
        return

    parent_run = runs[0]
    print(f"\nTrace ID: {parent_run.id}")
    print(f"Trace URL: {parent_run.url}")

    # Use trace_id to find all LLM runs in the trace tree
    child_runs = list(
        client.list_runs(
            project_name=project_name,
            trace_id=parent_run.trace_id,
            run_type="llm",
        )
    )
    child_runs.sort(key=lambda r: r.start_time)

    ls_total_input_cost = Decimal(0)
    ls_total_output_cost = Decimal(0)
    ls_total_cost = Decimal(0)

    print(f"\n{'='*80}")
    print(f"LANGSMITH TOKEN USAGE & COSTS ({len(child_runs)} LLM runs)")
    print(f"{'='*80}")

    for i, run in enumerate(child_runs):
        input_t = run.prompt_tokens or 0
        output_t = run.completion_tokens or 0
        run_total_cost = run.total_cost or Decimal(0)
        run_input_cost = run.prompt_cost or Decimal(0)
        run_output_cost = run.completion_cost or Decimal(0)

        ls_total_cost += run_total_cost
        ls_total_input_cost += run_input_cost
        ls_total_output_cost += run_output_cost

        print(f"\n  Run {i+1}: input={input_t}, output={output_t}")
        print(f"    input_cost=${run_input_cost:.6f}, output_cost=${run_output_cost:.6f}, total_cost=${run_total_cost:.6f}")

    print(
        f"\n  Langsmith totals: input_cost=${ls_total_input_cost:.6f}, output_cost=${ls_total_output_cost:.6f}, total=${ls_total_cost:.6f}"
    )

    # --- Comparison ---
    print(f"\n\n{'='*80}")
    print("COMPARISON: Actual (with cache pricing) vs Langsmith")
    print(f"{'='*80}")
    ls_in = float(ls_total_input_cost)
    ls_out = float(ls_total_output_cost)
    ls_tot = float(ls_total_cost)

    print(f"  {'':30s} {'Actual':>12s}  {'Langsmith':>12s}  {'Difference':>12s}")
    print(f"  {'Input cost':30s} ${actual_input_cost:>11.6f}  ${ls_in:>11.6f}  ${ls_in - actual_input_cost:>+11.6f}")
    print(f"  {'Cached input cost':30s} ${actual_cached_cost:>11.6f}  {'$       N/A':>12s}  {'':>12s}")
    print(f"  {'Output cost':30s} ${actual_output_cost:>11.6f}  ${ls_out:>11.6f}  ${ls_out - actual_output_cost:>+11.6f}")
    print(f"  {'TOTAL':30s} ${actual_total_cost:>11.6f}  ${ls_tot:>11.6f}  ${ls_tot - actual_total_cost:>+11.6f}")
    if ls_tot and actual_total_cost:
        overcharge_pct = (ls_tot - actual_total_cost) / actual_total_cost * 100
        print(f"\n  Langsmith overestimates cost by {overcharge_pct:+.1f}% (treats cached tokens as full-price input)")
    elif not ls_tot:
        print("\n  NOTE: Langsmith returned $0 total cost — it may not have pricing data for this Mistral model yet.")


if __name__ == "__main__":
    asyncio.run(main())
```
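
The cost comparison in the script reduces to a small piece of arithmetic. Below is a minimal sketch of it, using the same per-token prices as the script and hypothetical token counts (the numbers are illustrative, not measured output). The helper names `cost_with_cache` and `cost_without_cache` are made up for this example.

```python
# Prices copied from the script above (USD per token).
PRICE_INPUT = 0.15 / 1_000_000
PRICE_CACHED_INPUT = 0.015 / 1_000_000
PRICE_OUTPUT = 0.6 / 1_000_000


def cost_with_cache(prompt_tokens: int, cached_tokens: int, completion_tokens: int) -> float:
    """Bill cached prompt tokens at the discounted rate, the rest at full price."""
    non_cached = prompt_tokens - cached_tokens
    return (
        non_cached * PRICE_INPUT
        + cached_tokens * PRICE_CACHED_INPUT
        + completion_tokens * PRICE_OUTPUT
    )


def cost_without_cache(prompt_tokens: int, completion_tokens: int) -> float:
    """Bill every prompt token at the full input rate (cached_tokens is ignored)."""
    return prompt_tokens * PRICE_INPUT + completion_tokens * PRICE_OUTPUT


# Hypothetical counts for one call: 2,000 prompt tokens, 1,500 of them cached.
actual = cost_with_cache(2_000, 1_500, 500)
naive = cost_without_cache(2_000, 500)
overestimate_pct = (naive - actual) / actual * 100
print(f"actual=${actual:.6f} naive=${naive:.6f} overestimate={overestimate_pct:+.1f}%")
```

The gap between the two figures grows with the share of cached prompt tokens, which is why it becomes more visible on later turns of a long conversation.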

### Error Message and Stack Trace (if applicable)

_No response_

### Description

The token usage format reported by `ChatMistralAI` appears to be outdated (I think), and LangSmith fails to pick up the `cached_tokens` field.

The Python script above makes several sequential calls, reads the cached token count for each one from `response_metadata["token_usage"]["prompt_tokens_details"]["cached_tokens"]`, and compares the cost calculated manually with the cost reported by LangSmith.

As a result, I cannot get correct pricing information in LangSmith, and the problem stems from the output format produced by `ChatMistralAI`.

Here is the model pricing configuration from LangSmith:
<img width="481" height="699" alt="Image" src="https://github.com/user-attachments/assets/f5bdf1ee-d1de-4e9e-aa08-3d5582ad733c" />
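
For reference, here is a minimal sketch of the kind of mapping that seems to be missing. It assumes the raw `token_usage` shape read by the reproduction script (`prompt_tokens`, `completion_tokens`, `prompt_tokens_details.cached_tokens`) and targets the `usage_metadata` layout from `langchain_core`, where cache hits are reported under `input_token_details["cache_read"]`. The helper `to_usage_metadata` is hypothetical and not part of `langchain-mistralai`. Whether LangSmith derives cached pricing from that field is also an assumption on my side. Plain dicts are used so the snippet runs without any dependencies.

```python
from typing import Any


def to_usage_metadata(token_usage: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: convert a raw token_usage dict into a usage_metadata-style dict."""
    prompt_tokens = token_usage.get("prompt_tokens", 0)
    completion_tokens = token_usage.get("completion_tokens", 0)
    # The details block may be missing or None when nothing was cached.
    details = token_usage.get("prompt_tokens_details") or {}
    cached_tokens = details.get("cached_tokens", 0) or 0

    usage: dict[str, Any] = {
        "input_tokens": prompt_tokens,
        "output_tokens": completion_tokens,
        "total_tokens": token_usage.get("total_tokens", prompt_tokens + completion_tokens),
    }
    if cached_tokens:
        # Cached prompt tokens are a subset of input_tokens, reported as cache reads.
        usage["input_token_details"] = {"cache_read": cached_tokens}
    return usage


# Illustrative input, not real API output.
example = {
    "prompt_tokens": 2000,
    "completion_tokens": 500,
    "total_tokens": 2500,
    "prompt_tokens_details": {"cached_tokens": 1500},
}
print(to_usage_metadata(example))
```

If `response.usage_metadata` carried the cached count in this form, the manual calculation in the script and the traced cost could be compared on the same basis.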


### System Info




System Information
------------------
> OS:  Darwin
> OS Version:  Darwin Kernel Version 25.4.0: Thu Mar 19 19:33:25 PDT 2026; root:xnu-12377.101.15~1/RELEASE_ARM64_T6041
> Python Version:  3.11.9 (v3.11.9:de54cf5be3, Apr  2 2024, 07:12:50) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information
-------------------
> langchain_core: 1.4.8
> langchain: 1.2.15
> langchain_community: 0.4.1
> langsmith: 0.7.32
> langchain_anthropic: 1.4.1
> langchain_classic: 1.0.4
> langchain_google_genai: 4.2.2
> langchain_google_vertexai: 3.2.2
> langchain_mistralai: 1.1.5
> langchain_ollama: 1.1.0
> langchain_openai: 1.1.14
> langchain_protocol: 0.0.18
> langchain_tests: 1.1.6
> langchain_text_splitters: 1.1.2
> langgraph_sdk: 0.3.13

Optional packages not installed
-------------------------------
> deepagents
> deepagents-cli

Other Dependencies
------------------
> aiohttp: 3.13.5
> anthropic: 0.96.0
> bottleneck: 1.6.0
> claude-agent-sdk: 0.1.48
> dataclasses-json: 0.6.7
> filetype: 1.2.0
> google-cloud-aiplatform: 1.148.0
> google-cloud-storage: 3.10.1
> google-cloud-vectorsearch: 0.10.0
> google-genai: 1.73.1
> httpx: 0.28.1
> httpx-sse: 0.4.3
> jsonpatch: 1.33
> langgraph: 1.1.7
> numexpr: 2.14.1
> numpy: 2.4.4
> ollama: 0.6.1
> openai: 2.32.0
> opentelemetry-api: 1.41.0
> opentelemetry-exporter-otlp-proto-http: 1.41.0
> opentelemetry-sdk: 1.41.0
> orjson: 3.11.8
> packaging: 25.0
> pyarrow: 22.0.0
> pydantic: 2.13.2
> pydantic-settings: 2.13.1
> pytest: 8.4.2
> pytest-asyncio: 1.3.0
> pytest-benchmark: 5.2.3
> pytest-codspeed: 4.4.0
> pytest-recording: 0.13.4
> pytest-socket: 0.7.0
> pyyaml: 6.0.3
> PyYAML: 6.0.3
> requests: 2.33.1
> requests-toolbelt: 1.0.0
> rich: 14.3.4
> sqlalchemy: 2.0.49
> SQLAlchemy: 2.0.49
> syrupy: 5.1.0
> tenacity: 9.1.4
> tiktoken: 0.12.0
> tokenizers: 0.23.1
> typing-extensions: 4.15.0
> uuid-utils: 0.14.1
> validators: 0.35.0
> vcrpy: 8.1.1
> websockets: 16.0
> wrapt: 1.17.3
> xxhash: 3.6.0
> zstandard: 0.23.0

