Preserve hidden state fields in chat completions by satojandro · Pull Request #679 · nearai/cloud-api

satojandro · 2026-05-26T20:27:56Z

Summary

Adds support for hidden-state passthrough in chat completions.

Adds explicit return_hidden_states and layers request fields to ChatCompletionParams
Adds flattened extra catch-all fields to ChatCompletionChunk and ChatCompletionResponse
Preserves provider-specific response fields such as hidden_states during deserialize/reserialize
Updates affected struct literals with default extra values
Adds regression coverage for unknown field round-tripping on streamed chunks and full responses

Testing

cargo fmt --check
cargo test -p inference_providers
cargo check
git diff --check

This mirrors the existing ChatDelta catch-all behavior so unknown upstream fields are not silently dropped.
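
The catch-all behavior described in the summary can be modelled with a small std-only sketch. This is not the PR's code: the `Json` enum, `Object` alias, `Response` struct, and both functions are hypothetical stand-ins for `serde_json::Value` and a flattened `extra` map, and are only meant to show why unknown keys such as `hidden_states` survive a deserialize/reserialize pass when they are kept in a catch-all map.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (hypothetical; the real crate uses serde_json).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Arr(Vec<Json>),
}

/// Stand-in for a JSON object.
pub type Object = BTreeMap<String, Json>;

/// Simplified response: one typed field plus a catch-all for everything else.
#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub id: String,
    pub extra: Object,
}

/// "Deserialize": pull out the known key, keep every unknown key in `extra`.
/// Returns None if `id` is missing or is not a string.
pub fn parse_response(mut obj: Object) -> Option<Response> {
    let id = match obj.remove("id")? {
        Json::Str(s) => s,
        _ => return None,
    };
    Some(Response { id, extra: obj })
}

/// "Reserialize": start from `extra`, then write the known field at the top level.
pub fn to_object(resp: &Response) -> Object {
    let mut obj = resp.extra.clone();
    obj.insert("id".to_string(), Json::Str(resp.id.clone()));
    obj
}
```

Passing an object that holds `id` and `hidden_states` through `parse_response` and then `to_object` returns the same object. A struct without the `extra` map would have no place to keep `hidden_states`, which is the silent drop the PR is trying to avoid.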

gemini-code-assist

Code Review

This pull request introduces return_hidden_states and layers fields to ChatCompletionParams to support requesting per-layer hidden state activations from backends like sglang. It also adds an extra map to ChatCompletionChunk and ChatCompletionResponse to preserve provider-specific fields during serialization and deserialization. The review feedback correctly identifies that the newly added fields are hardcoded to None during request conversions and service mappings, which prevents them from being populated from the incoming request's extra map. Actionable code suggestions are provided to extract these fields dynamically.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Copilot

Pull request overview

This PR extends the chat-completions normalization layer to support hidden-state passthrough end-to-end, ensuring provider-specific response fields (e.g., hidden_states) are preserved when cloud-api deserializes and reserializes both full responses and streaming chunks.

Changes:

Added return_hidden_states and layers to ChatCompletionParams to explicitly request hidden states from backends that support it.
Added flattened extra maps to ChatCompletionChunk and ChatCompletionResponse to round-trip unknown/provider-specific response fields.
Updated struct literals and added regression tests to ensure unknown fields survive deserialize/reserialize for both streaming and non-streaming responses.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Show a summary per file

| File | Description |
| --- | --- |
| crates/services/src/inference_provider_pool/mod.rs | Updates test params to include new `ChatCompletionParams` fields. |
| crates/services/src/completions/mod.rs | Extracts `return_hidden_states` / `layers` from request `extra` when building provider params. |
| crates/inference_providers/tests/integration_tests.rs | Updates integration tests to populate new request fields. |
| crates/inference_providers/src/vllm/mod.rs | Updates vLLM tests to include new request fields. |
| crates/inference_providers/src/models.rs | Adds new request fields, adds `extra` passthrough on chunk/response, and adds round-trip regression tests; fixes doctest import path. |
| crates/inference_providers/src/mock.rs | Updates mock response builders to initialize new `extra` field. |
| crates/inference_providers/src/external/openai_compatible.rs | Updates tests to include new request fields. |
| crates/inference_providers/src/external/gemini/mod.rs | Ensures synthesized responses initialize new `extra` field. |
| crates/inference_providers/src/external/anthropic/mod.rs | Ensures synthesized responses initialize new `extra` field; updates tests to include new request fields. |
| crates/inference_providers/src/chunk_builder.rs | Ensures built chunks initialize new `extra` field. |
| crates/api/src/routes/completions.rs | Ensures SSE flush chunks initialize new `extra` field. |
| crates/api/src/conversions.rs | Extracts `return_hidden_states` / `layers` from API request `extra` when converting to provider params. |

+            return_hidden_states: extra.remove("return_hidden_states").and_then(|v| v.as_bool()),
+            layers: extra.remove("layers").and_then(|v| serde_json::from_value(v).ok()),


+            return_hidden_states: extra.remove("return_hidden_states").and_then(|v| v.as_bool()),
+            layers: extra.remove("layers").and_then(|v| serde_json::from_value(v).ok()),


+            return_hidden_states: extra.remove("return_hidden_states").and_then(|v| v.as_bool()),
+            layers: extra.remove("layers").and_then(|v| serde_json::from_value(v).ok()),


Evrard-Nil · 2026-05-27T11:59:35Z

Thanks for the PR. No backend has --enable-return-hidden-states currently, so this would be a no-op. May I ask what your use case is?

satojandro · 2026-05-27T21:12:53Z

Hey mate, thanks for the prompt review and reply. I'm doing mechanistic interpretability research that requires per-layer activations from transformer models. SGLang and vLLM both support returning hidden states natively, so the main gap is the proxy layer stripping them before they reach the client(?)

I understand that this may currently be a no-op, but this was actually suggested to me by Illia, so I hope we can find a way to enable it. Happy to have a chat on Telegram or a call @Evrard-Nil

Preserve hidden state fields in chat completions

46b97fd

gemini-code-assist Bot reviewed May 26, 2026

Comment thread crates/api/src/conversions.rs Outdated

Comment thread crates/services/src/completions/mod.rs Outdated

Comment thread crates/services/src/completions/mod.rs Outdated

satojandro marked this pull request as draft May 26, 2026 20:43

satojandro and others added 3 commits May 26, 2026 17:47

Update crates/api/src/conversions.rs

16ac7d0

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update crates/services/src/completions/mod.rs

295a662

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Update crates/services/src/completions/mod.rs

dba90c5

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

satojandro marked this pull request as ready for review May 26, 2026 20:48

Evrard-Nil requested a review from Copilot May 27, 2026 09:06

Copilot started reviewing on behalf of Evrard-Nil May 27, 2026 09:06

Copilot AI reviewed May 27, 2026

satojandro wants to merge 4 commits into nearai:main from satojandro:main

3 participants
