VertexAISearchSummaryTool drops SummaryWithMetadata citations/references despite summary_include_citations=True
Package versions
- langchain-google-community: 4.0.0
- google-cloud-discoveryengine: 0.20.0
- langchain-core: 1.4.4
- Python: 3.13.12
- langchain-google main SHA: 330f2df
Minimal corpus
Three direct-upload test documents under 100 KB:
Direct Discovery Engine request
- query: What are the current release codes and policy codes?
- filter: tenant: ANY("red")
- SummarySpec.include_citations: true
- SummarySpec.summary_result_count: 3
- data store type: unstructured
- serving config: Enterprise/LLM Discovery Engine app serving config
Note: I first tried a structured datastore serving config. Direct Search returned
summary_with_metadata, but skipped generation with LLM_ADDON_NOT_ENABLED, so
that was not used as the LangChain bug confirmation. The confirmed run used a
tiny direct-upload text datastore and an Enterprise Search app with
SEARCH_ADD_ON_LLM.
Direct Discovery Engine response evidence
- summary.summary_text present: True
- summary.summary_with_metadata present: True
- citation_metadata.citations present: True
- references present: True
- reference documents: ['projects/917493783080/locations/global/collections/default_collection/dataStores/summary-citation-parity-ds-r3/branches/0/documents/doc_red_v1']
- reference URIs: []
Because this was a direct raw-bytes text upload, Discovery Engine returned
references[*].document but did not populate references[*].uri.
Excerpt:
{
  "summary": {
    "summary_text": "The current red tenant release code is FALCON-17 [1]. The red policy code is R-ALLOW-101 [1].",
    "summary_with_metadata": {
      "summary": "The current red tenant release code is FALCON-17. The red policy code is R-ALLOW-101.",
      "citation_metadata": {
        "citations": [
          {"end_index": "49", "sources": [{}]},
          {"start_index": "50", "end_index": "85", "sources": [{}]}
        ]
      },
      "references": [
        {
          "document": "projects/917493783080/locations/global/collections/default_collection/dataStores/summary-citation-parity-ds-r3/branches/0/documents/doc_red_v1"
        }
      ]
    }
  }
}
Full request/response JSON is in:
/Users/gabe/Desktop/agent-search-freshness/summary_citation_parity_runs/summary_citation_parity_20260610T233854282350Z/raw_calls.jsonl
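To make concrete what is lost, here is a minimal sketch that maps each citation span in the excerpt above back to its summary text and reference document. It assumes that omitted start_index and reference_index fields mean 0 (proto3 JSON leaves out default values), that the 64-bit indices arrive as JSON strings, and that the offsets count UTF-8 bytes (for this ASCII-only summary, bytes and characters coincide). cited_segments is a hypothetical helper, not part of any library.

```python
import json

DOC = (
    "projects/917493783080/locations/global/collections/default_collection/"
    "dataStores/summary-citation-parity-ds-r3/branches/0/documents/doc_red_v1"
)

# summary_with_metadata from the excerpt above, unchanged apart from layout.
SUMMARY_WITH_METADATA = json.loads(
    """
    {
      "summary": "The current red tenant release code is FALCON-17. The red policy code is R-ALLOW-101.",
      "citation_metadata": {
        "citations": [
          {"end_index": "49", "sources": [{}]},
          {"start_index": "50", "end_index": "85", "sources": [{}]}
        ]
      },
      "references": [{"document": "%s"}]
    }
    """
    % DOC
)


def cited_segments(meta: dict) -> list[tuple[str, list[str]]]:
    """Return (summary segment, cited document names) for each citation."""
    # Assumption: offsets are UTF-8 byte offsets into the summary.
    raw = meta["summary"].encode("utf-8")
    references = meta.get("references", [])
    segments = []
    for citation in meta["citation_metadata"]["citations"]:
        # Assumption: a missing index is the proto3 default, 0.
        start = int(citation.get("start_index", 0))
        end = int(citation["end_index"])
        documents = [
            references[int(source.get("reference_index", 0))]["document"]
            for source in citation.get("sources", [])
        ]
        text = raw[start:end].decode("utf-8", errors="replace")
        segments.append((text, documents))
    return segments


for text, documents in cited_segments(SUMMARY_WITH_METADATA):
    print(repr(text), "->", documents)
```

Under those assumptions, both spans resolve to doc_red_v1, which is exactly the attribution a caller cannot reconstruct from summary_text alone.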
LangChain code
from langchain_google_community.vertex_ai_search import VertexAISearchSummaryTool

# PROJECT_ID and DATA_STORE_ID are placeholders for the test project and datastore IDs.
tool = VertexAISearchSummaryTool(
    project_id=PROJECT_ID,
    location_id="global",
    data_store_id=DATA_STORE_ID,
    engine_data_type=0,  # unstructured
    summary_result_count=3,
    summary_include_citations=True,
)
result = tool.run("What are the current release codes and policy codes?")
The harness set the tool's private _serving_config to the engine/app serving
config above so the direct API and LangChain calls exercised the same
Enterprise/LLM Search request. The metadata loss itself is independent of that
override: current main still has _run() return only
response.summary.summary_text.
Current main, libs/community/langchain_google_community/vertex_ai_search.py:
def _run(self, user_query: str) -> str:
    request = self._create_search_request(user_query)
    response = self._client.search(request)
    return response.summary.summary_text
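Until the tool exposes the metadata, one possible stopgap is to repeat the two calls that _run() makes and keep the whole summary message instead of only its text. This is a sketch, not a proposed fix: search_summary_with_metadata is a hypothetical helper, and it relies on the private _create_search_request and _client members shown above, which may change between releases.

```python
from typing import Any


def search_summary_with_metadata(tool: Any, user_query: str) -> Any:
    """Run the same search as _run() but return the full summary message.

    `tool` is expected to be a VertexAISearchSummaryTool (or anything with
    the same private members); nothing here is public API.
    """
    # Same request construction that _run() uses.
    request = tool._create_search_request(user_query)
    # Same client call that _run() uses.
    response = tool._client.search(request)
    # _run() returns response.summary.summary_text; return the parent
    # message so summary_with_metadata stays reachable.
    return response.summary
```

With the tool configured as in the reproduction, `search_summary_with_metadata(tool, query).summary_with_metadata.references` would be expected to hold the reference list that tool.run() drops.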
LangChain actual output
- installed return type: str
- installed returned value: 'The current red tenant release code is FALCON-17 [1]. The red policy code is R-ALLOW-101 [1].'
- main return type: str
- main returned value: 'The current red tenant release code is FALCON-17 [1]. The red policy code is R-ALLOW-101 [1].'
The wrapped LangChain _client.search call received a raw SearchResponse with
summary_with_metadata, citation_metadata.citations, and references, but
VertexAISearchSummaryTool._run() returned only response.summary.summary_text.
Expected behavior
The tool should expose summary_with_metadata, citation metadata, and references,
or provide a structured artifact/return path when summary_include_citations=True.
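As an illustration of what a structured return path could carry, the sketch below flattens a summary message into a plain dict. The attribute names follow the response excerpt plus the Discovery Engine message fields as I understand them (start_index, end_index, sources[*].reference_index, references[*].title/document/uri); summary_to_payload and the output keys are hypothetical, not an existing or proposed LangChain API.

```python
from typing import Any


def summary_to_payload(summary: Any) -> dict[str, Any]:
    """Flatten SearchResponse.summary into JSON-serializable data."""
    meta = summary.summary_with_metadata
    return {
        # Text with inline [n] markers, as returned today.
        "summary_text": summary.summary_text,
        # Text without markers; citation offsets refer to this string.
        "summary": meta.summary,
        "citations": [
            {
                "start_index": int(citation.start_index),
                "end_index": int(citation.end_index),
                "reference_indexes": [
                    int(source.reference_index) for source in citation.sources
                ],
            }
            for citation in meta.citation_metadata.citations
        ],
        "references": [
            {"title": ref.title, "document": ref.document, "uri": ref.uri}
            for ref in meta.references
        ],
    }
```

A dict like this could, for example, be returned as the artifact half of a content_and_artifact tool response while summary_text remains the content, so existing callers that expect a str keep working. That is only one design option.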
Actual behavior
The tool discards the metadata and returns only summary_text.
Negative controls
- Direct google-cloud-discoveryengine Search API was correct for the same request.
- No ADK, ChatVertexAI, Gemini, Vertex model endpoint, Cloud Run, GCS, BigQuery,
Document AI, website crawl, PDF import, or LangSmith telemetry was used.
- A search of existing issues for "VertexAISearchSummaryTool + citations/references"
and "SearchResponse SummaryWithMetadata" found no matching report.