Commit 5a3b932

buger and claude authored
feat: structured delegate response, LLM-based semantic dedup, shared provider utility (#541)
* feat: structured delegate response with searches tracking and relevance filtering

- Search delegate now returns structured JSON with confidence, groups, and a new `searches` field listing all queries made (with path and outcome). This lets the parent AI see what was attempted and retry different terms.
- Delegate prompt emphasizes relevance filtering: only include files verified by extract, not keyword matches. Fewer verified files > many unverified matches.
- Code-searcher subagent iteration limit handling: when hitting maxIterations, the last-iteration prompt tells the subagent to output structured JSON with partial results (not a text error). Post-loop fallback also produces structured JSON with search history for code-searcher agents.
- Separate searchDelegateSchema (query + path only) for delegate mode, preventing the parent from passing keyword-specific params like exact/language.
- Delegate-level dedup prevents re-spawning expensive subagents for the same normalized concept.
- System prompts updated: search delegate returns file locations (not extracted code blocks), parent must use extract() to read code.
- Delegate span enrichment: logs truncated response on success spans.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: restructure search delegate prompt with XML tags

Reorganize the delegate subagent prompt into clear XML sections: <role>, <task>, <tools>, <search-engine-behavior>, <strategy>, <relevance-filtering>, <stop-conditions>, <on-iteration-limit>, <output-format>, and <output-guidelines> with <field> elements. Makes the prompt easier to parse for both humans and LLMs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: LLM-based semantic dedup, shared provider utility, delegate improvements

- Add LLM-based semantic dedup for delegate queries (checkDelegateDedup) using generateText to detect redundant searches across all paths
- Create shared utils/provider.js with createProviderInstance(), resolveApiKey(), createLanguageModel(), and DEFAULT_MODELS
- Refactor FallbackManager and ProbeAgent to use shared provider utility, eliminating duplicated SDK imports and provider creation logic
- Add 'reason' field to structured delegate response schema
- Remove redundant <output-format> and <output-guidelines> from delegate prompt, replace with concise <output-rules>
- Add OTEL tracing for dedup decisions (search.delegate.dedup spans)
- Fix dedup model creation: fall back to FORCE_PROVIDER env var and DEFAULT_MODELS when options.provider/model unavailable at init time
- Return NOT FOUND verdict when delegate reports low confidence with empty groups instead of falling back to raw search
- Add parent prompt guidance to prevent re-searching same concepts
- Add 'reason' to iteration-limit fallback response
- Add tests for filler prefix stripping in normalizeQueryConcept
- Add tests for reason field in structured delegate response

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
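
The structured response described in the commit message can be illustrated with a small sketch. The top-level field names (`confidence`, `reason`, `groups`, `searches`) and the per-search keys (`query`, `path`, `had_results`) come from this commit; the `files` key inside a group, the example paths, and the `interpretDelegateResponse` helper are assumptions made for illustration and are not taken from the repository.

```javascript
// Hypothetical example payload. Only the top-level field names and the
// per-search keys are taken from the commit; `files` is an assumed key.
const exampleResponse = {
  confidence: 'high',
  reason: 'Found the rate limiter and where it is wired in',
  groups: [
    { reason: 'Core implementation', files: ['src/limits/rateLimiter.js'] }
  ],
  searches: [
    { query: 'rate limiting', path: '.', had_results: true }
  ]
};

// Hypothetical helper: turns the raw delegate output into a verdict.
// Mirrors the rule from the commit message: low confidence plus empty
// groups means NOT FOUND rather than a fallback to raw search.
function interpretDelegateResponse(raw) {
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { verdict: 'INVALID', files: [], searches: [] };
  }
  if (parsed === null || typeof parsed !== 'object') {
    return { verdict: 'INVALID', files: [], searches: [] };
  }
  const groups = Array.isArray(parsed.groups) ? parsed.groups : [];
  const searches = Array.isArray(parsed.searches) ? parsed.searches : [];
  if (parsed.confidence === 'low' && groups.length === 0) {
    return { verdict: 'NOT FOUND', files: [], searches };
  }
  // Flatten file locations so the caller can pass them to extract().
  const files = groups.flatMap((g) => (Array.isArray(g.files) ? g.files : []));
  return { verdict: 'FOUND', files, searches };
}
```

Under these assumptions, a parent would call `interpretDelegateResponse(response)` and then run extract() on `files`, or inspect `searches` to pick different terms when the verdict is `NOT FOUND`.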
1 parent d294001 commit 5a3b932

9 files changed

Lines changed: 908 additions & 407 deletions

npm/src/agent/FallbackManager.js

Lines changed: 3 additions & 57 deletions
@@ -8,10 +8,7 @@
  * - Custom fallback chains with full configuration
  */

-import { createAnthropic } from '@ai-sdk/anthropic';
-import { createOpenAI } from '@ai-sdk/openai';
-import { createGoogleGenerativeAI } from '@ai-sdk/google';
-import { createAmazonBedrock } from '@ai-sdk/amazon-bedrock';
+import { createProviderInstance, DEFAULT_MODELS as SHARED_DEFAULT_MODELS } from '../utils/provider.js';

 /**
  * Fallback strategies
@@ -40,12 +37,7 @@ export const FALLBACK_STRATEGIES = {
 /**
  * Default model mappings for each provider
  */
-const DEFAULT_MODELS = {
-  anthropic: 'claude-sonnet-4-6',
-  openai: 'gpt-5.2',
-  google: 'gemini-2.5-flash',
-  bedrock: 'anthropic.claude-sonnet-4-6'
-};
+const DEFAULT_MODELS = SHARED_DEFAULT_MODELS;

 /**
  * FallbackManager class for handling provider and model fallback
@@ -138,53 +130,7 @@ export class FallbackManager {
    */
   _createProviderInstance(config) {
     try {
-      switch (config.provider) {
-        case 'anthropic':
-          return createAnthropic({
-            apiKey: config.apiKey,
-            ...(config.baseURL && { baseURL: config.baseURL })
-          });
-
-        case 'openai':
-          return createOpenAI({
-            compatibility: 'strict',
-            apiKey: config.apiKey,
-            ...(config.baseURL && { baseURL: config.baseURL })
-          });
-
-        case 'google':
-          return createGoogleGenerativeAI({
-            apiKey: config.apiKey,
-            ...(config.baseURL && { baseURL: config.baseURL })
-          });
-
-        case 'bedrock': {
-          const bedrockConfig = {};
-
-          if (config.apiKey) {
-            bedrockConfig.apiKey = config.apiKey;
-          } else if (config.accessKeyId && config.secretAccessKey) {
-            bedrockConfig.accessKeyId = config.accessKeyId;
-            bedrockConfig.secretAccessKey = config.secretAccessKey;
-            if (config.sessionToken) {
-              bedrockConfig.sessionToken = config.sessionToken;
-            }
-          }
-
-          if (config.region) {
-            bedrockConfig.region = config.region;
-          }
-
-          if (config.baseURL) {
-            bedrockConfig.baseURL = config.baseURL;
-          }
-
-          return createAmazonBedrock(bedrockConfig);
-        }
-
-        default:
-          throw new Error(`FallbackManager: Unknown provider "${config.provider}"`);
-      }
+      return createProviderInstance(config);
     } catch (error) {
       // Re-throw with more context
       const providerName = this._getProviderDisplayName(config);
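
The body of the new `utils/provider.js` is not shown in this part of the diff, so the following is only a sketch of what a shared factory could look like if it kept the logic of the removed switch. The names `buildBedrockConfig` and `createProviderInstanceSketch`, and the `factories` parameter (an injection point standing in for the `@ai-sdk/*` creator functions), are hypothetical.

```javascript
// Sketch only: mirrors the removed 'bedrock' branch. An API key wins over
// AWS credentials, and optional values are copied only when defined.
function buildBedrockConfig(config) {
  const bedrockConfig = {};
  if (config.apiKey) {
    bedrockConfig.apiKey = config.apiKey;
  } else if (config.accessKeyId && config.secretAccessKey) {
    bedrockConfig.accessKeyId = config.accessKeyId;
    bedrockConfig.secretAccessKey = config.secretAccessKey;
    if (config.sessionToken) {
      bedrockConfig.sessionToken = config.sessionToken;
    }
  }
  if (config.region) {
    bedrockConfig.region = config.region;
  }
  if (config.baseURL) {
    bedrockConfig.baseURL = config.baseURL;
  }
  return bedrockConfig;
}

// Hypothetical dispatcher. `factories` maps a provider name to a creator
// function; in real code these would be the SDK creators that the removed
// imports provided.
function createProviderInstanceSketch(config, factories) {
  if (!Object.prototype.hasOwnProperty.call(factories, config.provider)) {
    throw new Error(`Unknown provider "${config.provider}"`);
  }
  const factory = factories[config.provider];
  if (config.provider === 'bedrock') {
    return factory(buildBedrockConfig(config));
  }
  const options = {
    apiKey: config.apiKey,
    ...(config.baseURL && { baseURL: config.baseURL })
  };
  if (config.provider === 'openai') {
    options.compatibility = 'strict'; // kept from the removed OpenAI branch
  }
  return factory(options);
}
```

Injecting the creators keeps the sketch runnable without the SDK packages installed; whether the shared utility is structured this way is not confirmed by the diff shown here.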

npm/src/agent/ProbeAgent.js

Lines changed: 48 additions & 62 deletions
@@ -27,10 +27,7 @@ export const ENGINE_ACTIVITY_TIMEOUT_MIN = 5000;
  */
 export const ENGINE_ACTIVITY_TIMEOUT_MAX = 600000;

-import { createAnthropic } from '@ai-sdk/anthropic';
-import { createOpenAI } from '@ai-sdk/openai';
-import { createGoogleGenerativeAI } from '@ai-sdk/google';
-import { createAmazonBedrock } from '@ai-sdk/amazon-bedrock';
+import { createProviderInstance, DEFAULT_MODELS } from '../utils/provider.js';
 import { streamText, generateText, tool, stepCountIs, jsonSchema, Output } from 'ai';
 import { randomUUID } from 'crypto';
 import { EventEmitter } from 'events';
@@ -1673,13 +1670,10 @@ export class ProbeAgent {
    * Initialize Anthropic model
    */
   initializeAnthropicModel(apiKey, apiUrl, modelName) {
-    this.provider = createAnthropic({
-      apiKey: apiKey,
-      ...(apiUrl && { baseURL: apiUrl }),
-    });
-    this.model = modelName || 'claude-sonnet-4-6';
+    this.provider = createProviderInstance({ provider: 'anthropic', apiKey, ...(apiUrl && { baseURL: apiUrl }) });
+    this.model = modelName || DEFAULT_MODELS.anthropic;
     this.apiType = 'anthropic';
-
+
     if (this.debug) {
       console.log(`Using Anthropic API with model: ${this.model}${apiUrl ? ` (URL: ${apiUrl})` : ''}`);
     }
@@ -1689,14 +1683,10 @@ export class ProbeAgent {
    * Initialize OpenAI model
    */
   initializeOpenAIModel(apiKey, apiUrl, modelName) {
-    this.provider = createOpenAI({
-      compatibility: 'strict',
-      apiKey: apiKey,
-      ...(apiUrl && { baseURL: apiUrl }),
-    });
-    this.model = modelName || 'gpt-5.2';
+    this.provider = createProviderInstance({ provider: 'openai', apiKey, ...(apiUrl && { baseURL: apiUrl }) });
+    this.model = modelName || DEFAULT_MODELS.openai;
     this.apiType = 'openai';
-
+
     if (this.debug) {
       console.log(`Using OpenAI API with model: ${this.model}${apiUrl ? ` (URL: ${apiUrl})` : ''}`);
     }
@@ -1706,10 +1696,7 @@ export class ProbeAgent {
    * Initialize Google model
    */
   initializeGoogleModel(apiKey, apiUrl, modelName) {
-    this.provider = createGoogleGenerativeAI({
-      apiKey: apiKey,
-      ...(apiUrl && { baseURL: apiUrl }),
-    });
+    this.provider = createProviderInstance({ provider: 'google', apiKey, ...(apiUrl && { baseURL: apiUrl }) });
     this.model = modelName || 'gemini-2.5-pro';
     this.apiType = 'google';

@@ -2245,32 +2232,10 @@ export class ProbeAgent {
    * Initialize AWS Bedrock model
    */
   initializeBedrockModel(accessKeyId, secretAccessKey, region, sessionToken, apiKey, baseURL, modelName) {
-    // Build configuration object, only including defined values
-    const config = {};
-
-    // Authentication - prefer API key if provided, otherwise use AWS credentials
-    if (apiKey) {
-      config.apiKey = apiKey;
-    } else if (accessKeyId && secretAccessKey) {
-      config.accessKeyId = accessKeyId;
-      config.secretAccessKey = secretAccessKey;
-      if (sessionToken) {
-        config.sessionToken = sessionToken;
-      }
-    }
-
-    // Region is required for AWS credentials but optional for API key
-    if (region) {
-      config.region = region;
-    }
-
-    // Optional base URL
-    if (baseURL) {
-      config.baseURL = baseURL;
-    }
-
-    this.provider = createAmazonBedrock(config);
-    this.model = modelName || 'anthropic.claude-sonnet-4-6';
+    this.provider = createProviderInstance({
+      provider: 'bedrock', apiKey, accessKeyId, secretAccessKey, sessionToken, region, baseURL
+    });
+    this.model = modelName || DEFAULT_MODELS.bedrock;
     this.apiType = 'bedrock';

     if (this.debug) {
@@ -3012,7 +2977,7 @@ export class ProbeAgent {

       // Add high-level instructions about when to use tools
       const searchToolDesc1 = this.searchDelegate
-        ? '- search: Ask natural language questions to find code (e.g., "How does authentication work?"). A subagent handles keyword searches and returns extracted code blocks. Do NOT formulate keyword queries — just ask questions.'
+        ? '- search: Ask natural language questions to find code locations (e.g., "How does authentication work?"). Returns structured JSON with file locations grouped by relevance. Use extract() on the returned files to read the actual code. Do NOT formulate keyword queries — just ask questions.'
         : '- search: Find code patterns using keyword queries with Elasticsearch syntax. Handles stemming and case variations automatically — do NOT try manual keyword variations.';
       systemPrompt += `You have access to powerful code search and analysis tools through MCP:
 ${searchToolDesc1}
@@ -3025,10 +2990,10 @@ ${searchToolDesc1}
       }

       const searchGuidance1 = this.searchDelegate
-        ? '1. Start with search — ask a question about what you want to understand. It returns extracted code blocks directly.'
+        ? '1. Start with search — ask a question about what you want to understand. It returns file locations grouped by relevance (JSON with confidence and groups).'
         : '1. Start with search to find relevant code patterns. One search per concept is usually enough — probe handles stemming and case variations.';
       const extractGuidance1 = this.searchDelegate
-        ? '2. Use extract only if you need more context or a full file'
+        ? '2. Use extract on the file locations returned by search to read the actual code. Each group has a "reason" explaining why those files matter.'
         : '2. Use extract to get detailed context when needed';

       systemPrompt += `\n
@@ -3078,7 +3043,7 @@ ${extractGuidance1}

       // Add high-level instructions about when to use tools
       const searchToolDesc2 = this.searchDelegate
-        ? '- search: Ask natural language questions to find code (e.g., "How does authentication work?"). A subagent handles keyword searches and returns extracted code blocks. Do NOT formulate keyword queries — just ask questions.'
+        ? '- search: Ask natural language questions to find code locations (e.g., "How does authentication work?"). Returns structured JSON with file locations grouped by relevance. Use extract() on the returned files to read the actual code. Do NOT formulate keyword queries — just ask questions.'
         : '- search: Find code patterns using keyword queries with Elasticsearch syntax. Handles stemming and case variations automatically — do NOT try manual keyword variations.';
       systemPrompt += `You have access to powerful code search and analysis tools through MCP:
 ${searchToolDesc2}
@@ -3091,10 +3056,10 @@ ${searchToolDesc2}
       }

       const searchGuidance2 = this.searchDelegate
-        ? '1. Start with search — ask a question about what you want to understand. It returns extracted code blocks directly.'
+        ? '1. Start with search — ask a question about what you want to understand. It returns file locations grouped by relevance (JSON with confidence and groups).'
         : '1. Start with search to find relevant code patterns. One search per concept is usually enough — probe handles stemming and case variations.';
       const extractGuidance2 = this.searchDelegate
-        ? '2. Use extract only if you need more context or a full file'
+        ? '2. Use extract on the file locations returned by search to read the actual code. Each group has a "reason" explaining why those files matter.'
         : '2. Use extract to get detailed context when needed';

       systemPrompt += `\n
@@ -3160,10 +3125,10 @@ ${extractGuidance2}
 Follow these instructions carefully:
 1. Analyze the user's request.
 2. Use the available tools step-by-step to fulfill the request.
-3. You MUST use the search tool before answering ANY code-related question. NEVER answer from memory or general knowledge — your answers must be grounded in actual code found via search/extract.${this.searchDelegate ? ' Ask natural language questions — the search subagent handles keyword formulation and returns extracted code blocks. Use extract only to expand context or read full files.' : ' Search handles stemming and case variations automatically — do NOT try keyword variations manually. Read full files only if really necessary.'}
+3. You MUST use the search tool before answering ANY code-related question. NEVER answer from memory or general knowledge — your answers must be grounded in actual code found via search/extract.${this.searchDelegate ? ' Ask natural language questions — the search subagent handles keyword formulation and returns file locations grouped by relevance. Then use extract() on those locations to read the actual code.' : ' Search handles stemming and case variations automatically — do NOT try keyword variations manually. Read full files only if really necessary.'}
 4. Ensure to get really deep and understand the full picture before answering. Follow call chains — if function A calls B, search for B too. Look for related subsystems (e.g., if asked about rate limiting, also check for quota, throttling, smoothing).
 5. Once the task is fully completed, provide your final answer directly as text. Always cite specific files and line numbers as evidence. Do NOT output planning or thinking text — go straight to the answer.
-6. ${this.searchDelegate ? 'Ask clear, specific questions when searching. Each search should target a distinct concept or question.' : 'Prefer concise and focused search queries. Use specific keywords and phrases to narrow down results.'}
+6. ${this.searchDelegate ? 'Ask clear, specific questions when searching. Each search should target a distinct concept or question. NEVER re-search the same concept with different phrasing — if you already searched for "wrapToolWithEmitter", do NOT search again for "definition of wrapToolWithEmitter" or "how wrapToolWithEmitter works". Use extract() on the files already found instead. Limit yourself to one search per distinct concept. When formulating queries, describe WHAT you are looking for, not WHERE — the search agent will search the full codebase. Do NOT include file names or class names in the query unless that IS the concept (e.g., say "search dedup logic" not "search dedup ProbeAgent").' : 'Prefer concise and focused search queries. Use specific keywords and phrases to narrow down results.'}
 7. NEVER use bash for code exploration (no grep, cat, find, head, tail, awk, sed) — always use search and extract tools instead. Bash is only for system operations like building, running tests, or git commands.${this.allowEdit ? `
 7. When modifying files, choose the appropriate tool:
 - Use 'edit' for all code modifications:
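
Rule 6 in the prompt above asks the parent not to re-search a concept under different phrasing, and the commit message mentions a `normalizeQueryConcept` function with filler prefix stripping. That function's body is not part of this diff, so the sketch below is a hypothetical stand-in: the filler lists, `normalizeConceptSketch`, and `createDedupTracker` are all assumptions, chosen so that the three phrasings named in the prompt collapse to one key.

```javascript
// Hypothetical filler phrases; the real list in the repository may differ.
const FILLER_PREFIXES = ['definition of ', 'implementation of ', 'how does ', 'how ', 'where is ', 'find '];
const FILLER_SUFFIXES = [' is implemented', ' works', ' work'];

// Lowercases, collapses whitespace, then strips filler from both ends
// until nothing more can be removed.
function normalizeConceptSketch(query) {
  let q = query.trim().toLowerCase().replace(/\s+/g, ' ');
  let changed = true;
  while (changed) {
    changed = false;
    for (const prefix of FILLER_PREFIXES) {
      if (q.startsWith(prefix)) {
        q = q.slice(prefix.length);
        changed = true;
      }
    }
    for (const suffix of FILLER_SUFFIXES) {
      if (q.endsWith(suffix)) {
        q = q.slice(0, q.length - suffix.length);
        changed = true;
      }
    }
  }
  return q;
}

// Tracks normalized concepts so a second phrasing of the same concept
// can be rejected before an expensive subagent is spawned.
function createDedupTracker() {
  const seen = new Set();
  return {
    isDuplicate(query) {
      const key = normalizeConceptSketch(query);
      if (seen.has(key)) {
        return true;
      }
      seen.add(key);
      return false;
    }
  };
}
```

String normalization of this kind only catches near-identical phrasings. The commit also adds an LLM-based check (`checkDelegateDedup`) for semantically redundant queries, which this sketch does not attempt to reproduce.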
@@ -4088,9 +4053,16 @@ or
         const searchSummary = searchesTried.length > 0
           ? `\nSearches attempted: ${searchesTried.join(', ')}`
           : '';
+
+        // For code-searcher subagents: instruct to output structured JSON even on partial results
+        const isCodeSearcher = this.promptType === 'code-searcher';
+        const lastIterMessage = isCodeSearcher
+          ? `⚠️ LAST ITERATION — you are out of tool calls. Output your JSON response NOW with whatever files you have verified so far. Set confidence to "low" if your search was incomplete. Include the "searches" array listing all search queries you made with their paths and outcomes.${searchSummary}`
+          : `⚠️ LAST ITERATION — you are out of tool calls. Provide your BEST answer NOW with the information gathered so far. If you could not find what was requested, explain exactly what you searched for and why it did not work, so the caller can try a different approach.${searchSummary}`;
+
         return {
           toolChoice: 'none',
-          userMessage: `⚠️ LAST ITERATION — you are out of tool calls. Provide your BEST answer NOW with the information gathered so far. If you could not find what was requested, explain exactly what you searched for and why it did not work, so the caller can try a different approach.${searchSummary}`
+          userMessage: lastIterMessage
         };
       }

@@ -4766,27 +4738,41 @@ Double-check your response based on the criteria above. If everything looks good
       if (!finalResult || finalResult === DEFAULT_MAX_ITER_MSG) {
         try {
           const searchQueries = [];
+          const searchDetails = [];
           const toolCounts = {};
           for (const tc of _toolCallLog) {
             toolCounts[tc.name] = (toolCounts[tc.name] || 0) + 1;
             if (tc.name === 'search') {
               const q = tc.args.query || '';
+              const p = tc.args.path || '.';
               const exact = tc.args.exact ? ' (exact)' : '';
               searchQueries.push(`"${q}"${exact}`);
+              searchDetails.push({ query: q, path: p, had_results: false });
             }
           }
           const toolBreakdown = Object.entries(toolCounts)
             .map(([name, count]) => `${name}: ${count}x`)
             .join(', ');
           const uniqueSearches = [...new Set(searchQueries)];

-          let summary = `I was unable to complete your request after ${currentIteration} tool iterations.\n\n`;
-          summary += `Tool calls made: ${toolBreakdown || 'none'}\n`;
-          if (uniqueSearches.length > 0) {
-            summary += `Search queries tried: ${uniqueSearches.join(', ')}\n`;
+          // For code-searcher subagents: produce structured JSON so the parent
+          // can still use partial results instead of getting a plain error string.
+          if (this.promptType === 'code-searcher') {
+            finalResult = JSON.stringify({
+              confidence: 'low',
+              reason: 'Search incomplete — iteration limit reached',
+              groups: [],
+              searches: searchDetails
+            });
+          } else {
+            let summary = `I was unable to complete your request after ${currentIteration} tool iterations.\n\n`;
+            summary += `Tool calls made: ${toolBreakdown || 'none'}\n`;
+            if (uniqueSearches.length > 0) {
+              summary += `Search queries tried: ${uniqueSearches.join(', ')}\n`;
+            }
+            summary += `\nThe search approach may be fundamentally wrong for this query. Consider: using exact=true for literal string matching, using bash/grep for pattern-based file searches, or trying a completely different strategy instead of repeating similar searches.`;
+            finalResult = summary;
           }
-          summary += `\nThe search approach may be fundamentally wrong for this query. Consider: using exact=true for literal string matching, using bash/grep for pattern-based file searches, or trying a completely different strategy instead of repeating similar searches.`;
-          finalResult = summary;
         } catch {
           finalResult = DEFAULT_MAX_ITER_MSG;
         }
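
The post-loop fallback in the hunk above can be exercised in isolation with a standalone reduction. The function name `buildIterationLimitFallback` and its parameters are hypothetical; the field names and the two output shapes follow the diff, and the plain-text branch is shortened (it omits the final advice paragraph). Note that in the diff `had_results` is always `false` in this fallback, presumably because the tool call log shown here records arguments but not outcomes.

```javascript
// Standalone reduction of the post-loop fallback. `toolCallLog` is an
// array of { name, args } entries, as in the diff's _toolCallLog.
function buildIterationLimitFallback(toolCallLog, promptType, iterations) {
  const searchQueries = [];
  const searchDetails = [];
  const toolCounts = {};
  for (const tc of toolCallLog) {
    toolCounts[tc.name] = (toolCounts[tc.name] || 0) + 1;
    if (tc.name === 'search') {
      const args = tc.args || {};
      const q = args.query || '';
      const p = args.path || '.';
      const exact = args.exact ? ' (exact)' : '';
      searchQueries.push(`"${q}"${exact}`);
      // Outcomes are unknown at this point, so every entry is marked false.
      searchDetails.push({ query: q, path: p, had_results: false });
    }
  }

  if (promptType === 'code-searcher') {
    // Structured JSON so a parent agent can still read the search history.
    return JSON.stringify({
      confidence: 'low',
      reason: 'Search incomplete: iteration limit reached',
      groups: [],
      searches: searchDetails
    });
  }

  const toolBreakdown = Object.entries(toolCounts)
    .map(([name, count]) => `${name}: ${count}x`)
    .join(', ');
  const uniqueSearches = [...new Set(searchQueries)];
  let summary = `I was unable to complete your request after ${iterations} tool iterations.\n\n`;
  summary += `Tool calls made: ${toolBreakdown || 'none'}\n`;
  if (uniqueSearches.length > 0) {
    summary += `Search queries tried: ${uniqueSearches.join(', ')}\n`;
  }
  return summary;
}
```

Calling it with `promptType` set to `'code-searcher'` yields a JSON string a parent can parse; any other prompt type yields the human-readable summary.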

npm/src/delegate.js

Lines changed: 3 additions & 1 deletion
@@ -659,10 +659,12 @@ export async function delegate({
     });

     if (delegationSpan) {
+      const { truncateForSpan } = await import('./agent/simpleTelemetry.js');
       delegationSpan.setAttributes({
         'delegation.result.success': true,
         'delegation.result.response_length': response.length,
-        'delegation.result.duration_ms': duration
+        'delegation.result.duration_ms': duration,
+        'delegation.result': truncateForSpan(response, 4096)
       });
       delegationSpan.setStatus({ code: 1 }); // OK
       delegationSpan.end();
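
The implementation of `truncateForSpan` lives in `simpleTelemetry.js` and is not shown in this diff. As a rough idea of what capping a span attribute at 4096 characters might involve, here is a hypothetical stand-in; the name `truncateForSpanSketch`, the non-string handling, and the marker format are all assumptions.

```javascript
// Hypothetical stand-in for truncateForSpan(value, maxLength). Keeps the
// first maxLength characters and appends a marker noting how much was cut.
function truncateForSpanSketch(value, maxLength = 4096) {
  // Span attributes are strings here, so other values are serialized first.
  const text = typeof value === 'string' ? value : String(JSON.stringify(value));
  if (text.length <= maxLength) {
    return text;
  }
  const dropped = text.length - maxLength;
  return `${text.slice(0, maxLength)}... [truncated ${dropped} chars]`;
}
```

Capping the attribute keeps large delegate responses from inflating trace payloads while still leaving the start of the response visible on the success span.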
