Feat/deep research#11
Draft
Marujah wants to merge 4 commits into
Draft
Conversation
peeeteeer
pushed a commit
that referenced
this pull request
Apr 9, 2026
* ✨ feat: Add Config schema, model, and methods for role-based DB config overrides
Add the database foundation for principal-based configuration overrides
(user, group, role) in data-schemas. Includes schema with tenantId and
tenant isolation, CRUD methods, and barrel exports.
* 🔧 fix: Add shebang and enforce LF line endings for git hooks
The pre-commit hook was missing #!/bin/sh, and core.autocrlf=true was
converting it to CRLF, both causing "Exec format error" on Windows.
Add .gitattributes to force LF for .husky/* and *.sh files.
* ✨ feat: Add admin config API routes with section-level capability checks
Add /api/admin/config endpoints for managing per-principal config
overrides (user, group, role). Handlers in @librechat/api use DI pattern
with section-level hasConfigCapability checks for granular access control.
Supports full overrides replacement, per-field PATCH via dot-paths, field
deletion, toggle active, and listing.
* 🐛 fix: Move deleteConfigField fieldPath from URL param to request body
The path-to-regexp wildcard syntax (:fieldPath(*)) is not supported by
the version used in Express. Send fieldPath in the DELETE request body
instead, which also avoids URL-encoding issues with dotted paths.
* ✨ feat: Wire config resolution into getAppConfig with override caching
Add mergeConfigOverrides utility in data-schemas for deep-merging DB
config overrides into base AppConfig by priority order.
Update getAppConfig to query DB for applicable configs when role/userId
is provided, with short-TTL caching and a hasAnyConfigs feature flag
for zero-cost when no DB configs exist.
Also: add unique compound index on Config schema, pass userId from
config middleware, and signal config changes from admin API handlers.
* 🔄 refactor: Extract getAppConfig logic into packages/api as TS service
Move override resolution, caching strategy, and signalConfigChange from
api/server/services/Config/app.js into packages/api/src/app/appConfigService.ts
using the DI factory pattern (createAppConfigService). The JS file becomes
a thin wiring layer injecting loadBaseConfig, cache, and DB dependencies.
* 🧹 chore: Rename configResolution.ts to resolution.ts
* ✨ feat: Move admin types & capabilities to librechat-data-provider
Move SystemCapabilities, CapabilityImplications, and utility functions
(hasImpliedCapability, expandImplications) from data-schemas to
data-provider so they are available to external consumers like the
admin panel without a data-schemas dependency.
Add API-friendly admin types: TAdminConfig, TAdminSystemGrant,
TAdminAuditLogEntry, TAdminGroup, TAdminMember, TAdminUserSearchResult,
TCapabilityCategory, and CAPABILITY_CATEGORIES.
data-schemas re-exports these from data-provider and extends with
config-schema-derived types (ConfigSection, SystemCapability union).
Bump version to 0.8.500.
* feat: Add JSON-serializable admin config API response types to data-schemas
Add AdminConfig, AdminConfigListResponse, AdminConfigResponse, and
AdminConfigDeleteResponse types so both LibreChat API handlers and the
admin panel can share the same response contract. Bump version to 0.0.41.
* refactor: Move admin capabilities & types from data-provider to data-schemas
SystemCapabilities, CapabilityImplications, utility functions,
CAPABILITY_CATEGORIES, and admin API response types should not be in
data-provider as it gets compiled into the frontend bundle, exposing
the capability surface. Moved everything to data-schemas (server-only).
All consumers already import from @librechat/data-schemas, so no
import changes needed elsewhere. Consolidated duplicate AdminConfig
type (was in both config.ts and admin.ts).
* chore: Bump @librechat/data-schemas to 0.0.42
* refactor: Reorganize admin capabilities into admin/ and types/admin.ts
Split systemCapabilities.ts following data-schemas conventions:
- Types (BaseSystemCapability, SystemCapability, AdminConfig, etc.)
→ src/types/admin.ts
- Runtime code (SystemCapabilities, CapabilityImplications, utilities)
→ src/admin/capabilities.ts
Revert data-provider version to 0.8.401 (no longer modified).
* chore: Fix import ordering, rename appConfigService to service
- Rename app/appConfigService.ts → app/service.ts (directory provides context)
- Fix import order in admin/config.ts, types/admin.ts, types/config.ts
- Add naming convention to AGENTS.md
* feat: Add DB base config support (role/__base__)
- Add BASE_CONFIG_PRINCIPAL_ID constant for reserved base config doc
- getApplicableConfigs always includes __base__ in queries
- getAppConfig queries DB even without role/userId when DB configs exist
- Bump @librechat/data-schemas to 0.0.43
* fix: Address PR review issues for admin config
- Add listAllConfigs method; listConfigs endpoint returns all active
configs instead of only __base__
- Normalize principalId to string in all config methods to prevent
ObjectId vs string mismatch on user/group lookups
- Block __proto__ and all dunder-prefixed segments in field path
validation to prevent prototype pollution
- Fix configVersion off-by-one: default to 0, guard pre('save') with
!isNew, use $inc on findOneAndUpdate
- Remove unused getApplicableConfigs from admin handler deps
* fix: Enable tree-shaking for data-schemas, bump packages
- Switch data-schemas Rollup output to preserveModules so each source
file becomes its own chunk; consumers (admin panel) can now import
just the modules they need without pulling in winston/mongoose/etc.
- Add sideEffects: false to data-schemas package.json
- Bump data-schemas to 0.0.44, data-provider to 0.8.402
* feat: add capabilities subpath export to data-schemas
Adds `@librechat/data-schemas/capabilities` subpath export so browser
consumers can import BASE_CONFIG_PRINCIPAL_ID and capability constants
without pulling in Node.js-only modules (winston, async_hooks, etc.).
Bump version to 0.0.45.
* fix: include dist/ in data-provider npm package
Add explicit files field so npm includes dist/types/ in the published
package. Without this, the root .gitignore exclusion of dist/ causes
npm to omit type declarations, breaking TypeScript consumers.
* chore: bump librechat-data-provider to 0.8.403
* feat: add GET /api/admin/config/base for raw AppConfig
Returns the full AppConfig (YAML + DB base merged) so the admin panel
can display actual config field values and structure. The startup config
endpoint (/api/config) returns TStartupConfig which is a different shape
meant for the frontend app.
* chore: imports order
* fix: address code review findings for admin config
Critical:
- Fix clearAppConfigCache: was deleting from wrong cache store (CONFIG_STORE
instead of APP_CONFIG), now clears BASE and HAS_DB_CONFIGS keys
- Eliminate race condition: patchConfigField and deleteConfigField now use
atomic MongoDB $set/$unset with dot-path notation instead of
read-modify-write cycles, removing the lost-update bug entirely
- Add patchConfigFields and unsetConfigField atomic DB methods
Major:
- Reorder cache check before principal resolution in getAppConfig so
getUserPrincipals DB query only fires on cache miss
- Replace '' as ConfigSection with typed BROAD_CONFIG_ACCESS constant
- Parallelize capability checks with Promise.all instead of sequential
awaits in for loops
- Use loose equality (== null) for cache miss check to handle both null
and undefined returns from cache implementations
- Set HAS_DB_CONFIGS_KEY to true on successful config fetch
Minor:
- Remove dead pre('save') hook from config schema (all writes use
findOneAndUpdate which bypasses document hooks)
- Consolidate duplicate type imports in resolution.ts
- Remove dead deepGet/deepSet/deepUnset functions (replaced by atomic ops)
- Add .sort({ priority: 1 }) to getApplicableConfigs query
- Rename _impliedBy to impliedByMap
* fix: self-referencing BROAD_CONFIG_ACCESS constant
* fix: replace type-cast sentinel with proper null parameter
Update hasConfigCapability to accept ConfigSection | null where null
means broad access check (MANAGE_CONFIGS or READ_CONFIGS only).
Removes the '' as ConfigSection type lie from admin config handlers.
* fix: remaining review findings + add tests
- listAllConfigs accepts optional { isActive } filter so admin listing
can show inactive configs (#9)
- Standardize session application to .session(session ?? null) across
all config DB methods (danny-avila#15)
- Export isValidFieldPath and getTopLevelSection for testability
- Add 38 tests across 3 spec files:
- config.spec.ts (api): path validation, prototype pollution rejection
- resolution.spec.ts: deep merge, priority ordering, array replacement
- config.spec.ts (data-schemas): full CRUD, ObjectId normalization,
atomic $set/$unset, configVersion increment, toggle, __base__ query
* fix: address second code review findings
- Fix cross-user cache contamination: overrideCacheKey now handles
userId-without-role case with its own cache key (#1)
- Add broad capability check before DB lookup in getConfig to prevent
config existence enumeration (#2/#3)
- Move deleteConfigField fieldPath from request body to query parameter
for proxy/load balancer compatibility (#5)
- Derive BaseSystemCapability from SystemCapabilities const instead of
manual string union (#6)
- Return 201 on upsert creation, 200 on update (#11)
- Remove inline narration comments per AGENTS.md (danny-avila#12)
- Type overrides as Partial<TCustomConfig> in DB methods and handler
deps (danny-avila#13)
- Replace double as-unknown-as casts in resolution.ts with generic
deepMerge<T> (danny-avila#14)
- Make override cache TTL injectable via AppConfigServiceDeps (danny-avila#16)
- Add exhaustive never check in principalModel switch (danny-avila#17)
* fix: remaining review findings — tests, rename, semantics
- Rename signalConfigChange → markConfigsDirty with JSDoc documenting
the stale-window tradeoff and overrideCacheTtl knob
- Fix DEFAULT_OVERRIDE_CACHE_TTL naming convention
- Add createAppConfigService tests (14 cases): cache behavior, feature
flag, cross-user key isolation, fallback on error, markConfigsDirty
- Add admin handler integration tests (13 cases): auth ordering,
201/200 on create/update, fieldPath from query param, markConfigsDirty
calls, capability checks
* fix: global flag corruption + empty overrides auth bypass
- Remove HAS_DB_CONFIGS_KEY=false optimization: a scoped query returning
no configs does not mean no configs exist globally. Setting the flag
false from a per-principal query short-circuited all subsequent users.
- Add broad manage capability check before section checks in
upsertConfigOverrides: empty overrides {} no longer bypasses auth.
* test: add regression and invariant tests for config system
Regression tests:
- Bug 1: User A's empty result does not short-circuit User B's overrides
- Bug 2: Empty overrides {} returns 403 without MANAGE_CONFIGS
Invariant tests (applied across ALL handlers):
- All 5 mutation handlers call markConfigsDirty on success
- All 5 mutation handlers return 401 without auth
- All 5 mutation handlers return 403 without capability
- All 3 read handlers return 403 without capability
* fix: third review pass — all findings addressed
Service (service.ts):
- Restore HAS_DB_CONFIGS=false for base-only queries (no role/userId)
so deployments with zero DB configs skip DB queries (#1)
- Resolve cache once at factory init instead of per-invocation (#8)
- Use BASE_CONFIG_PRINCIPAL_ID constant in overrideCacheKey (#10)
- Add JSDoc to clearAppConfigCache documenting stale-window (#4)
- Fix log message to not say "from YAML" (danny-avila#14)
Admin handlers (config.ts):
- Use configVersion===1 for 201 vs 200, eliminating TOCTOU race (#2)
- Add Array.isArray guard on overrides body (#5)
- Import CapabilityUser from capabilities.ts, remove duplicate (#6)
- Replace as-unknown-as cast with targeted type assertion (#7)
- Add MAX_PATCH_ENTRIES=100 cap on entries array (danny-avila#15)
- Reorder deleteConfigField to validate principalType first (danny-avila#12)
- Export CapabilityUser from middleware/capabilities.ts
DB methods (config.ts):
- Remove isActive:true from patchConfigFields to prevent silent
reactivation of disabled configs (#3)
Schema (config.ts):
- Change principalId from Schema.Types.Mixed to String (#11)
Tests:
- Add patchConfigField unsafe fieldPath rejection test (#9)
- Add base-only HAS_DB_CONFIGS=false test (#1)
- Update 201/200 tests to use configVersion instead of findConfig (#2)
* fix: add read handler 401 invariant tests + document flag behavior
- Add invariant: all 3 read handlers return 401 without auth
- Document on markConfigsDirty that HAS_DB_CONFIGS stays true after
all configs are deleted until clearAppConfigCache or restart
* fix: remove HAS_DB_CONFIGS false optimization entirely
getApplicableConfigs([]) only queries for __base__, not all configs.
A deployment with role/group configs but no __base__ doc gets the
flag poisoned to false by a base-only query, silently ignoring all
scoped overrides. The optimization is not safe without a comprehensive
Config.exists() check, which adds its own DB cost. Removed entirely.
The flag is now write-once-true (set when configs are found or by
markConfigsDirty) and only cleared by clearAppConfigCache/restart.
* chore: reorder import statements in app.js for clarity
* refactor: remove HAS_DB_CONFIGS_KEY machinery entirely
The three-state flag (false/null/true) was the source of multiple bugs
across review rounds. Every attempt to safely set it to false was
defeated by getApplicableConfigs querying only a subset of principals.
Removed: HAS_DB_CONFIGS_KEY constant, all reads/writes of the flag,
markConfigsDirty (now a no-op concept), notifyChange wrapper, and all
tests that seeded false manually.
The per-user/role TTL cache (overrideCacheTtl, default 60s) is the
sole caching mechanism. On cache miss, getApplicableConfigs queries
the DB. This is one indexed query per user per TTL window — acceptable
for the config override use case.
* docs: rewrite admin panel remaining work with current state
* perf: cache empty override results to avoid repeated DB queries
When getApplicableConfigs returns no configs for a principal, cache
baseConfig under their override key with TTL. Without this, every
user with no per-principal overrides hits MongoDB on every request
after the 60s cache window expires.
* fix: add tenantId to cache keys + reject PUBLIC principal type
- Include tenantId in override cache keys to prevent cross-tenant
config contamination. Single-tenant deployments (tenantId undefined)
use '_' as placeholder — no behavior change for them.
- Reject PrincipalType.PUBLIC in admin config validation — PUBLIC has
no PrincipalModel and is never resolved by getApplicableConfigs,
so config docs for it would be dead data.
- Config middleware passes req.user.tenantId to getAppConfig.
* fix: fourth review pass findings
DB methods (config.ts):
- findConfigByPrincipal accepts { includeInactive } option so admin
GET can retrieve inactive configs (#5)
- upsertConfig catches E11000 duplicate key on concurrent upserts and
retries without upsert flag (#2)
- unsetConfigField no longer filters isActive:true, consistent with
patchConfigFields (#11)
- Typed filter objects replace Record<string, unknown> (danny-avila#12)
Admin handlers (config.ts):
- patchConfigField: serial broad capability check before Promise.all
to pre-warm ALS principal cache, preventing N parallel DB calls (#3)
- isValidFieldPath rejects leading/trailing dots and consecutive
dots (#7)
- Duplicate fieldPaths in patch entries return 400 (#8)
- DEFAULT_PRIORITY named constant replaces hardcoded 10 (danny-avila#14)
- Admin getConfig and patchConfigField pass includeInactive to
findConfigByPrincipal (#5)
- Route import uses barrel instead of direct file path (danny-avila#13)
Resolution (resolution.ts):
- deepMerge has MAX_MERGE_DEPTH=10 guard to prevent stack overflow
from crafted deeply nested configs (#4)
* fix: final review cleanup
- Remove ADMIN_PANEL_REMAINING.md (local dev notes with Windows paths)
- Add empty-result caching regression test
- Add tenantId to AdminConfigDeps.getAppConfig type
- Restore exhaustive never check in principalModel switch
- Standardize toggleConfigActive session handling to options pattern
* fix: validate priority in patchConfigField handler
Add the same non-negative number validation for priority that
upsertConfigOverrides already has. Without this, invalid priority
values could be stored via PATCH and corrupt merge ordering.
* chore: remove planning doc from PR
* fix: correct stale cache key strings in service tests
* fix: clean up service tests and harden tenant sentinel
- Remove no-op cache delete lines from regression tests
- Change no-tenant sentinel from '_' to '__default__' to avoid
collision with a real tenant ID when multi-tenancy is enabled
- Remove unused CONFIG_STORE from AppConfigServiceDeps
* chore: bump @librechat/data-schemas to 0.0.46
* fix: block prototype-poisoning keys in deepMerge
Skip __proto__, constructor, and prototype keys during config merge
to prevent prototype pollution via PUT /api/admin/config overrides.
peeeteeer
pushed a commit
that referenced
this pull request
Apr 9, 2026
…anny-avila#12435) * feat: add MCPServerSource type, tenantMcpPolicy schema, and source-based dbSourced wiring - Add `tenantMcpPolicy` to `mcpSettings` in YAML config schema with `enabled`, `maxServersPerTenant`, `allowedTransports`, and `allowedDomains` - Add `MCPServerSource` type ('yaml' | 'config' | 'user') and `source` field to `ParsedServerConfig` - Change `dbSourced` determination from `!!config.dbId` to `config.source === 'user'` across MCPManager, ConnectionsRepository, UserConnectionManager, and MCPServerInspector - Set `source: 'user'` on all DB-sourced servers in ServerConfigsDB * feat: three-layer MCPServersRegistry with config cache and lazy init - Add `configCacheRepo` as third repository layer between YAML cache and DB for admin-defined config-source MCP servers - Implement `ensureConfigServers()` that identifies config-override servers from resolved `getAppConfig()` mcpConfig, lazily inspects them, and caches parsed configs with `source: 'config'` - Add `lazyInitConfigServer()` with timeout, stub-on-failure, and concurrent-init deduplication via `pendingConfigInits` map - Extend `getAllServerConfigs()` with optional `configServers` param for three-way merge: YAML → Config → User - Add `getServerConfig()` lookup through config cache layer - Add `invalidateConfigCache()` for clearing config-source inspection results on admin config mutations - Tag `source: 'yaml'` on CACHE-stored servers and `source: 'user'` on DB-stored servers in `addServer()` and `addServerStub()` * feat: wire tenant context into MCP controllers, services, and cache invalidation - Resolve config-source servers via `getAppConfig({ role, tenantId })` in `getMCPTools()` and `getMCPServersList()` controllers - Pass `ensureConfigServers()` results through `getAllServerConfigs()` for three-way merge of YAML + Config + User servers - Add tenant/role context to `getMCPSetupData()` and connection status routes via `getTenantId()` from ALS - Add `clearMcpConfigCache()` to `invalidateConfigCaches()` so admin config mutations trigger re-inspection of config-source MCP servers * feat: enforce tenantMcpPolicy on admin config mcpServers mutations - Add `validateMcpServerPolicy()` helper that checks mcpServers against operator-defined `tenantMcpPolicy` (enabled, maxServersPerTenant, allowedTransports, allowedDomains) - Wire validation into `upsertConfigOverrides` and `patchConfigField` handlers — rejects with 403 when policy is violated - Infer transport type from config shape (command → stdio, url protocol → websocket/sse, type field → streamable-http) - Validate server domains against policy allowlist when configured * revert: remove tenantMcpPolicy schema and enforcement The existing admin config CRUD routes already provide the mechanism for granular MCP server prepopulation (groups, roles, users). The tenantMcpPolicy gating adds unnecessary complexity that can be revisited if needed in the future. - Remove tenantMcpPolicy from mcpSettings Zod schema - Remove validateMcpServerPolicy helper and TenantMcpPolicy interface - Remove policy enforcement from upsertConfigOverrides and patchConfigField handlers * test: update test assertions for source field and config-server wiring - Use objectContaining in MCPServersRegistry reset test to account for new source: 'yaml' field on CACHE-stored configs - Add getTenantId and ensureConfigServers mocks to MCP route tests - Add getAppConfig mock to route test Config service mock - Update getMCPSetupData assertion to expect second options argument - Update getAllServerConfigs assertions for new configServers parameter * fix: disconnect active connections when config-source servers are evicted When admin config overrides change and config-source MCP servers are removed, the invalidation now proactively disconnects active connections for evicted servers instead of leaving them lingering until timeout. - Return evicted server names from invalidateConfigCache() - Disconnect app-level connections for evicted servers in clearMcpConfigCache() via MCPManager.appConnections.disconnect() * fix: address code review findings (CRITICAL, MAJOR, MINOR) CRITICAL fixes: - Scope configCacheRepo keys by config content hash to prevent cross-tenant cache poisoning when two tenants define the same server name with different configurations - Change dbSourced checks from `source === 'user'` to `source !== 'yaml' && source !== 'config'` so undefined source (pre-upgrade cached configs) fails closed to restricted mode MAJOR fixes: - Derive OAuth servers from already-computed mcpConfig instead of calling getOAuthServers() separately — config-source OAuth servers are now properly detected - Add parseInt radix (10) and NaN guard with fallback to 30_000 for CONFIG_SERVER_INIT_TIMEOUT_MS - Add CONFIG_CACHE_NAMESPACE to aggregate-key branch in ServerConfigsCacheFactory to avoid SCAN-based Redis stalls - Remove `if (role || tenantId)` guard in getMCPSetupData — config servers now always resolve regardless of tenant context MINOR fixes: - Extract resolveAllMcpConfigs() helper in mcp controller to eliminate 3x copy-pasted config resolution boilerplate - Distinguish "not initialized" from real errors in clearMcpConfigCache — log actual failures instead of swallowing - Remove narrative inline comments per style guide - Remove dead try/catch inside Promise.allSettled in ensureConfigServers (inner method never throws) - Memoize YAML server names to avoid repeated cacheConfigsRepo.getAll() calls per request Test updates: - Add ensureConfigServers mock to registry test fixtures - Update getMCPSetupData assertions for inline OAuth derivation * fix: address code review findings (CRITICAL, MAJOR, MINOR) CRITICAL fixes: - Break circular dependency: move CONFIG_CACHE_NAMESPACE from MCPServersRegistry to ServerConfigsCacheFactory - Fix dbSourced fail-closed: use source field when present, fall back to legacy dbId check when absent (backward-compatible with pre-upgrade cached configs that lack source field) MAJOR fixes: - Add CONFIG_CACHE_NAMESPACE to aggregate-key set in ServerConfigsCacheFactory to avoid SCAN-based Redis stalls - Add comprehensive test suite (ensureConfigServers.test.ts, 18 tests) covering lazy init, stub-on-failure, cross-tenant isolation via config hash keys, concurrent deduplication, merge order, and cache invalidation MINOR fixes: - Update MCPServerInspector test assertion for dbSourced change * fix: restore getServerConfig lookup for config-source servers (NEW-1) Add configNameToKey map that indexes server name → hash-based cache key for O(1) lookup by name in getServerConfig. This restores the config cache layer that was dropped when hash-based keys were introduced. Without this fix, config-source servers appeared in tool listings (via getAllServerConfigs) but getServerConfig returned undefined, breaking all connection and tool call paths. - Populate configNameToKey in ensureSingleConfigServer - Clear configNameToKey in invalidateConfigCache and reset - Clear stale read-through cache entries after lazy init - Remove dead code in invalidateConfigCache (config.title, key parsing) - Add getServerConfig tests for config-source server lookup * fix: eliminate configNameToKey race via caller-provided configServers param Replace the process-global configNameToKey map (last-writer-wins under concurrent multi-tenant load) with a configServers parameter on getServerConfig. Callers pass the pre-resolved config servers map directly — no shared mutable state, no cross-tenant race. - Add optional configServers param to getServerConfig; when provided, returns matching config directly without any global lookup - Remove configNameToKey map entirely (was the source of the race) - Extract server names from cache keys via lastIndexOf in invalidateConfigCache (safe for names containing colons) - Use mcpConfig[serverName] directly in getMCPTools instead of a redundant getServerConfig call - Add cross-tenant isolation test for getServerConfig * fix: populate read-through cache after config server lazy init After lazyInitConfigServer succeeds, write the parsed config to readThroughCache keyed by serverName so that getServerConfig calls from ConnectionsRepository, UserConnectionManager, and MCPManager.callTool find the config without needing configServers. Without this, config-source servers appeared in tool listings but every connection attempt and tool call returned undefined. * fix: user-scoped getServerConfig fallback to server-only cache key When getServerConfig is called with a userId (e.g., from callTool or UserConnectionManager), the cache key is serverName::userId. Config-source servers are cached under the server-only key (no userId). Add a fallback so user-scoped lookups find config-source servers in the read-through cache. * fix: configCacheRepo fallback, isUserSourced DRY, cross-process race CRITICAL: Add findInConfigCache fallback in getServerConfig so config-source servers remain reachable after readThroughCache TTL expires (5s). Without this, every tool call after 5s returned undefined for config-source servers. MAJOR: Extract isUserSourced() helper to mcp/utils.ts and replace all 5 inline dbSourced ternary expressions (MCPManager x2, ConnectionsRepository, UserConnectionManager, MCPServerInspector). MAJOR: Fix cross-process Redis race in lazyInitConfigServer — when configCacheRepo.add throws (key exists from another process), fall back to reading the existing entry instead of returning undefined. MINOR: Parallelize invalidateConfigCache awaits with Promise.all. Remove redundant .catch(() => {}) inside Promise.allSettled. Tighten dedup test assertion to toBe(1). Add TTL-expiry tests for getServerConfig (with and without userId). * feat: thread configServers through getAppToolFunctions and formatInstructionsForContext Add optional configServers parameter to getAppToolFunctions, getInstructions, and formatInstructionsForContext so config-source server tools and instructions are visible to agent initialization and context injection paths. Existing callers (boot-time init, tests) pass no argument and continue to work unchanged. Agent runtime paths can now thread resolved config servers from request context. * fix: stale failure stubs retry after 5 min, upsert for cross-process races - Add CONFIG_STUB_RETRY_MS (5 min) — stale failure stubs are retried instead of permanently disabling config-source servers after transient errors (DNS outage, cold-start race) - Extract upsertConfigCache() helper that tries add then falls back to update, preventing cross-process Redis races where a second instance's successful inspection result was discarded - Add test for stale-stub retry after CONFIG_STUB_RETRY_MS * fix: stamp updatedAt on failure stubs, null-guard callTool config, test cleanup - Add updatedAt: Date.now() to failure stubs in lazyInitConfigServer so CONFIG_STUB_RETRY_MS (5 min) window works correctly — without it, stubs were always considered stale (updatedAt ?? 0 → epoch → always expired) - Add null guard for rawConfig in MCPManager.callTool before passing to preProcessGraphTokens — prevents unsafe `as` cast on undefined - Log double-failure in upsertConfigCache instead of silently swallowing - Replace module-scope Date.now monkey-patch with jest.useFakeTimers / jest.setSystemTime / jest.useRealTimers in ensureConfigServers tests * fix: server-only readThrough fallback only returns truthy values Prevents a cached undefined from a prior no-userId lookup from short-circuiting the DB query on a subsequent userId-scoped lookup. * fix: remove findInConfigCache to eliminate cross-tenant config leakage The findInConfigCache prefix scan (serverName:*) could return any tenant's config after readThrough TTL expires, violating tenant isolation. Config-source servers are now ONLY resolvable through: 1. The configServers param (callers with tenant context from ALS) 2. The readThrough cache (populated by ensureSingleConfigServer, 5s TTL, repopulated on every HTTP request via resolveAllMcpConfigs) Connection/tool-call paths without tenant context rely exclusively on the readThrough cache. If it expires before the next HTTP request repopulates it, the server is not found — which is correct because there is no tenant context to determine which config to return. - Remove findInConfigCache method and its call in getServerConfig - Update server-only readThrough fallback to only return truthy values (prevents cached undefined from short-circuiting user-scoped DB lookup) - Update tests to document tenant isolation behavior after cache expiry * style: fix import order per AGENTS.md conventions Sort package imports shortest-to-longest, local imports longest-to-shortest across MCPServersRegistry, ConnectionsRepository, MCPManager, UserConnectionManager, and MCPServerInspector. * fix: eliminate cross-tenant readThrough contamination and TTL-expiry tool failures Thread pre-resolved serverConfig from tool creation context into callTool, removing dependency on the readThrough cache for config-source servers. This fixes two issues: - Cross-tenant contamination: the readThrough cache key was unscoped (just serverName), so concurrent multi-tenant requests for same-named servers would overwrite each other's entries - TTL expiry: tool calls happening >5s after config resolution would fail with "Configuration not found" because the readThrough entry had expired Changes: - Add optional serverConfig param to MCPManager.callTool — uses provided config directly, falling back to getServerConfig lookup for YAML/user servers - Thread serverConfig from createMCPTool through createToolInstance closure to callTool - Remove readThrough write from ensureSingleConfigServer — config-source servers are only accessible via configServers param (tenant-scoped) - Remove server-only readThrough fallback from getServerConfig - Increase config cache hash from 8 to 16 hex chars (64-bit) - Add isUserSourced boundary tests for all source/dbId combinations - Fix double Object.keys call in getMCPTools controller - Update test assertions for new getServerConfig behavior * fix: cache base configs for config-server users; narrow upsertConfigCache error handling - Refactor getAllServerConfigs to separate base config fetch (YAML + DB) from config-server layering. Base configs are cached via readThroughCacheAll regardless of whether configServers is provided, eliminating uncached MongoDB queries per request for config-server users - Narrow upsertConfigCache catch to duplicate-key errors only; infrastructure errors (Redis timeouts, network failures) now propagate instead of being silently swallowed, preventing inspection storms during outages * fix: restore correct merge order and document upsert error matching - Restore YAML → Config → User DB precedence in getAllServerConfigs (user DB servers have highest precedence, matching the JSDoc contract) - Add source comment on upsertConfigCache duplicate-key detection linking to the two cache implementations that define the error message * feat: complete config-source server support across all execution paths Wire configServers through the entire agent execution pipeline so config-source MCP servers are fully functional — not just visible in listings but executable in agent sessions. - Thread configServers into handleTools.js agent tool pipeline: resolve config servers from tenant context before MCP tool iteration, pass to getServerConfig, createMCPTools, and createMCPTool - Thread configServers into agent instructions pipeline: applyContextToAgent → getMCPInstructionsForServers → formatInstructionsForContext, resolved in client.js before agent context application - Add configServers param to createMCPTool and createMCPTools for reconnect path fallback - Add source field to redactServerSecrets allowlist for client UI differentiation of server tiers - Narrow invalidateConfigCache to only clear readThroughCacheAll (merged results), preserving YAML individual-server readThrough entries - Update context.spec.ts assertions for new configServers parameter * fix: add missing mocks for config-source server dependencies in client.test.js Mock getMCPServersRegistry, getAppConfig, and getTenantId that were added to client.js but not reflected in the test file's jest.mock declarations. * fix: update formatInstructionsForContext assertions for configServers param The test assertions expected formatInstructionsForContext to be called with only the server names array, but it now receives configServers as a second argument after the config-source server feature wiring. * fix: move configServers resolution before MCP tool loop to avoid TDZ configServers was declared with `let` after the first tool loop but referenced inside it via getServerConfig(), causing a ReferenceError temporal dead zone. Move declaration and resolution before the loop, using tools.some(mcpToolPattern) to gate the async resolution. * fix: address review findings — cache bypass, discoverServerTools gap, DRY - #2: getAllServerConfigs now always uses getBaseServerConfigs (cached via readThroughCacheAll) instead of bypassing it when configServers is present. Extracts user-DB entries from cached base by diffing against YAML keys to maintain YAML → Config → User DB merge order without extra MongoDB calls. - #3: Add configServers param to ToolDiscoveryOptions and thread it through discoverServerTools → getServerConfig so config-source servers are discoverable during OAuth reconnection flows. - #6: Replace inline import() type annotations in context.ts with proper import type { ParsedServerConfig } per AGENTS.md conventions. - #7: Extract resolveConfigServers(req) helper in MCP.js and use it from handleTools.js and client.js, eliminating the duplicated 6-line config resolution pattern. - #10: Restore removed "why" comment explaining getLoaded() vs getAll() choice in getMCPSetupData — documents non-obvious correctness constraint. - #11: Fix incomplete JSDoc param type on resolveAllMcpConfigs. * fix: consolidate imports, reorder constants, fix YAML-DB merge edge case - Merge duplicate @librechat/data-schemas requires in MCP.js into one - Move resolveConfigServers after module-level constants - Fix getAllServerConfigs edge case where user-DB entry overriding a YAML entry with the same name was excluded from userDbConfigs; now uses reference equality check to detect DB-overwritten YAML keys * fix: replace fragile string-match error detection with proper upsert method Add upsert() to IServerConfigsRepositoryInterface and all implementations (InMemory, Redis, RedisAggregateKey, DB). This eliminates the brittle error message string match ('already exists in cache') in upsertConfigCache that was the only thing preventing cross-process init races from silently discarding inspection results. Each implementation handles add-or-update atomically: - InMemory: direct Map.set() - Redis: direct cache.set() - RedisAggregateKey: read-modify-write under write lock - DB: delegates to update() (DB servers use explicit add() with ACL setup) * fix: wire configServers through remaining HTTP endpoints - getMCPServerById: use resolveAllMcpConfigs instead of bare getServerConfig - reinitialize route: resolve configServers before getServerConfig - auth-values route: resolve configServers before getServerConfig - getOAuthHeaders: accept configServers param, thread from callers - Update mcp.spec.js tests to mock getAllServerConfigs for GET by name * fix: thread serverConfig through getConnection for config-source servers Config-source servers exist only in configCacheRepo, not in YAML cache or DB. When callTool → getConnection → getUserConnection → getServerConfig runs without configServers, it returns undefined and throws. Fix by threading the pre-resolved serverConfig (providedConfig) from callTool through getConnection → getUserConnection → createUserConnectionInternal, using it as a fallback before the registry lookup. * fix: thread configServers through reinit, reconnect, and tool definition paths Wire configServers through every remaining call chain that creates or reconnects MCP server connections: - reinitMCPServer: accepts serverConfig and configServers, uses them for getServerConfig fallback, getConnection, and discoverServerTools - reconnectServer: accepts and passes configServers to reinitMCPServer - createMCPTools/createMCPTool: pass configServers to reconnectServer - ToolService.loadToolDefinitionsWrapper: resolves configServers from req, passes to both reinitMCPServer call sites - reinitialize route: passes serverConfig and configServers to reinitMCPServer * fix: address review findings — simplify merge, harden error paths, fix log labels - Simplify getAllServerConfigs merge: replace fragile reference-equality loop with direct spread { ...yamlConfigs, ...configServers, ...base } - Guard upsertConfigCache in lazyInitConfigServer catch block so cache failures don't mask the original inspection error - Deduplicate getYamlServerNames cold-start with promise dedup pattern - Remove dead `if (!mcpConfig)` guard in getMCPSetupData - Fix hardcoded "App server" in ServerConfigsCacheRedisAggregateKey error messages — now uses this.namespace for correct Config/App labeling - Remove misleading OAuth callback comment about readThrough cache - Move resolveConfigServers after module-level constants in MCP.js * fix: clear rejected yamlServerNames promise, fix config-source reinspect, fix reset log label - Clear yamlServerNamesPromise on rejection so transient cache errors don't permanently prevent ensureConfigServers from working - Skip reinspectServer for config-source servers (source: 'config') in reinitMCPServer — they lack a CACHE/DB storage location; retry is handled by CONFIG_STUB_RETRY_MS in ensureConfigServers - Use source field instead of dbId for storageLocation derivation - Fix remaining hardcoded "App" in reset() leaderCheck message * fix: persist oauthHeaders in flow state for config-source OAuth servers The OAuth callback route has no JWT auth context and cannot resolve config-source server configs. Previously, getOAuthHeaders would silently return {} for config-source servers, dropping custom token exchange headers. Now oauthHeaders are persisted in MCPOAuthFlowMetadata during flow initiation (which has auth context), and the callback reads them from the stored flow state with a fallback to the registry lookup for YAML/user-DB servers. * fix: update tests for getMCPSetupData null guard removal and ToolService mock - MCP.spec.js: update test to expect graceful handling of null mcpConfig instead of a throw (getAllServerConfigs always returns an object) - MCP.js: add defensive || {} for Object.entries(mcpConfig) in case of null from test mocks - ToolService.spec.js: add missing mock for ~/server/services/MCP (resolveConfigServers) * fix: address review findings — DRY, naming, logging, dead code, defensive guards - #1: Simplify getAllServerConfigs to single getBaseServerConfigs call, eliminating redundant double-fetch of cacheConfigsRepo.getAll() - #2: Add warning log when oauthHeaders absent from OAuth callback flow state - #3: Extract resolveAllMcpConfigs to MCP.js service layer; controller imports shared helper instead of reimplementing - #4: Rename _serverConfig/_provider to capturedServerConfig/capturedProvider in createToolInstance — these are actively used, not unused - #5: Log rejected results from ensureConfigServers Promise.allSettled so cache errors are visible instead of silently dropped - #6: Remove dead 'MCP config not found' error handlers from routes - #7: Document circular-dependency reason for dynamic require in clearMcpConfigCache - #8: Remove logger.error from withTimeout to prevent double-logging timeouts - #10: Add explicit userId guard in ServerConfigsDB.upsert with clear error message - danny-avila#12: Use spread instead of mutation in addServer for immutability consistency - Add upsert mock to ensureConfigServers.test.ts DB mock - Update route tests for resolveAllMcpConfigs import change * fix: restore correct merge priority, use immutable spread, fix test mock - getAllServerConfigs: { ...configServers, ...base } so userDB wins over configServers, matching documented "User DB (highest)" priority - lazyInitConfigServer: use immutable spread instead of direct mutation for parsedConfig.source, consistent with addServer fix - Fix test to mock getAllServerConfigs as {} instead of null, remove unnecessary || {} defensive guard in getMCPSetupData * fix: error handling, stable hashing, flatten nesting, remove dead param - Wrap resolveConfigServers/resolveAllMcpConfigs in try/catch with graceful {} fallback so transient DB/cache errors don't crash tool pipeline - Sort keys in configCacheKey JSON.stringify for deterministic hashing regardless of object property insertion order - Flatten clearMcpConfigCache from 3 nested try-catch to early returns; document that user connections are cleaned up lazily (accepted tradeoff) - Remove dead configServers param from getAppToolFunctions (never passed) - Add security rationale comment for source field in redactServerSecrets * fix: use recursive key-sorting replacer in configCacheKey to prevent cross-tenant cache collision The array replacer in JSON.stringify acts as a property allowlist at every nesting depth, silently dropping nested keys like headers['X-API-Key'], oauth.client_secret, etc. Two configs with different nested values but identical top-level structure produced the same hash, causing cross-tenant cache hits and potential credential contamination. Switch to a function replacer that recursively sorts keys at all depths without dropping any properties. Also document the known gap in getOAuthServers: config-source OAuth servers are not covered by auto-reconnection or uninstall cleanup because callers lack request context. * fix: move clearMcpConfigCache to packages/api to eliminate circular dependency The function only depends on MCPServersRegistry and MCPManager, both of which live in packages/api. Import it directly from @librechat/api in the CJS layer instead of using dynamic require('~/config'). * chore: imports/fields ordering * fix: address review findings — error handling, targeted lookup, test gaps - Narrow resolveAllMcpConfigs catch to only wrap ensureConfigServers so getAppConfig/getAllServerConfigs failures propagate instead of masking infrastructure errors as empty server lists. - Use targeted getServerConfig in getMCPServerById instead of fetching all server configs for a single-server lookup. - Forward configServers to inner createMCPTool calls so reconnect path works for config-source servers. - Update getAllServerConfigs JSDoc to document disjoint-key design. - Add OAuth callback oauthHeaders fallback tests (flow state present vs registry fallback). - Add resolveConfigServers/resolveAllMcpConfigs unit tests covering happy path and error propagation. * fix: add getOAuthReconnectionManager mock to OAuth callback tests * chore: imports ordering
peeeteeer
pushed a commit
that referenced
this pull request
May 6, 2026
…riming (danny-avila#12649) * feat: Skill runtime integration — catalog injection, tool registration, execute handler Wires the @librechat/agents SkillTool primitive into LibreChat's agent runtime: **Enums:** - Add `skills` to AgentCapabilities + defaultAgentCapabilities **Data layer:** - Add `getSkillByName(name, accessibleIds)` — compound query that combines name lookup + ACL check in one findOne **Agent initialization (packages/api/src/agents/initialize.ts):** - Accept `accessibleSkillIds` param and `listSkillsByAccess` db method - Query accessible skills, format catalog via `formatSkillCatalog()`, append to `additional_instructions` (appears in agent system prompt) - Register `SkillToolDefinition` + `createSkillTool()` when catalog is non-empty (tool appears in model's tool list) - Store `accessibleSkillIds` and `skillCount` on InitializedAgent **Execute handler (packages/api/src/agents/handlers.ts):** - Add `getSkillByName` to `ToolExecuteOptions` - `handleSkillToolCall()` intercepts `Constants.SKILL_TOOL`: extracts skillName, loads body from DB with ACL check, substitutes $ARGUMENTS, returns ToolExecuteResult with injectedMessages (skill body as isMeta user message) **Caller wiring:** - initialize.js: query skill IDs via findAccessibleResources, pass to initializeAgent + store on agentToolContexts, add getSkillByName to toolExecuteOptions, pass accessibleSkillIds through loadTools configurable - openai.js + responses.js: same pattern for their flows Requires @librechat/agents >= 3.1.65 (PR danny-avila#91 exports). * feat: Skills toggle in tools menu + backend capability gating Frontend: - Add skills?: boolean to TEphemeralAgent type - Add LAST_SKILLS_TOGGLE_ to LocalStorageKeys for persistence - Add skillsEnabled to useAgentCapabilities hook - Add skills useToolToggle to BadgeRowContext with localStorage init - New Skills.tsx badge component (Scroll icon, cyan theme, permission-gated via PermissionTypes.SKILLS) - Add skills entry to ToolsDropdown with toggle + pin - Render Skills badge in BadgeRow ephemeral section Backend: - Extract injectSkillCatalog() into packages/api/src/agents/skills.ts (reduces initializeAgent module size, reusable helper) - initializeAgent delegates to helper instead of inline block - Capability-gate the findAccessibleResources query: - Agents endpoint: checks AgentCapabilities.skills in admin config - OpenAI/Responses controllers: checks ephemeralAgent.skills toggle - ACL query runs once per run, result shared across all agents * refactor: remove createSkillTool() instance from injectSkillCatalog SkillTool is event-driven only. The tool definition in toolDefinitions is sufficient for the LLM to see the tool schema. No tool instance is needed since the host handler intercepts via ON_TOOL_EXECUTE before tool.invoke() is ever called. Removes tools from InjectSkillCatalogParams/Result, drops the createSkillTool import. * feat: skill file priming, bash tool, and invoked skills state Multi-file skill support: - New primeSkillFiles() helper (packages/api/src/agents/skillFiles.ts) uploads skill files + SKILL.md body to code execution environment - handleSkillToolCall primes files on invocation when skill.fileCount > 0, returns session info as artifact so ToolNode stores the session - Skill-primed files available to subsequent bash/code tool calls Bash tool auto-registration: - BashExecutionToolDefinition added alongside SkillToolDefinition when skills are enabled, giving the model a bash tool for running scripts Conversation state: - Add invokedSkillIds field to conversation schema (Mongoose + Zod) - handleSkillToolCall updates conversation with $addToSet on success - Enables re-priming skill files on subsequent runs (future) Dependency wiring: - Pass listSkillFiles, getStrategyFunctions, uploadCodeEnvFile, updateConversation through ToolExecuteOptions - Pass req and codeApiKey through mergedConfigurable - All three controller entry points wired (initialize.js, openai.js, responses.js) * fix: load bash_tool instance in loadToolsForExecution, remove file listing - Add createBashExecutionTool to loadToolsForExecution alongside PTC/ToolSearch pattern: loads CODE_API_KEY, creates bash tool instance on demand - Add BASH_TOOL and SKILL_TOOL to specialToolNames set so they don't go through the generic loadTools path (bash is created here, skill is intercepted in handler before tool.invoke) - Remove file name listing from skill content text — it's the skill author's responsibility to disclose files in SKILL.md, not the framework * feat: batch upload for skill files, replace sequential uploads - Add batchUploadCodeEnvFiles() to crud.js: single POST to /upload/batch with all files in one multipart request, returns shared session_id - Rewrite primeSkillFiles to collect all streams (SKILL.md + bundled files) then do one batch upload instead of N sequential uploads - Replace uploadCodeEnvFile with batchUploadCodeEnvFiles across all callers (handlers.ts, initialize.js, openai.js, responses.js) * refactor: remove invokedSkillIds from conversation schema Skills aren't re-loaded between runs, so conversation-level state for invoked skills doesn't help. Skill state will live on messages instead (like tool_search discoveredTools and summaries), enabling in-place re-injection on follow-up runs. Removes invokedSkillIds from: convo Mongoose schema, IConversation interface, Zod schema, ToolExecuteOptions.updateConversation, and all three caller wiring points. * feat: smart skill file re-priming with session freshness checking Schema: - Add codeEnvIdentifier field to ISkillFile (type + Mongoose schema) - Add updateSkillFileCodeEnvIds batch method (uses tenantSafeBulkWrite) - Export checkIfActive from Code/process.js Extraction: - Add extractInvokedSkillsFromHistory() to run.ts — scans message history for AIMessage tool_calls where name === 'skill', extracts skillName args. Follows same pattern as extractDiscoveredToolsFromHistory. Smart re-priming in primeSkillFiles: - Before batch uploading, checks if existing codeEnvIdentifiers are still active via getSessionInfo + checkIfActive (23h threshold) - If session is still active, returns cached references (zero uploads) - If stale or missing, batch-uploads everything and persists new identifiers on SkillFile documents (fire-and-forget) - Single session check covers all files (batch shares one session_id) Wiring: - Pass getSessionInfo, checkIfActive, updateSkillFileCodeEnvIds through ToolExecuteOptions and all three controller entry points * feat: wire skill file re-priming at run start via initialSessions Flow: 1. initialize.js creates primeInvokedSkills callback with all deps 2. client.js calls it with message history before createRun 3. extractInvokedSkillsFromHistory scans for skill tool calls 4. For each invoked skill with files, primeSkillFiles uploads/checks 5. Returns initialSessions map passed to createRun 6. createRun passes initialSessions to Run.create (via RunConfig) 7. Run constructor seeds Graph.sessions, making skill files available to subsequent bash/code tool calls via ToolNode session injection Requires @librechat/agents with initialSessions on RunConfig (PR danny-avila#94). * refactor: use CODE_EXECUTION_TOOLS set for code tool checks Import CODE_EXECUTION_TOOLS from @librechat/agents and replace inline constant checks in handlers.ts and callbacks.js. Fixes missing bash tool coverage in the session context injection (handlers.ts) and code output processing (callbacks.js). * refactor: move primeInvokedSkills to packages/api, add skill body re-injection Moves primeInvokedSkills from an inline closure in initialize.js (with dynamic requires) to a proper exported function in packages/api skillFiles.ts with explicit typed dependencies. Key changes: - primeInvokedSkills now returns both initialSessions (for file priming) AND injectedMessages (skill bodies for context continuity) - createRun accepts invokedSkillMessages and appends skill bodies to systemContent so the model retains skill instructions across runs - initialize.js calls the packaged function with all deps passed explicitly - client.js passes both initialSessions and injectedMessages to createRun * fix: move dynamic requires to top-level module imports Move primeInvokedSkills, getStrategyFunctions, batchUploadCodeEnvFiles, getSessionInfo, and checkIfActive from inline requires to top-level module requires where they belong. * refactor: skill body reconstruction via formatAgentMessages, not systemContent Replaces the lazy systemContent approach with proper message-level reconstruction: SDK (formatAgentMessages): - New invokedSkillBodies param (Map<string, string>) - Reconstructs HumanMessages after skill ToolMessages at the correct position in the message sequence, matching where ToolNode originally injected them LibreChat: - extractInvokedSkillsFromPayload replaces extractInvokedSkillsFromHistory (works with raw TPayload before formatAgentMessages, not BaseMessage[]) - primeInvokedSkills now takes payload instead of messages, returns skillBodies Map instead of injectedMessages - client.js calls primeInvokedSkills BEFORE formatAgentMessages, passes skillBodies through as the 4th param - Removed invokedSkillMessages from createRun (no more systemContent hack) - Single-pass: skill detection happens inside formatAgentMessages' existing tool_call processing loop, zero extra message iterations * refactor: rename skillBodies to skills for consistency with SDK param * refactor: move auth loading into primeInvokedSkills, pass loadAuthValues as dep The payload/accessibleSkillIds guard and CODE_API_KEY loading now live inside primeInvokedSkills (packages/api) rather than in the CJS caller. initialize.js passes loadAuthValues as a dependency and the callback is only created when skillsCapabilityEnabled. * feat: ReadFile tool + conditional bash registration + skill path namespacing ReadFile tool (read_file): - General-purpose file reader, event-driven (ON_TOOL_EXECUTE) - Schema: { file_path: string } — "{skillName}/{path}" convention - handleReadFileCall: resolves skill name from path, ACL check, reads from DB cache or storage, binary detection, size limits (256KB), lazy caching (512KB), line numbers in output - SKILL.md special case: reads skill.body directly - Dispatched alongside SKILL_TOOL in createToolExecuteHandler - Added to specialToolNames in ToolService Conditional tool registration: - ReadFile + SkillTool: always registered when skills enabled - BashTool: only registered when codeEnvAvailable === true - codeEnvAvailable passed through InitializeAgentParams from caller Skill file path namespacing: - primeSkillFiles now uploads as "{skillName}/SKILL.md" and "{skillName}/{relativePath}" instead of flat names - Prevents file collisions when multiple skills are invoked Wiring: - getSkillFileByPath + updateSkillFileContent passed through ToolExecuteOptions in all three callers * feat: return images/PDFs as artifacts from read_file, tighten caching Binary artifact support: - Images (png, jpeg, gif, webp) returned as base64 in artifact.content with type: 'image_url', processed by existing callback attachment flow - PDFs returned as base64 artifact similarly - Binary size limit: 10MB (MAX_BINARY_BYTES) - Other binary files still return metadata + bash fallback Caching: - Text cached only on first read (file.content == null check) - Binary flag cached only on first detection (file.isBinary == null) - Skill files are immutable; no redundant cache writes Registration: - ReadFileToolDefinition now includes responseFormat: 'content_and_artifact' * chore: update @librechat/agents to version 3.1.66-dev.0 and add peer dependencies in package-lock.json and package.json files * fix: resolve review findings #1,#2,#4,#5,#6,#10,danny-avila#13 Critical: - #1: primeInvokedSkills now accumulates files across all skills into one session entry instead of overwriting. Parallel processing via Promise.allSettled. - #2: codeEnvAvailable now computed and passed in openai.js and responses.js (was missing, bash tool never registered in those flows) Major: - #4: relativePath in updateSkillFileCodeEnvIds now strips the {skillName}/ prefix to match SkillFile documents. SKILL.md filter uses endsWith instead of exact match. - #5: File priming guarded on apiKey being non-empty (skip when not configured instead of failing with auth error) - #6: Skills processed in parallel via Promise.allSettled instead of sequential for-of loop Minor: - #10: Use top-level imports in initialize.js instead of inline requires - danny-avila#13: Log warning when skill catalog reaches the 100-skill limit * fix: resolve followup review findings N1,N2,N4 N1 (CRITICAL): Wire skill deps into responses.js non-streaming path. Was completely missing getSkillByName, file strategy functions, etc. N2 (MAJOR): Single batch upload for ALL skills' files. Resolves skills in parallel (Phase 1), then collects all file streams across skills and does ONE batchUploadCodeEnvFiles call (Phase 2). All files share one session_id, eliminating cross-session isolation issues. N4 (MINOR): Move inline require() to top-level in openai.js and responses.js, consistent with initialize.js. * fix: add mocks for new file strategy imports in controller tests * fix: restore session freshness check, parallelize file lookups, add warnings R1: Re-add session freshness check before batch upload. Checks any existing codeEnvIdentifier via getSessionInfo + checkIfActive. If the session is still active (23h window), returns cached file references with zero re-uploads. R2: listSkillFiles calls parallelized via Promise.all (were sequential in the for-of loop). R3: Log warning when skill record lookup fails during identifier persistence (was a silent empty-string fallback). * fix: guard freshness cache on single-session consistency * fix: multi-session freshness check (code env handles mixed sessions natively) The code execution environment fetches each file by its own {session_id, fileId} pair independently — no single-session requirement. Removed the sessionIds.size === 1 guard. Now checks ALL distinct sessions for freshness. If every session is still active (23h window), returns cached references with per-file session_ids preserved. If any session expired, falls through to re-upload everything in a single batch. * perf: parallelize session freshness checks via Promise.all * fix: add optional chaining for session info retrieval in primeInvokedSkills Updated the primeInvokedSkills function to use optional chaining for getSessionInfo and checkIfActive methods, ensuring safer access and preventing potential runtime errors when these methods are undefined. * fix: address review findings #1-#9 + Codex P1/P2 + session probe Critical: - #1/Codex P1: Add codeApiKey loading to openai.js and responses.js loadTools configurable (was missing, file priming broken in 2/3 paths) - Codex P1: Fix cached file name prefix in primeSkillFiles cache path (was sf.relativePath, now ${skill.name}/${sf.relativePath}) Major: - Codex P2: Honor ephemeral skills toggle in agents endpoint (check ephemeralAgent?.skills !== false alongside admin capability) - #4: Early size check using file.bytes from DB before streaming (prevents full-file buffer for oversized files) Minor: - #5: Replace Record<string, any> with Record<string, boolean | string> - #6: Localize Pin/Unpin aria-labels with com_ui_pin/com_ui_unpin - #8: Parallelize stream acquisition in primeSkillFiles via Promise.allSettled - #9: Log warning for partial batch upload failures with filenames Performance: - Session probe optimization: getSessionInfo now hits per-object endpoint (GET /sessions/{sid}/objects/{fid}) instead of listing entire session (GET /files/{sid}?detail=summary). O(1) stat vs O(N) list + linear scan. * refactor: extract shared skill wiring helper + add unit tests DRY (#3): - New skillDeps.js exports getSkillToolDeps() with all 9 skill-related deps (getSkillByName, listSkillFiles, getStrategyFunctions, etc.) - Replaces 5 identical copy-paste blocks across initialize.js, openai.js, responses.js (streaming + non-streaming paths) - One place to maintain when skill deps change Tests (#2): - 8 unit tests for extractInvokedSkillsFromPayload covering: string args, object args, missing skill tool_calls, non-assistant messages, malformed JSON, empty skillName, empty payload, dedup * fix: remove @jest/globals import, use global jest env * fix: resolve round 2 review findings R2-1 through R2-7 R2-1 (toggle semantics): openai.js + responses.js now check admin capability (AgentCapabilities.skills) alongside ephemeral toggle. Aligns with initialize.js. R2-2 (swallowed error): primeInvokedSkills now logs updateSkillFileCodeEnvIds failures (was .catch(() => {})) R2-4 (test cast): Record<string, string> → Record<string, unknown> R2-5 (DRY regression): Extract enrichWithSkillConfigurable() into skillDeps.js. Replaces 4 identical loadAuthValues blocks. Each loadTools callback is now a one-liner. JSDoc added (R2-6). R2-7 (sequential streams): primeInvokedSkills now uses Promise.allSettled for parallel stream acquisition. * fix: require explicit skills toggle + treat partial cache as miss - initialize.js: change ephemeralSkillsToggle !== false to === true (unset toggle no longer enables skills) - primeSkillFiles cache: require ALL files to have codeEnvIdentifier before using cache (partial persistence = cache miss = re-upload) - primeInvokedSkills cache: same check (allFilesWithIds.length must equal total file count) * fix: pass entity_id=skillId on batch upload, eliminates per-user cache thrashing primeSkillFiles now passes entity_id: skill._id.toString() to batchUploadCodeEnvFiles. This scopes the code env session to the skill, not the user. All users sharing a skill share the same uploaded files — no more cache thrashing from overwriting each other's codeEnvIdentifier. The stored codeEnvIdentifier now includes ?entity_id= suffix so freshness checks pass the entity_id through to the per-object stat endpoint. Both primeSkillFiles and primeInvokedSkills store consistent identifier formats. * fix: pass entity_id on multi-skill batch upload, consistent identifier format * Revert "fix: pass entity_id on multi-skill batch upload, consistent identifier format" This reverts commit c85ce21. * refactor: per-skill upload in primeInvokedSkills, eliminate multi-skill batch Replace the monolithic multi-skill batch upload with per-skill primeSkillFiles calls. Each skill gets its own session with entity_id=skillId, ensuring: - Correct session auth (entity_id matches on freshness checks) - Per-skill freshness caching (only expired skills re-upload) - Shared skill sessions work across users (same entity_id=skillId) - Code env handles mixed session_ids natively The big batch block (stream collection, single upload, identifier mapping) is replaced by a simple loop over primeSkillFiles, which already handles freshness caching, batch upload, and identifier persistence per-skill. * fix: resolve review findings #1,#3-5,#7,#9-11 Critical: - #1: Strip ?entity_id= query string before splitting codeEnvIdentifier into session_id/fileId (was corrupting cached file IDs in 4 locations) Major: - #4: Parallelize per-skill primeSkillFiles via Promise.allSettled - #5: Add logger.warn to all empty .catch(() => {}) on cache writes Minor: - #7: Add logger.debug to enrichWithSkillConfigurable catch block - #9: Use error instanceof Error guard in batchUploadCodeEnvFiles - #10: Move enrichWithSkillConfigurable to TypeScript in packages/api (skillConfigurable.ts), skillDeps.js wraps with loadAuthValues dep - #11: Reduce MAX_BINARY_BYTES from 10MB to 5MB (~11.5MB peak with b64) * fix: forward entity_id in session probe + always register bash tool Codex P2 (entity_id in probe): getSessionInfo now preserves and forwards query params (including entity_id) to the per-object stat endpoint. Without this, identifiers stored as ...?entity_id=... would fail auth checks because the entity_id scope was dropped. Codex P2 (bash tool availability): Remove codeEnvAvailable gate from injectSkillCatalog. Bash tool definition is now always registered when skills are enabled. Actual tool instance creation still happens at execution time in loadToolsForExecution (which loads per-user credentials). This ensures users with per-user CODE_API_KEY get bash without requiring a global env var at init time. Removes codeEnvAvailable from InjectSkillCatalogParams, InitializeAgentParams, and all three controller entry points. * fix: add debug logging to primeInvokedSkills catch, rename export alias * fix: stub bash tool when no key + remove PDF artifact path Codex P1 (bash tool): When CODE_API_KEY is unavailable, create a stub tool that returns "Code execution is not available. Use read_file instead." This prevents "tool not found" errors from the model repeatedly calling bash_tool in no-code-env deployments while still registering the definition for per-user credential users. Codex P2 (PDF artifacts): Remove PDF image_url artifact path. The host artifact pipeline processes image_url via saveBase64Image which fails for PDFs. PDFs now fall through to the generic binary handler ("Use bash to process"). TODO comment for future document artifact support. Also: isImageOrPdf → isImage in early size checks (PDFs are no longer treated as artifact candidates). * fix: remove dead PDF_MIME constant, hoist skillToolDeps, document session_id - #7: Remove unused PDF_MIME constant (dead code after PDF artifact removal) - #11: Hoist skillToolDeps to module-level constant (avoid per-call allocation) - #6: Document that CodeSessionContext.session_id is a representative value; ToolNode uses per-file session_id from the files array * fix: call toolEndCallback for skill/read_file artifacts + clear codeEnvIdentifier on re-upload Codex P1 (toolEndCallback bypass): skill and read_file handler branches returned early, bypassing the toolEndCallback that processes artifacts (image attachments). Now calls toolEndCallback when the result has an artifact, using the same metadata pattern as the normal tool.invoke path. Codex P1 (stale identifiers): upsertSkillFile now $unset's codeEnvIdentifier alongside content and isBinary when a file is re-uploaded. Prevents the freshness cache from returning references to old file content after a skill file is replaced. * fix: add session_id comment at cached path, rename skillResult to handlerResult * fix: return content_and_artifact from bash stub so result.content is populated * fix: deterministic skill lookup, dedup warning, and multi-session freshness check - getSkillByName: add sort({updatedAt:-1}) so name collisions resolve deterministically to the most recently updated skill - injectSkillCatalog: warn when multiple accessible skills share a name - primeSkillFiles: check ALL distinct sessions for freshness, not just the first file's session, preventing stale refs after partial bulkWrite * refactor: update icon import in Skills component - Replaced the Scroll icon with ScrollText in the Skills component for improved clarity and consistency in the UI. * fix: SKILL.md cache parity, gate bash_tool on code env, fix read_file too-large message - primeSkillFiles: filter SKILL.md from returned files array on fresh upload so cached and non-cached paths return identical file sets (SKILL.md is still on disk in the session for bash access) - injectSkillCatalog: only register bash_tool when codeEnvAvailable is true; thread the flag from all three CJS callers via execute_code capability check - handleReadFileCall: tell the model to invoke the skill first before suggesting /mnt/data paths for oversized files * fix: use EnvVar constant, deduplicate auth lookup, validate batch upload, stream byte limit - Replace hardcoded 'LIBRECHAT_CODE_API_KEY' with EnvVar.CODE_API_KEY in skillConfigurable.ts and skillFiles.ts - Resolve code API key once at run start in initialize.js and pass to both primeInvokedSkills and enrichWithSkillConfigurable via optional preResolvedCodeApiKey param, eliminating redundant loadAuthValues calls - Add response structure validation in batchUploadCodeEnvFiles before accessing session_id/files to surface unexpected responses early - Add streaming byte counter in handleReadFileCall that aborts and destroys the stream when accumulated bytes exceed MAX_BINARY_BYTES, preventing full file buffering when DB metadata is inaccurate * refactor: update icon import in ToolsDropdown component - Replaced the Scroll icon with ScrollText in the ToolsDropdown component for improved clarity and consistency in the UI. * fix: partial upload failure detection, EnvVar in initialize.js, declaration ordering - primeSkillFiles: return null (failure) when batch upload partially succeeds — missing bundled files would cause runtime bash/read failures with missing paths in code env - initialize.js: replace hardcoded 'LIBRECHAT_CODE_API_KEY' with EnvVar.CODE_API_KEY imported from @librechat/agents - initialize.js: move enabledCapabilities, accessibleSkillIds, and codeApiKey declarations before the toolExecuteOptions closure that references them (eliminates reliance on temporal dead zone hoisting)
peeeteeer
pushed a commit
that referenced
this pull request
May 6, 2026
…efaults (danny-avila#12692) * feat: per-user skill active/inactive toggle with ownership-aware defaults - Add `skillStates` map (Record<string, boolean>) to user schema for per-user active/inactive overrides on skills - Add `defaultActiveOnShare` to interface.skills config (default: false) so admins can control whether shared skills auto-activate - Add GET/POST /api/user/settings/skills/active endpoints with validation - Add React Query hooks with optimistic mutations for skill states - Add useSkillActiveState hook with ownership-aware resolution: owned skills default active, shared skills default inactive - Add toggle switch UI to SkillListItem and SkillDetail components - Filter inactive skills in injectSkillCatalog before agent injection - Add localization keys for active/inactive labels * fix: use Record instead of Map for IUser.skillStates Mongoose .lean() flattens Map to a plain object, causing type incompatibility with IUser in methods that return lean documents. * fix: address review findings for skill active states - Fail-closed when userId is absent: filter rejects all shared skills instead of passing them through unfiltered (Codex P1) - Validate Mongoose Map key characters (reject . and $) in controller to return 400 instead of a 500 from schema validation (Codex P2) - Block toggle while initial skill states query is loading to prevent overwriting server-side overrides with an empty snapshot (Codex P2) - Extract shared SkillToggle component, eliminating duplicate toggle markup in SkillListItem and SkillDetail (Finding #3) - Move skill state query/mutation hooks from Favorites.ts to Skills/queries.ts per feature-directory convention (Finding #4) - Fix hardcoded English aria-label in SkillListItem by passing the localized string from the parent SkillList (Finding #5) - Fix inline arrow in SkillList render loop: pass stable callback reference so SkillListItem memo() is not invalidated (Finding #1) - Extract toRecord() helper in controller to DRY the Map-to-Object conversion (Finding #6) - Remove Promise.resolve wrapping synchronous config read (Finding #8) - Remove unused TUpdateSkillStatesRequest type (Finding danny-avila#12) * fix: forward tabIndex on SkillToggle to preserve list keyboard nav The original inline toggle had tabIndex={-1} so the row itself remained the sole tab target. The extraction into SkillToggle dropped this prop, making every list toggle a tab stop. Add an optional tabIndex prop and pass -1 from SkillListItem. * fix: plumb skillStates to all agent entry points, isolate toggle keydown - Add skillStates/defaultActiveOnShare loading to openai.js and responses.js controllers so shared-skill activation is respected across all agent entry points, not just initialize.js (Codex P1) - Stop keydown propagation on SkillToggle so Enter/Space does not bubble to the parent row's navigation handler (Codex P2) * fix: paginate catalog fetch and serialize toggle writes - Paginate listSkillsByAccess (up to 10 pages of 100) until the active catalog quota is filled, so inactive shared skills in recent positions do not starve active owned skills past the first page (Codex P1) - Extend listSkillsByAccess interface with cursor/has_more/after for catalog pagination - Serialize skill-state writes via a ref queue: one in-flight request at a time, with the latest desired state sent when the previous one settles. Prevents last-response-wins races where an older request overwrites newer toggles (Codex P2) * fix: share write queue across hook instances, block toggle on fetch error - Move the write queue from a per-instance useRef to a module-scoped object so every mount of useSkillActiveState (SkillList, SkillDetail, etc.) serializes against the same in-flight slot. Prior per-instance queues allowed two components to race full-map POSTs (Codex P1) - Extend the toggle guard beyond isLoading: also block when isError is true or data is undefined. Prevents a failed GET from seeding a toggle with an empty baseline that would wipe server-side overrides on the next successful POST (Codex P1) * fix: stale closure, orphan cleanup, and cap-error UX - Read toggle baseline from React Query cache via queryClient.getQueryData instead of the captured skillStates closure. The closure can be stale between onMutate's setQueryData and the next render, so rapid successive toggles would build on old state and drop earlier changes (Codex P1) - Surface the MAX_SKILL_STATES_EXCEEDED error code with a specific toast key (com_ui_skill_states_limit) so users understand the 200-cap rather than seeing a generic error - Prune orphaned entries (skillIds whose Skill doc no longer exists) on both GET and POST in SkillStatesController. Self-heals over time without needing cascade-delete hooks or a migration job. Uses one indexed Skill._id query per request * test: pin skill active-state precedence with unit tests Extract the active-state resolution logic from a closure inside injectSkillCatalog into an exported resolveSkillActive helper, then cover every branch of the precedence matrix: - Fails closed when userId is absent (even with defaultActiveOnShare=true) - Explicit override wins over ownership and config (both true and false) - Owned skills default to active when no override is set - Shared skills default to defaultActiveOnShare value - Undefined skillStates behaves identically to an empty object - defaultActiveOnShare defaults to false when omitted - Owned skills ignore defaultActiveOnShare entirely Closes Finding #2 from the pre-rebase comprehensive review. Mirrors the existing scopeSkillIds test style; injectSkillCatalog now calls resolveSkillActive instead of inlining the closure. * refactor: limit skill active toggle to detail header, drop label - Remove the per-row toggle from SkillListItem and the active-state plumbing (hook call, isSkillEnabled/onToggleEnabled/toggleAriaLabel props) from SkillList. The detail view is now the single place to change a skill's active state - Drop dim/muted styling for inactive skills in the sidebar: without a control there, the visual indication has nowhere to land - Resize SkillToggle to match neighbor buttons: outer h-9 container, h-6 w-11 track with size-5 knob, no label span. The 'Active' / 'Inactive' text that accompanied the detail-view toggle is removed - Remove the now-unused label prop and tabIndex prop (the tabIndex existed only for the list-row context) from SkillToggle. Drop the onKeyDown stopPropagation for the same reason - Remove now-orphaned com_ui_skill_active / com_ui_skill_inactive translation keys * style: shrink SkillToggle track to h-5 w-9 with size-4 knob Container stays at h-9 to match neighbor button heights. The toggle track itself drops from h-6 w-11 to h-5 w-9, with a size-4 knob travelling 1.125rem on activation. Visually lighter inside the row. * fix: remove redundant skillStates entries that match the resolved default When a toggle lands on the ownership/config default, delete the key from the map instead of persisting `{id: defaultValue}`. Without this, a user toggling a skill off and back on would leave `{id: true}` for an owned skill (whose default is already true), silently consuming a slot against the 200-entry cap. Repeated round-trip toggles could exhaust the quota with zero meaningful overrides (Codex P2). Preserves the exceptions-list invariant that the runtime-resolution design depends on. * fix: prune before enforcing skill-state cap; reject non-ObjectId keys Reorder the update controller so pruneOrphans runs before the 200-cap check. Without this, a user near the cap with some orphaned entries (skills deleted since their last GET) could send a payload that would pass after pruning but gets rejected by the raw-size check first. Add a sanity cap on raw payload size (2 * MAX_SKILL_STATES) so abusive inputs do not reach the DB query, and enforce the real cap on the pruned result instead. Harden pruneOrphans: the earlier early-return path could pass non-ObjectId keys through unchanged. Now only valid ObjectIds are returned, and the Skill-model-unavailable fallback filters by format. Also add isValidObjectIdString validation at the input boundary so malformed (but otherwise non-Mongo-unsafe) keys never reach persistence (Codex P2 x2). * fix: enforce active filter at execute time, prune revoked shares, scope queue per user P1: injectSkillCatalog now returns activeSkillIds (the filtered set that appears in the catalog). initializeAgent uses that set as the stored accessibleSkillIds on the initialized agent, so getSkillByName at runtime cannot resolve a deactivated skill — even if the LLM hallucinates a name or the user invokes by direct-invocation shorthand. Previously the executor authorized against the full ACL set, bypassing the active-state guarantee (Codex P1). P2: pruneOrphans now checks user access via findAccessibleResources in addition to skill existence. When a share is revoked, the user's skillStates entry for that skill had no cleanup path and silently consumed the 200-cap. Self-heals on both GET and POST. One extra ACL query per settings read/write; scoped to a single user so no N-user amplification (Codex P2). P2: the write queue moves from a single module-scoped object to a Map keyed by userId. Logout/login in the same tab can no longer flush the previous user's pending snapshot under the new session's auth. Each userId gets its own pending/inFlight slot; the in-flight request retains its original auth via the cookie already attached when sent, so the race window closes (Codex P2). * refactor: extract skillStates helpers to packages/api; add tests; polish Address the remaining valid findings from the comprehensive review: - Extract toRecord, loadSkillStates, validateSkillStatesPayload, and pruneOrphanSkillStates into packages/api/src/skills/skillStates.ts as TypeScript. The controller in /api shrinks to a ~90-line thin wrapper that builds live dependency adapters for Mongoose + the permission service (Review #2 DRY, #3 workspace boundary) - Replace the triplicated 12-line skillStates loading block in initialize.js, openai.js, and responses.js with a single call to loadSkillStates from @librechat/api. One helper, three sites - Swap console.error for the project logger in the controller (Review #7) - Remove the redundant INVALID_KEY_PATTERN regex: a valid ObjectId cannot contain . or $, so isValidObjectIdString already covers it (Review #11) - Parameterize the 200-cap error toast with {{0}} interpolation driven by the error response's `limit` field, so future changes to MAX_SKILL_STATES update the UI message automatically (Review danny-avila#12) - Add 24 unit tests for the new skillStates helpers (toRecord, resolveDefaultActiveOnShare, loadSkillStates, validateSkillStates- Payload, pruneOrphanSkillStates) covering success paths, malformed input, cap boundaries, and parallel-query behavior (Review #4) - Add 10 tests for injectSkillCatalog pagination covering empty accessible set, missing listSkillsByAccess, single-page filter, owned-vs-shared defaults, explicit-override precedence, multi-page collection, MAX_CATALOG_PAGES safety cap, early termination on has_more=false, additional_instructions injection, and fail-closed without userId (Review #5) Total test count: 60 (was 26 on this surface). * fix: rename skillStates ValidationError to avoid barrel-export collision packages/api/src/types/error.ts already exports a ValidationError (MongooseError extension). Re-exporting a different shape from skills/skillStates.ts through the skills barrel caused TS2308 in CI because the root index re-exports both. Rename to SkillStatesValidationError to keep the exports disjoint. * refactor: tighten tests and absorb caller guard into loadSkillStates Address the followup review findings: - Add optional `accessibleSkillIds` param to loadSkillStates so the helper short-circuits to defaults when no skills are accessible. All three controllers drop the residual 7-line conditional wrapper in favor of a single destructured call (Review #2) - Remove the unreachable `typeof key !== 'string'` check from validateSkillStatesPayload: Object.entries always yields string keys per the JS spec (Review #3) - Replace the two `as unknown as` agent casts in the injectSkillCatalog tests with a `makeAgent()` factory typed directly as the function's parameter shape (Review #4) - Tighten the MAX_CATALOG_PAGES assertion from `toBeLessThanOrEqual(11)` to `toHaveBeenCalledTimes(10)` — the loop deterministically makes exactly 10 page fetches before hitting the cap (Review #1) - Rewrite the parallel-execution test for pruneOrphanSkillStates using deferred promises instead of microtask-order assertions. The test now inspects `toHaveBeenCalledTimes(1)` on both mocks after a single Promise.resolve() yield, pinning Promise.all usage without relying on push-order into a shared array (Review #5) - Evict stale writeQueue entries on user change via a module-scoped `lastSeenUserId` sentinel. When a different user's toggle is the first one after a logout/login, the previous user's queue entry is deleted. Keeps the Map bounded without adding hook-instance effect cleanup (Review #6) * fix(test): mock loadSkillStates in openai and responses controller specs The prior refactor replaced the inline 12-line skillStates loading block with a call to loadSkillStates from @librechat/api. Both controller spec files mock @librechat/api as a flat object, so any new named import from that package is undefined in the test env. Calling `await loadSkillStates(...)` threw before recordCollectedUsage ran, surfacing as "undefined is not iterable" on the test's array destructure of `mockRecordCollectedUsage.mock.calls[0]`. Add the missing mock to both spec files alongside the existing scopeSkillIds stub. * fix: abandon stale skillStates write queues on user switch Close the cross-session leak window where an in-flight flush loop still holds a reference to a previous user's queue: it could fire its next mutateAsync under the new session's auth cookies and persist the stale snapshot to the new user's document (Codex P1). Add an `abandoned` flag on `WriteQueue`. Three mechanisms cooperate: - `getWriteQueue` marks every non-active queue abandoned when the user differs from the last-seen identity (pre-existing eviction site, now more aggressive). - A `useEffect` on `userId` calls the same abandonment pass on every render with a new active identity, covering the window between logout/login and the new user's first toggle (when `getWriteQueue` would otherwise not fire). - The flush loop checks `!queue.abandoned` in its while condition so the second and later iterations exit without firing another `mutateAsync` after the session changes. The first iteration's in-flight request (already dispatched under the original user's cookies) still runs to completion or failure on its own — only the subsequent iterations, which are the dangerous ones, are blocked.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR implements a Deep Research Mode feature for LibreChat agents, enabling comprehensive research capabilities with visible reasoning and thought processes.
Key Changes:
DeepResearchtoggle component in the chat interfaceMotivation:
This feature addresses the need for more thorough, well-researched responses by encouraging agents to:
Commits:
Change Type
Testing
Test Configuration:
Environment:
Test Steps:
Expected Results:
Checklist
Marouen Mhiri marouen.mhiri@exxeta.com on behalf of Daimler Truck AG.
Provider & Legal Notice | Daimler Truck
Copyright (c) {2025} Daimler Truck AG