Skip to content

fix(ingestion): classify request-like client calls#1744

Open
JaysonAlbert wants to merge 7 commits into
abhigyanpatwari:mainfrom
JaysonAlbert:codex/request-like-clients-refresh
Open

fix(ingestion): classify request-like client calls#1744
JaysonAlbert wants to merge 7 commits into
abhigyanpatwari:mainfrom
JaysonAlbert:codex/request-like-clients-refresh

Conversation

@JaysonAlbert
Copy link
Copy Markdown
Contributor

Background

This refreshes the still-useful part of closed, unmerged PR #698 for issue #692 against current main. Current main prevents common request-like member calls from becoming provider routes, but imported request.get/post/put/delete/patch(...) calls still do not become consumer FETCHES edges.

Changes

  • Add shared request-like import/member-call detection for umi-request and local request helpers.
  • Wire the helper into both worker parsing and sequential fallback fetch-call extraction.
  • Add regression fixture coverage proving imported request.get('/api/items') creates FETCHES while request.post('/api/client-only') does not create Route / HANDLES_ROUTE provider artifacts.

Validation

  • cd gitnexus && npx vitest run test/integration/resolvers/express-routes.test.ts
  • cd gitnexus && npx tsc --noEmit
  • cd gitnexus && npm run format:check
  • node gitnexus/dist/cli/index.js analyze --force --index-only --name request-like-clients-refresh .
  • node gitnexus/dist/cli/index.js detect-changes --repo request-like-clients-refresh (medium risk; expected ingestion fetch-call symbols)

Notes

Issue #692 is already closed, but the full consumer-link behavior from #698 was not present on current main.

@vercel
Copy link
Copy Markdown

vercel Bot commented May 21, 2026

Someone is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 21, 2026

CI Report

All checks passed

Pipeline Status

Stage Status Details
✅ Typecheck success tsc --noEmit
✅ Tests success unit tests, 3 platforms
✅ E2E success gitnexus-web changes only

Test Results

Tests Passed Failed Skipped Duration
10071 10066 0 5 640s

✅ All 10066 tests passed

5 test(s) skipped — expand for details
  • COBOL pipeline benchmark > scales with file count
  • PHP pipeline benchmark > scales with file count (workers enabled)
  • Ruby pipeline benchmark > scales with file count (workers enabled)
  • Rust pipeline benchmark > scales with file count (workers enabled)
  • buildTypeEnv > known limitations (documented skip tests) > Ruby block parameter: users.each { |user| } — closure param inference, different feature

Code Coverage

Tests

Metric Coverage Covered Base Delta Status
Statements 79.8% 35460/44435 79.48% 📈 +0.3 🟢 ███████████████░░░░░
Branches 68.36% 22608/33068 68% 📈 +0.4 🟢 █████████████░░░░░░░
Functions 84.83% 3664/4319 84.56% 📈 +0.3 🟢 ████████████████░░░░
Lines 83.35% 31948/38327 82.97% 📈 +0.4 🟢 ████████████████░░░░

📋 View full run · Generated by CI

@JaysonAlbert JaysonAlbert force-pushed the codex/request-like-clients-refresh branch from 5a7f89c to 8af46f3 Compare May 29, 2026 02:33
@JaysonAlbert JaysonAlbert force-pushed the codex/request-like-clients-refresh branch from 8af46f3 to 6656f35 Compare May 29, 2026 03:06
@JaysonAlbert JaysonAlbert marked this pull request as ready for review May 29, 2026 04:02
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 29, 2026

Claude finished @magyargergo's task in 9m 9s —— View job


Deep Production-Readiness Review — PR #1744

Tasks:

  • Read repo docs (DoD.md, AGENTS.md, GUARDRAILS.md)
  • Read all changed files in full
  • Read related source files (call-processor, parse-worker, tree-sitter-queries, pipeline, routes)
  • Run diff and hygiene checks
  • Reconstruct data flow from tree-sitter to FETCHES/Route/HANDLES_ROUTE
  • Assess worker vs sequential fallback parity
  • Assess test coverage
  • Assess performance overhead
  • Run hidden Unicode / bidi checks
  • Full review drafted

Review bar for this PR

Reviewed against DoD.md (v2.0.0), AGENTS.md, and the PR-specific DoD in the trigger. Minimum bar: correctness end-to-end on the real runtime path; architecture-safe; tests that fail loudly on regression; no unnecessary scans or hot-path overhead; no unrelated churn; no dead code that hides missing coverage.


Problem being solved

Issue #692: request.get('/api/users') from an imported umi-request or local ./request helper was not being emitted as a consumer FETCHES edge. The PR claims current main already prevents these calls from becoming provider Route/HANDLES_ROUTE artifacts (via the existing HTTP_CLIENT_RECEIVERS set in parse-worker.ts:913-932), but the consumer side was missing.


Current PR state

Open, ready for review, base abhigyanpatwari:main, head JaysonAlbert:codex/request-like-clients-refresh. Three commits: original fix 6656f35 plus two merge-from-main commits 7a3e8ca and c23a93d. CI bot reports all checks passed (10059 passed, 0 failed). The diff is 7 TypeScript files, +163/−3.


Merge status and mergeability

CI is green per the GitHub Actions bot. The PR branch is clean — git diff HEAD~3..HEAD --name-only confirms the only changed files are the 7 declared in the PR. No package-lock changes, no generated artifacts, no workflow files. Merge-from-main commits introduce Go builtInNames (d71fd16) and Spring interface route fix (7dae4fc) — neither touches the ingestion fetch path.


Repository history considered


Branch hygiene assessment

Pass with one concern. The two merge-from-main commits are harmless and scoped to Go and Spring, not to fetch-call extraction. The only out-of-scope change is cli-e2e.test.ts:319 (timeout 30000→60000), which is explained in inline comments at lines 311–317: the test-level vitest timeout must exceed the subprocess timeout to allow the slow-CI tolerance branch to activate. The justification is sound, but the change sits in a fix commit that is titled "classify request-like client calls" with no mention of it in the PR description. Technically a branch hygiene concern per DoD §3 (scope match); practically harmless.


Understanding of the change

Core change (request-like-clients.ts, +104 lines):
New helper with three pure-function exports consumed by both the worker and sequential paths:

  • collectRequestLikeImportBindings(content) — scans raw source text with a single import regex, returns a Set<string> of local binding names from umi-request or local ./request imports.
  • getRequestLikeMemberCallUrl(captureMap, bindings) — checks the receiver of an http_client or express_route capture against the binding set, validates method and URL prefix, and returns the URL string or null.

Wiring (call-processor.ts:3542 + parse-worker.ts:1099):
Both paths call collectRequestLikeImportBindings(file.content) per file, then call getRequestLikeMemberCallUrl inside the http_client capture branch. When a request-like URL is found, a fetchCalls entry is emitted. These are later consumed by processNextjsFetchRoutes in call-processor.ts:3437, which creates a FETCHES edge only if a matching Route node exists in the route registry.

Data flow confirmed correct for the primary case:
import request from './request'request.get('/api/items')http_client tree-sitter capture → collectRequestLikeImportBindings finds requestgetRequestLikeMemberCallUrl validates receiver+method+URL → ExtractedFetchCall pushed → processNextjsFetchRoutes matches against Route /api/items (which exists from app.js) → FETCHES edge created.

Route suppression confirmed: request.post('/api/client-only') captured as express_route hits the existing HTTP_CLIENT_RECEIVERS.has('request') check at parse-worker.ts:1369 and is skipped. No Route/HANDLES_ROUTE artifact created.


Findings

Finding 1 — Language-agnostic import regex scan (DoD §2.6 violation)

Risk: collectRequestLikeImportBindings(file.content) is called at parse-worker.ts:1099 and call-processor.ts:3542 for every parseable language — including Java, Python, Go, Rust, Ruby, PHP, C#, etc. The regex /import\s+([\s\S]*?)\s+from\s+['"]([^'"]+)['"]/g is ES-module-specific syntax that will never match in Java/Python/etc. files. For large repos where non-JS/TS files dominate, this is unnecessary regex scanning on every file content.

Evidence found: parse-worker.ts:1098-1099getProvider(language) is called to get the provider for any language, then immediately collectRequestLikeImportBindings(file.content) is called unconditionally. No language check precedes it. The existing code at parse-worker.ts:2036 already patterns-gates some JS/TS-only logic: if (language === SupportedLanguages.TypeScript || language === SupportedLanguages.JavaScript). The import scan should follow the same pattern.

Recommended fix: Gate the call to JS/TS/Vue-like languages, e.g.:

const JS_TS_LANGS = new Set([SupportedLanguages.TypeScript, SupportedLanguages.JavaScript, SupportedLanguages.Vue]);
const requestLikeBindings = JS_TS_LANGS.has(language)
  ? collectRequestLikeImportBindings(file.content)
  : new Set<string>();

Or move the empty-set result upstream before the query loop. A similar gate should be applied in call-processor.ts:3542.

Blocks merge: maybe — DoD §2.6 is explicit: "No repeated avoidable work, unnecessary scans." For large repos with mostly Java/Go/Python files this adds measurable overhead. Not catastrophic (fast regex no-ops), but a clear DoD violation. Fix this →


Finding 2 — HTTP_CLIENT_ONLY Set instantiated inside the match loop

Risk: call-processor.ts:3560 creates new Set(['head', 'options', 'request', 'ajax']) on every iteration of the matches loop. In a file with many HTTP client captures, this creates a new Set per match.

Evidence found: call-processor.ts:3560 — the Set is declared inside the for (const match of matches) loop body, inside the else if (captureMap['http_client'] && captureMap['http_client.url']) branch. The same method set already exists at module scope in parse-worker.ts:910 as HTTP_CLIENT_ONLY_METHODS.

Recommended fix: Move the declaration outside the file loop, or reuse the exported REQUEST_LIKE_MEMBER_METHODS from request-like-clients.ts directly. The parity with parse-worker's module-scoped HTTP_CLIENT_ONLY_METHODS constant is broken by this placement. Fix this →

Blocks merge: no — minor inefficiency only.


Finding 3 — nodeTextStartsWithPath backtick check is dead code

Risk: request-like-clients.ts:87 checks node.text.startsWith('/'). The http_client.urltree-sitter capture uses(string (string_fragment) @http_client.url)— confirmed attree-sitter-queries.ts:255. The string_fragmentnode type contains the raw text content **without** surrounding quote characters. Sonode.textfor'/api/items'is/api/items, and for `` /api/items`` the template literal is not captured by this query at all (only(string ...)patterns are captured, not(template_string)`). The backtick check can never be true.

Evidence found: tree-sitter-queries.ts:250-255 — TypeScript http_client query only captures (string (string_fragment) @http_client.url). request-like-clients.ts:87node.text.startsWith('/')`.

Recommended fix: Remove the backtick branch. If template literal URL support is desired, it requires a separate tree-sitter capture and extraction path (e.g., mirroring route.template_url in the route.fetch query at tree-sitter-queries.ts:241-242). Fix this →

Blocks merge: no — dead code, no production impact. But it signals incomplete validation against actual tree-sitter capture types.


Finding 4 — express_route fallback in getRequestLikeMemberCallUrl is dead code in current calling context

Risk: request-like-clients.ts:94-97getRequestLikeMemberCallUrl reads captureMap['http_client'] ?? captureMap['express_route'] for the call node, and captureMap['http_client.url'] ?? captureMap['express_route.path'] for the URL node. The function is called only from inside if (captureMap['http_client'] && captureMap['http_client.url']) blocks in both call-processor.ts:3557 and parse-worker.ts:1304. These guards guarantee captureMap['http_client'] and captureMap['http_client.url'] are always truthy when the function is called. The express_route fallbacks are never reached.

Evidence found: call-processor.ts:3557-3561 and parse-worker.ts:1304-1307 — both check captureMap['http_client'] && before calling getRequestLikeMemberCallUrl.

Recommended fix: Remove the ?? captureMap['express_route'] and ?? captureMap['express_route.path'] fallbacks, or document why they exist (e.g., if the function is intended for future multi-capture use). Fix this →

Blocks merge: no — dead code only.


Finding 5 — Method coverage gap in integration test (DoD §2.7)

Risk: REQUEST_LIKE_MEMBER_METHODS includes 9 methods: get, post, put, patch, delete, head, options, request, ajax. The fixture client.ts only has request.get('/api/items') and request.post('/api/client-only'). The integration test at express-routes.test.ts:60-67 only asserts that request.get('/api/items') creates a FETCHES edge. There is no assertion that request.post, request.put, request.patch, or request.delete create consumer edges. The "does not create provider route artifacts" test checks routes.not.toContain('/api/client-only') — this is a negative assertion that passes even if the code fails to emit the ExtractedFetchCall for POST.

Evidence found: express-routes.test.ts:60-67 — only GET is tested as a FETCHES source. request-like-clients.ts:5-15 — 9 methods in the set. client.ts:7-9request.post('/api/client-only') exists but is not asserted as producing a FETCHES edge. DoD §2.7: "Fixtures are realistic enough for the risk of the change."

Recommended fix: Either (a) add a test asserting request.post produces a FETCHES edge (which requires a matching Route in the fixture, e.g., add a POST endpoint to server.ts), or (b) document the accepted scope as GET-only and limit REQUEST_LIKE_MEMBER_METHODS accordingly. At minimum the test names should reflect what is and is not covered.

Blocks merge: maybe — DoD §2.7 requires tests that "cover the real changed path" and would "fail if behavior, wiring, or contracts were broken." A test covering only GET means a regression in POST/PUT/etc. FETCHES creation would go undetected.


Finding 6 — CLI E2E timeout change is unrelated to the fix (branch hygiene)

Risk: cli-e2e.test.ts:319 changes runCli('analyze', MINI_REPO, 30000) to 60000. This change appears in commit 6656f35 ("fix(ingestion): classify request-like client calls") and is not mentioned in the PR description.

Evidence found: cli-e2e.test.ts:311-319 — the inline comment does justify the change (vitest test-level timeout must exceed subprocess timeout to allow the slow-CI tolerance branch to fire). The justification is technically sound. However, it is scope-creep relative to the PR title and should appear in the PR description.

Recommended fix: Add a sentence to the PR description explaining why the timeout change is included. If it cannot be tied to this PR's ingestion fix, split to a separate commit or PR.

Blocks merge: no — justified by the inline comment.


Finding 7 — u2014 encoding in parse-worker.ts (pre-existing, confirmed not introduced by this PR)

Risk: parse-worker.ts:912 and parse-worker.ts:1370 contain the literal text u2014 instead of the em-dash character . These are in comments only.

Evidence found: git diff HEAD~3..HEAD -- gitnexus/src/core/ingestion/workers/parse-worker.ts — the PR's diff does not touch lines 912 or 1370. Pre-existing issue. Not introduced by this PR.

Recommended fix: Out of scope for this PR. Note for a follow-up.

Blocks merge: no


PR-specific assessment sections

A. JS/TS request-like binding detection

collectRequestLikeImportBindings is correct for ES module default imports (import request from './request'), namespace imports (import * as request from 'umi-request'), named imports with aliases (import { extend as request } from 'umi-request'), and combined default+named (import request, { get } from './request'). The type-item replace(/^type\s+/, '') stripping handles TS named type imports correctly. The regex uses [\s\S]*? (non-greedy multiline) which handles multi-line import clauses safely. One unverified risk: import type request from './request' (type-only default import of a non-value) would still add request to bindings, potentially creating false FETCHES for type-only imports. This is an edge case unlikely to appear in real HTTP client code.

isRequestLikeImportSource correctly matches umi-request exactly, and any local path whose basename (after stripping extension) is request. The basename approach is intentionally broad — any ./request, ../utils/request, ../../api/request.ts would match. This is an accepted design tradeoff (documented as "local request helpers") but could false-positive on non-HTTP modules named request (e.g., an Express req wrapper, a request-queue module). No test covers this case.

B. Member-call receiver and URL extraction

getMemberCallReceiver at request-like-clients.ts:81-84 navigates one level: callNode → childForFieldName('function') → childForFieldName('object'). This correctly handles request.get(...) but not chains like this.request.get(...) or getRequest().get(...). The PR scope explicitly targets direct request.get calls; chains are out of scope. This is acceptable given the stated fix.

URL handling: http_client.url captures string_fragment nodes (content without quotes), so node.text is already the clean path string. nodeTextStartsWithPath correctly tests startsWith('/') for absolute paths. The backtick check at line 87 is dead (see Finding 3). Template literal URLs are not captured by the current query — a limitation not documented in the helper.

Line numbers: captureMap['http_client'].startPosition.row is used in both paths. In parse-worker this includes + lineOffset for Vue SFC offsets; in call-processor the sequential fallback does not have a lineOffset (no Vue preprocessing in that path). This is pre-existing design, not introduced here.

C. Consumer/provider graph correctness

Confirmed:

  • request.get('/api/items')FETCHES edge to Route /api/items ✅ (tested)
  • request.post('/api/client-only') → no Route node, no HANDLES_ROUTE edge ✅ (tested)
  • Real router.get('/api/health') in server.tsRoute + HANDLES_ROUTE remain ✅ (existing tests pass)

Mechanism for route suppression: The existing express_route branch in parse-worker.ts:1324-1387 checks HTTP_CLIENT_RECEIVERS.has(receiverText) at line 1369, and request is explicitly in that set (line 915). This is the pre-existing guard from PR #780, not changed by this PR. This remains correct.

Not explicitly tested: that other methods (PUT, PATCH, DELETE) from request.X('/path') do not create HANDLES_ROUTE edges from client.ts — only the generic handlesEdges.find(e => e.sourceFilePath.includes('client.ts')) covers this broadly, which is acceptable.

D. Worker vs sequential fallback parity

Confirmed symmetric. Both paths:

  1. Call collectRequestLikeImportBindings(file.content) per file
  2. Call getRequestLikeMemberCallUrl(captureMap, requestLikeBindings) inside the http_client capture branch
  3. On match: push { filePath, fetchURL: requestLikeUrl, lineNumber: captureMap['http_client'].startPosition.row [+ lineOffset] }
  4. Fall through to HTTP_CLIENT_ONLY_METHODS check on non-match

The only asymmetry is the + lineOffset in parse-worker (for Vue SFC line adjustment), which is pre-existing and correct. No parity gap introduced.

E. Tests and fixture realism

The integration test at express-routes.test.ts:60-77 covers:

  • request.get('/api/items') creates FETCHES
  • /api/client-only has no Route node
  • ✅ No HANDLES_ROUTE edge from client.ts
  • request.post (or any other method) is not asserted to create FETCHES
  • ❌ No test for umi-request named/namespace import patterns
  • ❌ No test for a non-HTTP local request module (false-positive scenario)

The test would fail if the helper were removed from either path, because:

  • The GET method is NOT in HTTP_CLIENT_ONLY_METHODS (['head', 'options', 'request', 'ajax']), so without the new helper, request.get('/api/items') would silently produce no FETCHES edge. The test at line 66 (expect(clientFetch).toBeDefined()) would fail. Worker path coverage: confirmed via runPipelineFromRepo which uses the full pipeline.

However, the test only uses runPipelineFromRepo which runs sequentially (workers disabled in test environment). There is no test explicitly exercising the worker path in isolation. The worker and fallback code are symmetric (Finding D), which mitigates this, but the DoD §2.7 requirement for tests that "cover the real changed path" could be interpreted as requiring worker-enabled test coverage too.

F. Performance and large-repo behavior

Two concerns:

  1. Per-file regex scan for all languages (Finding 1): The regex /import\s+([\s\S]*?)\s+from\s+['"]([^'"]+)['"]/g does not match Java/Python/Go imports, so execution is fast (quick fail). For very large files (C++, Java), the scan still processes the entire content. In a 10k-file mixed-language repo, this adds unnecessary O(n) work per file.
  2. Set instantiation inside match loop (Finding 2): New Set on every match in call-processor's fallback path.

Neither is catastrophic, but Finding 1 is a clear DoD §2.6 violation.

G. CLI E2E timeout change

See Finding 6. The change is justified by the inline comment. Not a blocking concern, but should be mentioned in the PR description.

H. CI/check annotations

The CI report mentions 10 annotations including "ESLint --fix (removes unused imports)" and "Forbidden non-null assertion." These are reported in the PR Autofix check, not in the test results. The ESLint autofix annotations likely apply to files changed by the merge-from-main commits (Spring/Java extractor work in http-patterns/java.ts), not to the core fix files. The changed files in this PR's fix commit (request-like-clients.ts, call-processor.ts, parse-worker.ts) do not contain non-null assertions in the new code. Unverified: whether any annotation applies specifically to the new request-like-clients.ts — no ! operator appears in that file.

I. Hidden Unicode and hygiene

  • Bidi/zero-width scan: git diff HEAD~3..HEAD --name-only | xargs grep -P '[bidi range]'No bidi/zero-width characters found. Clean.
  • Non-ASCII in changed core files: request-like-clients.ts, call-processor.ts, parse-worker.ts — the only non-ASCII characters found in the parse-worker are pre-existing visible em-dashes (—) in comments (lines 9, 35, 42, 48, etc.) and the pre-existing u2014 literal text (Finding 7). None introduced by this PR's diff.
  • No commented-out code, TODO placeholders, or generated artifacts in the diff.
  • No dependency/package-lock changes.

Cross-PR/history alignment


Back-and-forth avoided by verifying

  • u2014 is pre-existing, not introduced by this PR — confirmed via git diff HEAD~3..HEAD
  • The http_client query captures string_fragment (no quotes) — confirmed via tree-sitter-queries.ts:255
  • Worker path uses + lineOffset for Vue SFC but the fallback path has no Vue preprocessing — confirmed as pre-existing design
  • processNextjsFetchRoutes requires a matching Route node, so request.post('/api/client-only') won't produce a spurious FETCHES edge even though it is extracted — confirmed via call-processor.ts:3460-3461
  • The CI test runPipelineFromRepo exercises the real pipeline (not a mock) — confirmed via express-routes.test.ts:14-16

Open questions

  1. PR feat(group): add rule-based HTTP mappings #1742 coordination: If both PRs land, there will be two independent request-like classifiers at different pipeline stages (ingestion fetch extraction vs. group HTTP mapping). Is the intended long-term design to consolidate into a shared helper, or are they intentionally separate because they serve different graph edges?
  2. umi-request test coverage: No integration test exercises an actual umi-request import (only local ./request). Should there be one, or is the regex path considered sufficiently unit-tested?

Final verdict

not production-ready

The core fix is correct: request.get('/api/items') from an imported ./request helper correctly produces a FETCHES edge in both the worker and sequential fallback paths. CI is green and the primary use case is exercised end-to-end. However, three DoD violations prevent production-ready classification:

  1. DoD §2.6 (call-processor.ts:3542, parse-worker.ts:1099): collectRequestLikeImportBindings runs for every parseable file in the repo regardless of language, adding unnecessary regex scanning on Java, Python, Go, Rust, etc. files. This is avoidable with a simple language guard.
  2. DoD §2.7: REQUEST_LIKE_MEMBER_METHODS contains 9 methods but only get is asserted as producing a FETCHES consumer edge. post, put, patch, delete have no FETCHES creation assertion. The PR description says the fix addresses request.get/post/put/delete/patch(...) — but the tests don't prove it for anything except get.
  3. DoD §2.3: Three items of dead code in request-like-clients.ts (backtick path in nodeTextStartsWithPath, express_route fallbacks in getRequestLikeMemberCallUrl, HTTP_CLIENT_ONLY Set in call-processor loop) signal that the implementation wasn't fully validated against actual tree-sitter capture types and calling contexts.

All three are fixable with targeted, small changes and do not require architectural rework.


@magyargergo
Copy link
Copy Markdown
Collaborator

@JaysonAlbert Could you please look into claude's findings ? 🙏

@JaysonAlbert
Copy link
Copy Markdown
Contributor Author

Thanks for the review. I pushed a follow-up commit (009f5aa9) addressing Claude’s findings:

  • gated request-like import scanning to JS/TS/Vue only, so non-JS languages no longer pay the unnecessary regex scan cost
  • moved the sequential-path HTTP_CLIENT_ONLY set out of the match loop
  • removed the dead backtick URL check and unused express_route fallbacks
  • added coverage for the language gate and for request-like post, put, patch, and delete producing FETCHES edges

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants