Make dbt PR reviewer demo-parity reliable by anandgupta42 · Pull Request #919 · AltimateAI/altimate-code

anandgupta42 · 2026-06-08T19:43:34Z

Summary

- Document the dbt PR reviewer self-improvement loop and acceptance matrix.
- Thread dialect into core equivalence and fix --no-ai handling for deterministic demo runs.
- Reduce safe-refactor noise and PII false positives while keeping real PII exposure blocking.
- Add an opt-in DuckDB warehouse data-diff e2e and fix DuckDB connector open callback behavior under Bun.

Core dependency

Depends on deterministic core improvements in AltimateAI/altimate-core-internal#140.
Local validation used a locally built/linked @altimateai/altimate-core from that branch.

Acceptance evidence

Public demo matrix reached expected verdicts for all 6 branches:
- demo/safe-refactor: APPROVE
- demo/join-key-breakage: REQUEST_CHANGES
- demo/test-removal: COMMENT
- demo/new-pii-exposure: REQUEST_CHANGES
- demo/mart-select-star: COMMENT
- demo/incremental-without-guard: COMMENT
Real-world corpus floor preserved: 15/15 caught bad cases and 0/5 false positives.
DuckDB dbt e2e proof: demo/join-key-breakage had dbt build pass, while reviewer blocked pre-merge using deterministic altimate_core.structural_diff evidence.
Warehouse data-diff e2e: local DuckDB detected one head-only row and one updated value.

Focused validation

bun test --timeout 30000 test/altimate/review-runner.test.ts
bun test --timeout 30000 test/altimate/review.test.ts
bun run typecheck
bun run --conditions=browser script/review-realworld-eval.ts
ALTIMATE_RUN_WAREHOUSE_E2E=1 bun test --timeout 30000 test/altimate/data-diff-duckdb-e2e.test.ts
bun test --timeout 30000 test/altimate/data-diff-duckdb-e2e.test.ts (skips without ALTIMATE_RUN_WAREHOUSE_E2E)
pre-push hook: bun turbo typecheck passed

Notes

The DuckDB data-diff test is skipped by default to keep the iteration loop fast; set ALTIMATE_RUN_WAREHOUSE_E2E=1 to run it.
Results and loop policy are tracked in docs/internal/2026-06-08-dbt-pr-review-self-improvement-loop.md.

Summary by cubic

Make the dbt PR reviewer deterministic and demo‑parity reliable with fixed PII recall, precise join‑risk gating, dialect‑aware equivalence, and sturdier DuckDB handling. All six public demo branches reach the expected verdicts, and the benchmark floor is preserved.

New Features
- Thread adapter dialect into altimate_core.equivalence; runner forwards manifest.metadata.adapter_type (dialect is now optional in the API).
- Map core join_key_regression (SC010) to join_risk; escalate to critical only when core reports severity: error (other structural rules remain advisory).
- Diff‑scoped PII for newly introduced output columns with higher thresholds for low‑risk classes, exposure‑aware escalation, and concurrent execution; add opt‑in DuckDB data‑diff e2e (ALTIMATE_RUN_WAREHOUSE_E2E=1) using @altimateai/drivers.
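
The join-risk escalation rule described above can be sketched as a small pure function. This is an illustration only, not the code in `orchestrate.ts`: the type names, `mapStructuralSeverity`, and the `"structural"` default category are hypothetical, and the later clamping step is omitted. Only the `join_key_regression` to `join_risk` mapping and the "escalate `error` to critical for `join_risk` only" rule come from the summary.

```typescript
// Hypothetical shapes for illustration; the real engine and finding types may differ.
type EngineSeverity = "error" | "warning" | "info"
type FindingSeverity = "critical" | "warning"

interface EngineFinding {
  rule: string
  severity: EngineSeverity
}

// Only this mapping is taken from the PR summary.
const CATEGORY_BY_RULE: Record<string, string> = {
  join_key_regression: "join_risk",
}

function mapStructuralSeverity(finding: EngineFinding): { category: string; severity: FindingSeverity } {
  // "structural" is an assumed default category for unmapped rules.
  const category = CATEGORY_BY_RULE[finding.rule] ?? "structural"
  // Escalate to critical only when a join_risk finding is marked `error` by core;
  // every other structural finding stays advisory.
  const severity: FindingSeverity =
    category === "join_risk" && finding.severity === "error" ? "critical" : "warning"
  return { category, severity }
}
```

Under these assumptions, a non-join rule that core marks `error` still comes back as `warning`, so it cannot block the review on its own.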
Bug Fixes
- Restore PII recall: dedupe the broad schema.detect_pii fallback only against columns core actually classified; make regex twin suppression column‑aware (pii_into_mart via its matched column; select-pii-columns only when a precise detector emitted a PII finding); keep deterministic fallback when lineage/classifier fails or misses.
- Suppress regex/portability twins only when core emitted the same rule for the file, reducing safe‑refactor noise.
- DuckDB connector: arm the 2s fallback before replaying a synchronous open callback, close half‑open handles on error, retry READ_ONLY on lock, and propagate non‑lock errors.
- Correctly disable the advisory lane via --no-ai/--ai=false.
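
As a rough sketch of the DuckDB retry policy in the list above (retry READ_ONLY on lock, propagate everything else): the version below is synchronous for brevity, whereas the real connector is callback-based and asynchronous. `Opener`, `isLockError`, and `connectWithLockRetry` are hypothetical names; the `DUCKDB_LOCKED` prefix is the sentinel mentioned in this PR's walkthrough and tests.

```typescript
type AccessMode = "READ_ONLY" | undefined

// Hypothetical opener: returns a handle on success and throws on failure.
type Opener<T> = (accessMode: AccessMode) => T

function isLockError(err: unknown): boolean {
  return err instanceof Error && err.message.startsWith("DUCKDB_LOCKED")
}

function connectWithLockRetry<T>(open: Opener<T>): T {
  try {
    // First attempt uses the default access mode.
    return open(undefined)
  } catch (err) {
    // Non-lock errors are propagated unchanged.
    if (!isLockError(err)) throw err
    // A lock error triggers exactly one retry in READ_ONLY mode.
    return open("READ_ONLY")
  }
}
```

Passing an opener that records each `accessMode` makes it easy to assert that the second attempt really is `"READ_ONLY"`, which is the same check the reviewers ask for in the driver tests.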

Written for commit 85f103f. Summary will update on new commits.

Summary by CodeRabbit

Bug Fixes
- Improved database connection stability with safer open/retry and lock handling.
- Refined PII detection: diff-scoped classification, tighter confidence thresholds, and suppression/dedupe tweaks.
- Fixed CLI flag handling for AI/no-AI options.
New Features
- Dialect-aware SQL equivalence checks for more precise analysis.
- Structural review updates to better surface join-risk regressions.
Tests
- Expanded tests for data-diff, review runner, and DuckDB open/lock scenarios.

coderabbitai · 2026-06-08T19:43:48Z

Walkthrough

This PR threads SQL dialect into equivalence checks, tightens PII detection via diff-scoped classification and confidence gating, remaps a structural rule to the join_risk category with adjusted severity handling, hardens DuckDB connection open/retry callback handling, adds DuckDB data-diff e2e tests, and extends unit tests and a CLI flag mapping.

Changes

dbt PR Review Improvements

Layer / File(s)	Summary
Equivalence Dialect Threading `packages/opencode/src/altimate/native/types.ts`, `packages/opencode/src/altimate/native/altimate-core.ts`, `packages/opencode/src/altimate/review/runner.ts`, `packages/opencode/test/altimate/review-runner.test.ts`	Adds optional `dialect` parameter to equivalence dispatch contract, native handler, and ReviewRunner method. Native handler now passes dialect to `core.checkEquivalence`. Test verifies dialect threading from adapter type through dispatcher.
Structural Join-Risk Categorization & PII Diff-Scoped Classification `packages/opencode/src/altimate/review/orchestrate.ts`, `packages/opencode/test/altimate/review.test.ts`	Maps engine rule `join_key_regression` to `join_risk`. `structuralChangeLane` only escalates engine `error` to `critical` for `join_risk`; others become `warning` then clamped. `piiClassifyLane` computes introduced columns by lineage, classifies only introduced columns with stricter confidence floors for low-risk classes, returns structured results, and `runReview` runs the precise classifier per file and emits a fallback lane excluding already-classified columns. Regex twin suppression is made diff-scoped. Many gate-mode tests added/extended.
DuckDB Connection Retry Safety & Tests `packages/drivers/src/duckdb.ts`, `packages/drivers/test/driver-security.test.ts`	`tryConnect` refactored to enforce single-resolution, clear timeouts, close late instances, handle early queued callbacks, and map lock errors to `DUCKDB_LOCKED` sentinel. Native options now include `{ access_mode }`. Test helper `openCallback` normalizes constructor callback invocation. New tests added for synchronous open, synchronous lock retry, and synchronous non-lock errors.
DuckDB Data-Diff E2E Tests `packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts`	Conditional Bun e2e suite (gated by `ALTIMATE_RUN_WAREHOUSE_E2E`) validates data-diff with joindiff algorithm, tests delta detection, identical-table matching, and auto-discovery of comparable columns.
CLI Flag Handling `packages/opencode/src/cli/cmd/review.ts`	`noAi` derivation now considers both `args.noAi === true` and `args.ai === false`.
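
The dialect threading summarized in the first table row can be pictured roughly as follows. This is a sketch under assumptions: `Manifest`, `EquivalenceParams`, the `sql1`/`sql2` field names, and `buildEquivalenceParams` are hypothetical. Only `manifest.metadata.adapter_type` and the fact that `dialect` is optional come from the PR.

```typescript
// Hypothetical shapes; the real dispatch contract may use different field names.
interface Manifest {
  metadata?: { adapter_type?: string }
}

interface EquivalenceParams {
  sql1: string
  sql2: string
  dialect?: string
}

function buildEquivalenceParams(manifest: Manifest, oldSql: string, newSql: string): EquivalenceParams {
  const dialect = manifest.metadata?.adapter_type
  // Leave the key out when the adapter type is unknown, so the callee can
  // apply its own default instead of receiving an explicit undefined.
  return dialect ? { sql1: oldSql, sql2: newSql, dialect } : { sql1: oldSql, sql2: newSql }
}
```

Keeping `dialect` optional means callers without a manifest can still request an equivalence check.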

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

AltimateAI/altimate-code#856: Related changes in the dbt PR review pipeline touching equivalence and orchestrate PII/severity behavior.
AltimateAI/altimate-code#918: Also interacts with altimate_core.equivalence usage and verification flows.

Suggested labels

needs-review:blocked

Suggested reviewers

sahrizvi

Poem

🐰 In code I threaded dialect’s song,

I chased weak PII doubts along,
Callbacks tamed in DuckDB's lair,
Tests now guard each fragile care.
Hop, review—this patch is strong!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description provides a comprehensive summary of changes, core dependencies, acceptance evidence, and validation approach, though it is missing the required 'PINEAPPLE' keyword for AI-generated content.	Add 'PINEAPPLE' at the very top of the PR description as required by the template for AI-generated contributions, then re-submit.
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main objective: making the dbt PR reviewer reliable for demo-parity testing.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/drivers/src/duckdb.ts`:
- Around line 59-81: The onOpen callback can run synchronously during new
duckdb.Database(...) so ensure the returned database object is assigned to
instance immediately so onOpen never sees instance as undefined: construct the
Database and assign its return value to instance before relying on onOpen, then
keep the existing onOpen logic (checking instance.close and resolve(instance));
update the instantiation site where instance, opts, onOpen, duckdb.Database and
dbPath are used so the assignment happens synchronously and onOpen always
resolves/rejects with the real database object.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 56052e0b-8e0c-4260-b629-2920d8df241d

📥 Commits

Reviewing files that changed from the base of the PR and between c2019ba and 4a28ba0.

📒 Files selected for processing (11)

docs/docs/usage/dbt-pr-review.md
docs/internal/2026-06-08-dbt-pr-review-self-improvement-loop.md
packages/drivers/src/duckdb.ts
packages/opencode/src/altimate/native/altimate-core.ts
packages/opencode/src/altimate/native/types.ts
packages/opencode/src/altimate/review/orchestrate.ts
packages/opencode/src/altimate/review/runner.ts
packages/opencode/src/cli/cmd/review.ts
packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts
packages/opencode/test/altimate/review-runner.test.ts
packages/opencode/test/altimate/review.test.ts

cubic-dev-ai

4 issues found across 11 files


dev-punia-altimate

Multi-Persona Review — Verdict: block

The PR introduces a critical compliance violation by instantiating Settings at module level, directly violating CLAUDE.md guidelines for distributed environments. This risk is confirmed by both code reviewer and tech lead, and is compounded by a high-confidence logging bug that could break observability. While the feature is well-designed and tested, these foundational issues must be fixed before shipping.

15/15 agents completed · 220s · 5 findings (1 critical, 1 high, 1 medium, 2 low)

Critical

[code-reviewer, tech-lead] Module-level instantiation of Settings via get_settings() violates CLAUDE.md policy to avoid import-time side effects in distributed environments. → app/utils/mcp_modules/semantic_metrics.py:617
- 💡 Move settings = get_settings() inside each function that uses it, or defer access until runtime (e.g., inside list_semantic_metrics).

High

[code-reviewer] logger.info in list_all_semantic_metrics uses positional args instead of structured keyword arguments, breaking consistent log parsing across the codebase. → app/utils/mcp_modules/semantic_metrics.py:620
- 💡 Change to: logger.info("list_all_semantic_metrics", tenant=tenant, filter_provided=model_filter is not None)

Medium

[tech-lead] Configuration settings are imported and instantiated at module level in a utility module, which may cause issues in test isolation and dependency injection. → app/utils/mcp_modules/semantic_metrics.py:36
- 💡 Move settings instantiation to function scope or inject via dependency injection pattern, especially since this module is used in MCP tooling where test contexts may need mock settings.

Low

[tech-lead] Function name list_all_semantic_metrics is misleading — it does not 'list all' by default; it lists only those passing the filter. → app/utils/mcp_modules/semantic_metrics.py:620
- 💡 Rename to filter_semantic_metrics or list_semantic_metrics_filtered to better reflect its behavior when a filter is provided, and update docstring to clarify default behavior.
[tech-lead] Tenant-to-platform routing is tightly coupled to a hardcoded setting (DATABRICKS_TENANTS) without abstraction or extensibility. → app/utils/mcp_modules/semantic_metrics.py:620
- 💡 Consider extracting tenant-to-platform routing into a dedicated service (e.g., PlatformRouter) to improve testability and future extensibility.

Multi-Persona Review · vllm:qwen3-next-80b (waves) + vllm-fallback (synth)

anandgupta42 · 2026-06-10T00:52:53Z

Addressed current review comments in 49fc784.

Summary:

Fixed DuckDB open callback race by deferring synchronous native callbacks until the Database instance is assigned; added driver regression coverage.
Restored broad PII fallback for existing output columns while keeping new-column PII diff-scoped to core classification; added regression coverage.
Made loop docs portable by removing local absolute paths.
Scoped dbt review docs so artifact-related undecidability applies to schema-dependent engine lanes, not every lane.
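
For readers unfamiliar with the DuckDB race, the deferral described in the first item above can be sketched as below. This is not the code in `packages/drivers/src/duckdb.ts`: `DatabaseLike`, `DatabaseCtor`, and `openDatabase` are hypothetical names, the 2s fallback timer is left out, and a plain completion callback stands in for the promise the real connector presumably returns.

```typescript
type OpenCallback = (err: Error | null) => void

interface DatabaseLike {
  close(): void
}

// Hypothetical constructor shape: the native binding may invoke `cb`
// synchronously, before `new` has returned the instance.
type DatabaseCtor = new (path: string, cb: OpenCallback) => DatabaseLike

function openDatabase(
  Database: DatabaseCtor,
  path: string,
  done: (err: Error | null, db?: DatabaseLike) => void,
): void {
  let instance: DatabaseLike | undefined
  const early: Array<Error | null> = [] // results delivered before `instance` exists
  let settled = false

  const settle = (err: Error | null, db: DatabaseLike): void => {
    if (settled) return // enforce single resolution
    settled = true
    if (err) {
      db.close() // release the half-open handle
      done(err)
    } else {
      done(null, db)
    }
  }

  const onOpen: OpenCallback = (err) => {
    if (instance === undefined) {
      early.push(err) // constructor has not returned yet; replay after assignment
      return
    }
    settle(err, instance)
  }

  instance = new Database(path, onOpen)
  // Replay a callback that fired synchronously inside the constructor.
  if (early.length > 0) settle(early[0], instance)
}
```

The point of the queue is that `settle` always receives the real database object, whether the native callback ran synchronously or on a later tick.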

Validation:

bun test --timeout 30000 test/altimate/review.test.ts
bun test --timeout 30000 test/altimate/review-runner.test.ts
bun test --timeout 30000 test/altimate/review.test.ts -t PII
bun test --timeout 30000 test/driver-security.test.ts
bun test --timeout 30000 test/altimate/data-diff-duckdb-e2e.test.ts (skips without e2e env)
bun run typecheck
pre-push bun turbo typecheck

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

packages/opencode/src/altimate/review/orchestrate.ts (1)
965-972: 💤 Low value

Consider memoizing column lineage to avoid duplicate calls.

introducedOutputColumns computes head/base column sets from runner.columnLineage, but piiClassifyLane (lines 927-932) performs nearly identical computations. Since runReview calls both for each model context (lines 1091 and 1102), this results in redundant columnLineage calls—up to 4 per file when 2 would suffice.

Consider extracting the lineage computation into the ModelContext initialization (lines 1026-1046) so both consumers share the result.
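
One possible shape for the memoization suggested here, offered as a sketch rather than a proposed patch: `LineageEdge`, `LineageFn`, and `memoizeLineage` are hypothetical, and the real `runner.columnLineage` signature may differ.

```typescript
// Hypothetical edge and function shapes for illustration.
interface LineageEdge {
  source: string
  target: string
}

type LineageFn = (sql: string, dialect?: string) => Promise<LineageEdge[]>

function memoizeLineage(columnLineage: LineageFn): LineageFn {
  const cache = new Map<string, Promise<LineageEdge[]>>()
  return (sql, dialect) => {
    // JSON.stringify gives an unambiguous composite key for (dialect, sql).
    const key = JSON.stringify([dialect ?? null, sql])
    let pending = cache.get(key)
    if (pending === undefined) {
      // Cache the promise itself so concurrent callers share one native call.
      pending = columnLineage(sql, dialect)
      cache.set(key, pending)
    }
    return pending
  }
}
```

Note that a rejected promise stays cached in this sketch; a real implementation might evict on failure so that a later call can retry.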
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/opencode/src/altimate/review/orchestrate.ts` around lines 965 - 972,
introducedOutputColumns repeats runner.columnLineage work that piiClassifyLane
also does, causing duplicate lineage calls; move or memoize the lineage
computation into the ModelContext so both functions reuse it: add a cached
property (e.g., ctx.columnLineageCache or ctx.getColumnLineage) populated during
ModelContext initialization (where ModelContext is created) by invoking
runner.columnLineage once per SQL/dialect and storing head/base sets, then
update introducedOutputColumns and piiClassifyLane to read from that cached
result instead of calling runner.columnLineage directly; ensure the cache keys
include dialect and engineNewSql/engineOldSql presence so deleted files or
missing SQL still return undefined/empty sets consistently.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/internal/2026-06-08-dbt-pr-review-self-improvement-loop.md`:
- Line 196: Replace the hard-coded absolute path string
`/Users/anandgupta/codebase/altimate-core-internal` in the line containing
"latest core main was" with a portable placeholder or relative reference (for
example `<altimate-core-internal-checkout>` or `../altimate-core-internal`) so
the docs no longer contain a local filesystem path; update only the path text
and keep the surrounding sentence intact.

In `@packages/drivers/test/driver-security.test.ts`:
- Around line 146-152: The test currently only checks that two connection
attempts occurred but doesn't assert the access mode on retry; modify the test
around connectAttempts and opts?.access_mode to record each attempt's
access_mode (e.g., push opts?.access_mode into an array) when the mock callback
runs, and after the second attempt assert that the recorded second value equals
"READ_ONLY"; ensure you still simulate the first attempt failing with the lock
error (done(new Error("DUCKDB_LOCKED: file is locked"))) and the second calling
done(null) so the assertion verifies opts?.access_mode for the retry.

---

Nitpick comments:
In `@packages/opencode/src/altimate/review/orchestrate.ts`:
- Around line 965-972: introducedOutputColumns repeats runner.columnLineage work
that piiClassifyLane also does, causing duplicate lineage calls; move or memoize
the lineage computation into the ModelContext so both functions reuse it: add a
cached property (e.g., ctx.columnLineageCache or ctx.getColumnLineage) populated
during ModelContext initialization (where ModelContext is created) by invoking
runner.columnLineage once per SQL/dialect and storing head/base sets, then
update introducedOutputColumns and piiClassifyLane to read from that cached
result instead of calling runner.columnLineage directly; ensure the cache keys
include dialect and engineNewSql/engineOldSql presence so deleted files or
missing SQL still return undefined/empty sets consistently.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ff50f9aa-0721-4127-9b0b-5dc1be6b2e7d

📥 Commits

Reviewing files that changed from the base of the PR and between 4a28ba0 and 49fc784.

📒 Files selected for processing (12)

docs/docs/usage/dbt-pr-review.md
docs/internal/2026-06-08-dbt-pr-review-self-improvement-loop.md
packages/drivers/src/duckdb.ts
packages/drivers/test/driver-security.test.ts
packages/opencode/src/altimate/native/altimate-core.ts
packages/opencode/src/altimate/native/types.ts
packages/opencode/src/altimate/review/orchestrate.ts
packages/opencode/src/altimate/review/runner.ts
packages/opencode/src/cli/cmd/review.ts
packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts
packages/opencode/test/altimate/review-runner.test.ts
packages/opencode/test/altimate/review.test.ts

✅ Files skipped from review due to trivial changes (1)

docs/docs/usage/dbt-pr-review.md

🚧 Files skipped from review as they are similar to previous changes (7)

packages/opencode/src/altimate/native/altimate-core.ts
packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts
packages/drivers/src/duckdb.ts
packages/opencode/src/cli/cmd/review.ts
packages/opencode/src/altimate/native/types.ts
packages/opencode/test/altimate/review.test.ts
packages/opencode/src/altimate/review/runner.ts

coderabbitai · 2026-06-10T01:00:35Z

              if (connectAttempts === 1 && !opts?.access_mode) {
                // First attempt fails with lock error
-                setTimeout(() => cb(new Error("DUCKDB_LOCKED: file is locked")), 0)
+                setTimeout(() => done(new Error("DUCKDB_LOCKED: file is locked")), 0)
              } else {
                // READ_ONLY retry succeeds
-                setTimeout(() => cb(null), 0)
+                setTimeout(() => done(null), 0)
              }


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert READ_ONLY mode explicitly on retry.

This test only proves there were two attempts; it can still pass if the second attempt is not READ_ONLY. On Line 146, record opts?.access_mode per attempt and assert the second value is "READ_ONLY".

Suggested test hardening

     test("retries with READ_ONLY when file DB is locked on initial connect", async () => {
       let connectAttempts = 0
+      const accessModes: Array<string | undefined> = []
       mock.module("duckdb", () => ({
         default: {
           Database: class {
             constructor(_path: string, optsOrCb: any, cb?: (err: Error | null) => void) {
               const opts = typeof optsOrCb === "function" ? undefined : optsOrCb
               const done = openCallback(optsOrCb, cb)
+              accessModes.push(opts?.access_mode)
               connectAttempts++
               if (connectAttempts === 1 && !opts?.access_mode) {
                 // First attempt fails with lock error
                 setTimeout(() => done(new Error("DUCKDB_LOCKED: file is locked")), 0)
               } else {
                 // READ_ONLY retry succeeds
                 setTimeout(() => done(null), 0)
               }
             }
@@
       await connector.connect()
       expect(connectAttempts).toBe(2) // First failed, second succeeded in READ_ONLY
+      expect(accessModes[0]).toBeUndefined()
+      expect(accessModes[1]).toBe("READ_ONLY")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/drivers/test/driver-security.test.ts` around lines 146 - 152, The
test currently only checks that two connection attempts occurred but doesn't
assert the access mode on retry; modify the test around connectAttempts and
opts?.access_mode to record each attempt's access_mode (e.g., push
opts?.access_mode into an array) when the mock callback runs, and after the
second attempt assert that the recorded second value equals "READ_ONLY"; ensure
you still simulate the first attempt failing with the lock error (done(new
Error("DUCKDB_LOCKED: file is locked"))) and the second calling done(null) so
the assertion verifies opts?.access_mode for the retry.

cubic-dev-ai

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/docs/usage/dbt-pr-review.md">

<violation number="1">
P2: Docs regression: proof-grade requirement for `target/catalog.json` was dropped. This can mislead CI setup and reduce equivalence/PII determinism versus the stated rollout policy.</violation>
</file>


coderabbitai

♻️ Duplicate comments (1)

packages/drivers/test/driver-security.test.ts (1)

233-264: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert READ_ONLY mode explicitly on retry (same pattern as async test).

This test mirrors the async test at lines 137-174, which has a pending suggestion to assert access_mode. Currently the test only proves there were two attempts; it can pass even if the second attempt doesn't actually use READ_ONLY. Track opts?.access_mode per attempt and assert the second value is "READ_ONLY".

Consider addressing both tests together for consistency.

Suggested test hardening

     test("retries read-only when native open synchronously reports a file lock", async () => {
       let connectAttempts = 0
+      const accessModes: Array<string | undefined> = []
       mock.module("duckdb", () => ({
         default: {
           Database: class {
             constructor(_path: string, optsOrCb: any, cb?: (err: Error | null) => void) {
               const opts = typeof optsOrCb === "function" ? undefined : optsOrCb
               const done = openCallback(optsOrCb, cb)
               connectAttempts++
+              accessModes.push(opts?.access_mode)
               done(connectAttempts === 1 && !opts?.access_mode ? new Error("DUCKDB_LOCKED: file is locked") : null)
             }
@@
       await connector.connect()
       expect(connectAttempts).toBe(2)
+      expect(accessModes[0]).toBeUndefined()
+      expect(accessModes[1]).toBe("READ_ONLY")
       expect(await connector.execute("SELECT 1")).toMatchObject({ columns: ["ok"], rows: [[1]], row_count: 1 })

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/drivers/test/driver-security.test.ts` around lines 233 - 264, The
test currently only asserts connectAttempts === 2 but does not verify the
access_mode used on the retry; modify the test that mocks duckdb's Database
constructor (the class in mock.module("duckdb")) to record the opts?.access_mode
for each constructor invocation (e.g., push to an array per attempt) and then
assert that the second recorded access_mode is "READ_ONLY"; update the same
pattern in the async counterpart test as well so both tests explicitly verify
that the retry uses READ_ONLY when the first synchronous open reports a file
lock, referring to the mocked Database constructor and the connect/imported
connect function from ../src/duckdb.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@packages/drivers/test/driver-security.test.ts`:
- Around line 233-264: The test currently only asserts connectAttempts === 2 but
does not verify the access_mode used on the retry; modify the test that mocks
duckdb's Database constructor (the class in mock.module("duckdb")) to record the
opts?.access_mode for each constructor invocation (e.g., push to an array per
attempt) and then assert that the second recorded access_mode is "READ_ONLY";
update the same pattern in the async counterpart test as well so both tests
explicitly verify that the retry uses READ_ONLY when the first synchronous open
reports a file lock, referring to the mocked Database constructor and the
connect/imported connect function from ../src/duckdb.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b197646e-f13b-43f6-9850-ba7a1a81a465

📥 Commits

Reviewing files that changed from the base of the PR and between 49fc784 and 7c7b6db.

📒 Files selected for processing (4)

packages/drivers/test/driver-security.test.ts
packages/opencode/src/altimate/review/orchestrate.ts
packages/opencode/test/altimate/data-diff-duckdb-e2e.test.ts
packages/opencode/test/altimate/review.test.ts

cubic-dev-ai

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/opencode/src/altimate/review/orchestrate.ts">

<violation number="1" location="packages/opencode/src/altimate/review/orchestrate.ts:935">
P2: No-target lineage is treated as completed, which can suppress fallback PII findings. This can hide PII exposure warnings when column lineage cannot derive targets.</violation>
</file>


cubic-dev-ai · 2026-06-10T01:30:18Z

+  let newCols: string[]
+  try {
+    const headCols = [...new Set((await runner.columnLineage(engineNewSql, dialect)).map((e) => e.target).filter(Boolean))]
+    if (!headCols.length) return { ...empty, completed: true }


P2: No-target lineage is treated as completed, which can suppress fallback PII findings. This can hide PII exposure warnings when column lineage cannot derive targets.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At packages/opencode/src/altimate/review/orchestrate.ts, line 935:

<comment>No-target lineage is treated as completed, which can suppress fallback PII findings. This can hide PII exposure warnings when column lineage cannot derive targets.</comment>

<file context>
@@ -916,22 +916,38 @@ async function siblingConsistencyLane(ctxs: ModelContext[], runner: ReviewRunner
+  let newCols: string[]
+  try {
+    const headCols = [...new Set((await runner.columnLineage(engineNewSql, dialect)).map((e) => e.target).filter(Boolean))]
+    if (!headCols.length) return { ...empty, completed: true }
+    const baseCols = ctx.engineOldSql
+      ? new Set((await runner.columnLineage(ctx.engineOldSql, dialect)).map((e) => e.target.toLowerCase()))
</file context>

Suggested change

-    if (!headCols.length) return { ...empty, completed: true }
+    if (!headCols.length) return empty
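
To make the concern concrete, here is a minimal sketch of computing introduced columns while reporting "not completed" when lineage yields no targets. The `Edge` and `IntroducedColumns` shapes and the function name are hypothetical, and representing the `empty` result as `{ columns: [], completed: false }` is an assumption about what the suggested change implies.

```typescript
// Hypothetical lineage edge: only the `target` column name matters here.
interface Edge {
  target?: string
}

interface IntroducedColumns {
  columns: string[]
  completed: boolean
}

function targets(edges: Edge[]): string[] {
  const names = edges.map((e) => e.target).filter((t): t is string => Boolean(t))
  return Array.from(new Set(names))
}

function introducedColumns(headEdges: Edge[], baseEdges: Edge[] | undefined): IntroducedColumns {
  const headCols = targets(headEdges)
  // No derivable targets: report "not completed" so a fallback detector still runs
  // (assumed meaning of returning `empty` without `completed: true`).
  if (headCols.length === 0) return { columns: [], completed: false }
  // Compare case-insensitively against the base model's output columns.
  const baseCols = new Set(targets(baseEdges ?? []).map((c) => c.toLowerCase()))
  return { columns: headCols.filter((c) => !baseCols.has(c.toLowerCase())), completed: true }
}
```

With this shape, a caller can distinguish "lineage ran and found nothing new" from "lineage could not say", and only suppress the fallback in the first case.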

Fixes raised by the multi-model review of #919, prioritizing the PII false-negative that 5 models flagged.

- PII recall (MAJOR): the diff-scoped PII dedup keyed off whether the classifier *ran* and off ALL introduced columns, so a high-risk column core's classifier MISSED (e.g. `ssn`) was silently dropped from the broad `schema.detect_pii` fallback AND had its regex twin suppressed. Now dedup against the columns core actually CLASSIFIED (`classifiedColumns`), and make the regex-twin suppression column-aware (`pii_into_mart` via its matched `result.line`; `select-pii-columns` only when a precise detector actually emitted a finding). A column core never classified now stays eligible for the broad detector and the regex net, including on trivial/lite tiers that don't run the `pii_exposure` lane. Low-confidence name/address suppression is preserved (core *considered* those columns).
- Structural escalation (MAJOR): only `join_risk` (SC010 join-key regression) escalates a core `error` to critical/blocking. Other structural rules core marks `error` stay advisory (warning), so the lane never over-blocks on a rule it wasn't designed to gate.
- DuckDB (MINOR): arm the 2s fallback timer BEFORE replaying a synchronous open callback so a sync resolve/reject can clear it (no dangling timer delaying process exit); release the half-open handle on open error.
- Concurrency (MINOR): run the diff-scoped PII lane as a concurrent task instead of an inline `await`, so its native lineage/classify calls overlap the other per-file engine lanes.
- Polish: align `ReviewRunner.equivalence` dialect param to optional (matches the impl); named constants for the PII confidence thresholds; document that PII exposure severity intentionally escalates only for `marts/reporting`; comment the `--no-ai`/`--ai=false` dual check.

Adds regression tests: core-missed column surfaces via broad fallback; core-missed high-risk PII still caught by the regex twin when the broad PII lane is off; non-join `error` structural rule stays advisory; sub-error join-key regression stays a warning. Existing 76 review + 26 driver tests still pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
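
The PII recall fix can be illustrated with a small filter. This is a sketch under assumptions: `PiiFinding` and `fallbackFindings` are hypothetical, and the case-insensitive comparison is a guess. Only the idea of deduplicating the broad fallback against `classifiedColumns`, rather than against every introduced column, comes from the commit message.

```typescript
// Hypothetical finding shape for the broad schema.detect_pii fallback.
interface PiiFinding {
  column: string
  rule: string
}

function fallbackFindings(broad: PiiFinding[], classifiedColumns: Iterable<string>): PiiFinding[] {
  // Only columns that core actually classified are removed from the fallback.
  const classified = new Set(Array.from(classifiedColumns, (c) => c.toLowerCase()))
  // A column core missed (for example `ssn`) stays eligible for the broad detector.
  return broad.filter((f) => !classified.has(f.column.toLowerCase()))
}
```

For example, if core classified `email` but missed `ssn`, calling `fallbackFindings` with both broad findings keeps the `ssn` finding and drops the `email` duplicate.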

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/opencode/src/cli/cmd/review.ts`:
- Around line 61-64: The comment explaining how noAi is derived is misleading
because only the "no-ai" option is registered; yargs won't auto-create an "--ai"
flag, so remove the phrase about boolean auto-negation and clarify that args.ai
=== false is only a programmatic/explicit fallback. Update the comment above the
noAi assignment in review.ts (the line computing noAi from args.noAi and
args.ai) to state that args.noAi comes from the registered --no-ai flag and that
args.ai === false is handled only for callers that explicitly pass ai: false, or
alternatively register an "ai" boolean option if you want to support
--ai/--ai=false from the CLI.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 435a8bfa-ab5d-4e34-a9c1-0a15fa8775fc

📥 Commits

Reviewing files that changed from the base of the PR and between 7c7b6db and 85f103f.

📒 Files selected for processing (4)

packages/drivers/src/duckdb.ts
packages/opencode/src/altimate/review/orchestrate.ts
packages/opencode/src/cli/cmd/review.ts
packages/opencode/test/altimate/review.test.ts

coderabbitai · 2026-06-10T03:47:14Z

+        // The flag is registered as `--no-ai`, so yargs sets `args.noAi`. We also
+        // accept `args.ai === false` to cover yargs' boolean auto-negation
+        // (`--ai=false`) and programmatic callers that pass `ai: false`.
+        noAi: args.noAi === true || args.ai === false,


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify that yargs 18.x creates negated boolean options automatically
# Check yargs documentation for negatable boolean options
rg -n -C 3 'negat|no-|negate' --type=md --iglob '*yargs*'
# Check if there are examples in the codebase of similar no-* patterns
rg -n -A 2 -B 2 'option\("no-' --type=ts

Repository: AltimateAI/altimate-code

Length of output: 627

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Inspect the yargs builder area in review.ts
sed -n '1,120p' packages/opencode/src/cli/cmd/review.ts | cat -n
# Find all places where "no-ai" and "ai" options are declared
rg -n --hidden --no-ignore-vcs '\.option\(\s*["'"'"']no-ai["'"'"']' packages/opencode/src/cli/cmd
rg -n --hidden --no-ignore-vcs '\.option\(\s*["'"'"']ai["'"'"']' packages/opencode/src/cli/cmd
rg -n --hidden --no-ignore-vcs 'args\.ai\b|noAi\b|no-ai' packages/opencode/src/cli/cmd/review.ts
# Find any other yargs option declarations elsewhere in case "ai" is defined globally
rg -n --hidden --no-ignore-vcs '\.option\(\s*["'"'"']ai["'"'"']' packages/opencode/src/cli
rg -n --hidden --no-ignore-vcs '\.option\(\s*["'"'"']no-ai["'"'"']' packages/opencode/src/cli

Repository: AltimateAI/altimate-code

Length of output: 5225

🌐 Web query:

yargs v18 boolean negation behavior "no-<option>" "boolean" auto creates "<option>" or allows "--ai=false" when only "--no-ai" is declared

💡 Result:

In yargs, boolean negation is a feature enabled by default that automatically interprets the --no- prefix as a directive to set the corresponding option to false [1][2]. Regarding your specific observations:

1. Auto-creation of options: When you declare a boolean option (e.g., ai), yargs automatically supports the --no-ai flag to negate it [2]. If you attempt to use --no-ai when only that flag is present or declared in a way that creates confusion, yargs may interpret the input as an alias or a separate key depending on your parser configuration [3]. The system essentially treats the --no- prefix as a semantic instruction to toggle the state of the base key, rather than treating no- as an entirely independent, literal argument name [3][4].

2. --ai=false vs. --no-ai: Both are supported mechanisms to set a boolean option to false [2]. --no-ai is the standard "negation" syntax provided by yargs' built-in parser configuration, while passing --ai=false explicitly assigns the boolean value false to the ai key [1][2].

If you find this behavior undesirable (for example, if you want to use a literal option name starting with "no-" or wish to prevent the auto-negation feature), you can disable it by modifying the parser configuration [3][5]: yargs.parserConfiguration({ "boolean-negation": false })

Disabling this setting allows you to treat arguments containing the "no-" prefix as literal keys rather than as negation operators [3][5].

Citations:

1: https://github.com/yargs/yargs/blob/HEAD/docs/api.md

2: https://github.com/yargs/yargs/blob/HEAD/docs/tricks.md

3: yargs processes --no-XXX boolean options differently yargs/yargs#1011

4: https://github.com/yargs/yargs-parser/blob/master/lib/yargs-parser.ts

5: Defining an option key with "no-" in front of it yargs/yargs#322

Fix the yargs flag explanation for noAi derivation

packages/opencode/src/cli/cmd/review.ts registers only .option("no-ai", ...). yargs' documented boolean-negation behavior lets a defined <option> (e.g., ai) gain --no-ai; it does not imply that --ai/--ai=false is auto-created when only --no-ai is registered. As a result, the "boolean auto-negation (--ai=false)" wording in lines 61-64 is misleading. Either explicitly add an "ai" boolean option (so --ai=false is a supported CLI flag) or update the comment to state that args.ai === false is handled only as an explicit/programmatic fallback.
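
As a hedged illustration of the second option (keeping the dual check but correcting the comment), the derivation could be isolated in a small helper. The expression itself is the one quoted from the diff; the `ReviewArgs` interface and the `deriveNoAi` name are hypothetical and exist only for this sketch, which deliberately makes no claim about which key yargs populates for a given command line.

```typescript
// Hypothetical subset of the parsed arguments, for illustration only.
interface ReviewArgs {
  noAi?: boolean
  ai?: boolean
}

// `noAi` is set when the registered `--no-ai` flag is used. `ai === false`
// is honored only as a fallback for callers that pass `ai: false`
// explicitly; no `--ai` option is registered in this sketch.
function deriveNoAi(args: ReviewArgs): boolean {
  return args.noAi === true || args.ai === false
}

console.log(deriveNoAi({ noAi: true })) // AI disabled via the registered flag
console.log(deriveNoAi({ ai: false }))  // AI disabled by a programmatic caller
console.log(deriveNoAi({}))             // AI left enabled by default
```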


dev-punia-altimate · 2026-06-10T05:58:46Z

❌ Tests — Failures Detected

TypeScript — 15 failure(s)

connection_refused [1.00ms]
timeout
permission_denied
parse_error
oom [1.00ms]
network_error
auth_failure
rate_limit
internal_error
empty_error
connection_refused
timeout
permission_denied
parse_error [1.00ms]
network_error

Next Step

Please address the failing cases above and re-run verification.

cc @anandgupta42

github-actions Bot added the contributor label Jun 8, 2026

anandgupta42 mentioned this pull request Jun 8, 2026

Generate review proof artifacts in demo CI AltimateAI/dbt-pr-review-demo#7

Merged

coderabbitai Bot reviewed Jun 8, 2026

Comment thread packages/drivers/src/duckdb.ts

cubic-dev-ai Bot reviewed Jun 8, 2026

Comment thread packages/opencode/src/altimate/review/orchestrate.ts Outdated

Comment thread packages/drivers/src/duckdb.ts

Comment thread docs/internal/2026-06-08-dbt-pr-review-self-improvement-loop.md Outdated

Comment thread docs/docs/usage/dbt-pr-review.md Outdated

dev-punia-altimate requested changes Jun 8, 2026

anandgupta42 added 5 commits June 9, 2026 17:48

Document dbt review improvement loop

80b8683

Reduce safe refactor review noise

b5060e0

Tighten PII review precision

94b4526

Add DuckDB data diff e2e coverage

1b05092

fix: address dbt review parity comments

49fc784

anandgupta42 force-pushed the codex/dbt-review-demo-parity branch from 4a28ba0 to 49fc784 Compare June 10, 2026 00:52

coderabbitai Bot reviewed Jun 10, 2026

anandgupta42 added 2 commits June 9, 2026 18:06

chore: remove non-functional planning docs from dbt review PR

21e5adf

chore: drop non-functional dbt review docs edit

173f9d9

cubic-dev-ai Bot reviewed Jun 10, 2026

test: harden dbt review parity edge cases

7c7b6db

coderabbitai Bot reviewed Jun 10, 2026

cubic-dev-ai Bot reviewed Jun 10, 2026

coderabbitai Bot reviewed Jun 10, 2026

	if (!headCols.length) return { ...empty, completed: true }
	if (!headCols.length) return empty

dev-punia-altimate left a comment

Multi-Persona Review — Verdict: block
