feat(query-enhancements): PPL lint backend — feature flag + explain/calcite proxy routes by Hanyu-W · Pull Request #12255 · opensearch-project/OpenSearch-Dashboards

Hanyu-W · 2026-06-18T23:10:19Z

Description

Backend plumbing for the PPL linter, adding the feature flag and two read-only proxy routes.

- Feature flag: `queryEnhancements.pplLint` capability, default `false`, resolved at runtime via a DynamicConfigService capability switcher (same pattern as explore).
- `POST /api/enhancements/ppl/explain`: proxies a PPL query to `/_plugins/_ppl/_explain` and returns the Calcite execution plan. Validates a non-empty query; supports optional `dataSourceId`.
- `GET /api/enhancements/ppl/calcite_settings`: reads `/_cluster/settings` (scoped via `filter_path`) and returns `{ calciteEnabled, allJoinTypesAllowed }`. Fails open on errors so a settings-read failure never blocks the editor; logs 401/403 at warn.

Issues Resolved

Backend plumbing for opensearch-project/sql#5405

Screenshot

N/A — no UI changes.

Testing the changes

Run unit tests for the new routes and capability switcher:

```
node scripts/jest.js \
  src/plugins/query_enhancements/server/plugin.test.ts \
  src/plugins/query_enhancements/server/routes/ppl_calcite_settings.test.ts \
  src/plugins/query_enhancements/server/routes/ppl_explain.test.ts
```

Run the full plugin server suite to confirm no regressions:
```
node scripts/jest.js src/plugins/query_enhancements/server
```
Point the dev server at a live opensearch-sql cluster and hit GET /api/enhancements/ppl/calcite_settings — verify the scoped filter_path response matches the full unfiltered /_cluster/settings output for calciteEnabled and allJoinTypesAllowed.

Check List

All tests pass
- yarn test:jest
- yarn test:jest_integration
New functionality includes testing
New functionality has been documented
Commits are signed per the DCO using --signoff

…xy routes

Add the disabled-by-default queryEnhancements.pplLint capability (DynamicConfigService switcher, mirrors agent_traces) and two read-only OpenSearch proxy routes (_ppl/_explain, _cluster/settings) that the PPL linter will consume. Feature is OFF and inert; no client wiring yet.

Signed-off-by: Hanyu Wei <weihanyu@amazon.com>

ps48 · 2026-06-22T18:51:25Z

+
+        return res.ok({
+          body: {
+            calciteEnabled: resolveValue('plugins.calcite.enabled') !== 'false',


p0: backward-compat issue with calciteEnabled default

resolveValue(...) !== 'false' returns true when the key is absent (undefined !== 'false'). On an older opensearch-sql cluster that predates plugins.calcite.enabled, the key is genuinely absent and the settings read succeeds (so we are on this success path, not the catch block), yet we report calciteEnabled: true for a cluster that has no Calcite engine at all.

The request already sends include_defaults=true, so any cluster that knows the setting surfaces it in the defaults bucket even when unset. Absence therefore reliably indicates that the cluster does not have Calcite, and the safe interpretation is the opposite of the current default. Since calciteEnabled gates the explain-based lint rules, the current behavior makes the editor fire _explain calls on every keystroke against clusters that cannot support them.

Suggested fix is to invert the success-path default:

calciteEnabled: resolveValue('plugins.calcite.enabled') === 'true',

New clusters still resolve correctly (the default is surfaced); old clusters correctly resolve to false. The catch-block fail-open (true) is a separate decision and can stay as-is. Worth confirming the intended old-cluster behavior with the SQL team either way.

Good catch. Confirmed that plugins.calcite.enabled is registered (with default true) only when the SQL plugin loads. I inverted the check to === 'true', left the catch-block fail-open (true) intact and added a comment: a 200 with the key absent is definitive "no Calcite," whereas an error can't distinguish "no plugin" from "transient failure." Side effect: calciteEnabled now matches the === 'true' idiom already used by allJoinTypesAllowed.
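For readers following the thread, the success-path resolution discussed above can be sketched roughly as follows. This is a minimal illustration, not the code in ppl_calcite_settings.ts: the `ClusterSettingsBody` shape, the flat-key layout, the transient/persistent/defaults lookup order, and the names `resolveSetting` and `readCalciteSettings` are all assumptions made for the example. Only the two setting keys and the `=== 'true'` comparison come from the discussion.

```typescript
// Assumed response shape for a cluster-settings read with include_defaults=true
// and flat keys. The real route may read a differently shaped body.
interface ClusterSettingsBody {
  transient?: Record<string, unknown>;
  persistent?: Record<string, unknown>;
  defaults?: Record<string, unknown>;
}

// Hypothetical helper: look a key up in transient, then persistent, then defaults.
// Absence is preserved as undefined; typed booleans are normalized to strings.
function resolveSetting(body: ClusterSettingsBody, key: string): string | undefined {
  const raw = body.transient?.[key] ?? body.persistent?.[key] ?? body.defaults?.[key];
  return raw === undefined || raw === null ? undefined : String(raw);
}

// An absent key compares unequal to 'true', so an older cluster that does not
// register plugins.calcite.enabled resolves to calciteEnabled: false.
function readCalciteSettings(body: ClusterSettingsBody) {
  return {
    calciteEnabled: resolveSetting(body, 'plugins.calcite.enabled') === 'true',
    allJoinTypesAllowed:
      resolveSetting(body, 'plugins.calcite.all_join_types.allowed') === 'true',
  };
}
```

With `=== 'true'`, both flags share one rule: only an explicit true value (set or surfaced as a default) turns the feature on, and a successful read with the key missing is treated as "no Calcite".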

ps48 · 2026-06-22T18:51:25Z

+      method: 'GET',
+      path: EXPECTED_PATH,
+    });
+    // Absent setting => calcite treated as enabled (not 'false'), join types not allowed.


p0: test locks in the absent-setting default

This test currently encodes the behavior flagged on ppl_calcite_settings.ts (// Absent setting => calcite treated as enabled, asserting calciteEnabled: true). If the success-path default is inverted, this should flip to expect false, and it would be worth renaming or adding a case explicitly framed as an older cluster without plugins.calcite.enabled resolving to calcite disabled, so the backward-compat intent is captured in the test name.

Done — flipped 'uses the core client …' to expect calciteEnabled: false and added an explicitly-named case ('reports calciteEnabled:false for a cluster missing plugins.calcite.enabled') so the backward-compat intent is self-documenting. The catch-path tests ('swallows transport errors', 'logs auth failures at warn') keep asserting the fail-open true, which documents the success-vs-error asymmetry side by side.

ps48 · 2026-06-22T18:51:25Z

+ * otherwise, or `null` when a dataSourceId is requested but the data source
+ * plugin is unavailable (the caller should respond 400 in that case).
+ */
+export async function resolveOpenSearchClient(


p1: add a direct multi-data-source test for resolveOpenSearchClient

This helper is the seam for the multi-data-source requirement in the parent RFC (sql#5405 section 2.9 plumbs dataSourceId end-to-end), but it is only tested indirectly through the two routes, and no test exercises two different dataSourceId values resolving to distinct clients. getClient is a single mock, so we currently prove that the id is forwarded, not that the right client comes back per id.

Suggested coverage in a dedicated unit test for this helper: ds-1 resolves to getClient('ds-1') and ds-2 resolves to getClient('ds-2') returning distinct clients, no id resolves to asCurrentUser, and an absent context.dataSource resolves to null. Cheap insurance given the rest of the lint feature builds on this.

Added a describe('resolveOpenSearchClient') to index.test.ts covering: two distinct dataSourceIds resolve to distinct clients (not just id-forwarding), no id resolves to asCurrentUser, and a dataSourceId with context.dataSource absent resolves to null.
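The three cases listed above can be illustrated with a small stand-in for the helper. This is a sketch under stated assumptions, not the helper from this PR: `OpenSearchClientLike`, `HandlerContextLike`, and `resolveClientSketch` are hypothetical names, and the context shape is reduced to the two fields the thread mentions (`asCurrentUser` and `context.dataSource` with a `getClient` lookup).

```typescript
// Hypothetical minimal shapes; the real request-handler context is richer.
interface OpenSearchClientLike {
  name: string;
}

interface HandlerContextLike {
  core: { opensearch: { client: { asCurrentUser: OpenSearchClientLike } } };
  // Absent when the data source plugin is unavailable.
  dataSource?: { opensearch: { getClient(id: string): Promise<OpenSearchClientLike> } };
}

// Sketch of the resolution order described in the thread:
//   no dataSourceId          -> the current user's core client
//   dataSourceId, no plugin  -> null (the caller should respond 400)
//   dataSourceId, plugin     -> the client for that specific id
async function resolveClientSketch(
  context: HandlerContextLike,
  dataSourceId?: string
): Promise<OpenSearchClientLike | null> {
  if (!dataSourceId) {
    return context.core.opensearch.client.asCurrentUser;
  }
  if (!context.dataSource) {
    return null;
  }
  return context.dataSource.opensearch.getClient(dataSourceId);
}
```

A test double whose `getClient` returns a different object per id is what distinguishes "the right client comes back" from "the id was forwarded", which is the gap the review comment points at.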

ps48 · 2026-06-22T18:51:25Z

+    {
+      path: API.PPL_EXPLAIN,
+      validate: {
+        body: schema.object({ query: schema.string({ minLength: 1 }) }),


p1: confirm request guarding for a per-keystroke route

This proxies arbitrary PPL to _explain with only minLength: 1 on query. It is read-only and auth is enforced by the downstream client, so the risk is low, but per the RFC this route is hit on every keystroke (debounced). Worth confirming the debounce and abort logic lives client-side, and whether a minimal server-side guard (max query length) is wanted. Non-blocking, flagging for the follow-up that wires the editor.

Confirmed — the keystroke throttling lives client-side and lands with the editor-wiring follow-up PR, not this backend slice. Specifically: a 500 ms trailing-edge debounce per model, plus an ExplainCache that dedups in-flight requests and LRU-caches results per (dataSourceId, query), so repeated lint passes over the same text issue at most one _explain call. There's no explicit AbortController on the explain request — the debounce + dedup make a superseded response a harmless cache write, so cancellation wasn't needed (easy to add later if we want it). On the server side, I added the maxLength: 65536 guard on the query schema here in 4ac79ae as the minimal request bound you flagged, independent of the global server.maxPayload.
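As a rough picture of the dedup-plus-LRU behavior described in this reply, here is a minimal sketch. It is not the ExplainCache from the follow-up PR: the class body, the `ExplainFetcher` type, the default capacity, and the choice to drop failed requests so they are retried are all assumptions made for illustration. The debounce is not shown.

```typescript
// Hypothetical fetcher signature: performs the actual _explain call.
type ExplainFetcher = (query: string, dataSourceId?: string) => Promise<unknown>;

class ExplainCache {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, Promise<unknown>>();
  private readonly fetcher: ExplainFetcher;
  private readonly capacity: number;

  constructor(fetcher: ExplainFetcher, capacity = 50) {
    this.fetcher = fetcher;
    this.capacity = capacity;
  }

  get(query: string, dataSourceId?: string): Promise<unknown> {
    // Key on the (dataSourceId, query) pair.
    const key = JSON.stringify([dataSourceId ?? null, query]);
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      // Hit (in flight or settled): re-insert to mark as most recently used.
      this.entries.delete(key);
      this.entries.set(key, cached);
      return cached;
    }
    // Miss: store the promise itself, so concurrent callers share one request.
    const pending = this.fetcher(query, dataSourceId);
    this.entries.set(key, pending);
    // Assumption: failures are not cached, so the next lint pass retries.
    pending.catch(() => {
      if (this.entries.get(key) === pending) this.entries.delete(key);
    });
    // Evict the least recently used entry once over capacity.
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return pending;
  }
}
```

Storing the promise rather than the resolved plan lets one map cover both in-flight dedup and result caching, which is why a superseded response ends up as a harmless cache write.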

github-actions · 2026-06-22T19:22:55Z

✅ All unit and integration tests passing

🔗 Workflow run · commit 031498a11bb4f1aaf9f2453da37371604ce4a572

joshuali925 · 2026-06-22T19:23:13Z

  }),
+  // PPL linter feature flag, read at runtime via DynamicConfigService and
+  // surfaced as the queryEnhancements.pplLint capability. Disabled by default.
+  pplLint: schema.object({


can this be ppl.lint.enabled, or lintEnabled: ["ppl"]? i'm thinking if we add more languages in the future, can the config path be more structured?

Good call — restructured to queryEnhancements.ppl.lint.enabled in 031498a (config schema now nests ppl: { lint: { enabled } }) so that future languages/features extend the same shape.

joshuali925 · 2026-06-22T19:28:35Z

+
+        return res.ok({
+          body: {
+            calciteEnabled: resolveValue('plugins.calcite.enabled') !== 'false',


Respond to ps48's review on opensearch-project#12255:

- calciteEnabled: invert `!== 'false'` to `=== 'true'` so a successful cluster-settings read with `plugins.calcite.enabled` absent reports the cluster as having no Calcite engine (disabled), instead of defaulting it on. include_defaults=true surfaces the key on any Calcite-capable cluster, so its absence is definitive. Document the deliberate asymmetry with the catch block, which keeps failing open (an error can't tell "no plugin" from a transient failure).
- Flip the matching test assertion and add an explicitly-named case for a cluster missing the key (no/old SQL plugin) to pin the backward-compat contract.
- Add a direct unit test block for resolveOpenSearchClient: distinct dataSourceIds resolve to distinct clients, no id resolves to asCurrentUser, and a dataSourceId with the data source plugin unavailable resolves to null.
- explain route: add `maxLength: 65536` to the query schema. The body was already bounded by server.maxPayload (1 MiB); this makes the cap explicit and independent of global config.

Signed-off-by: Hanyu Wei <weihanyu@amazon.com>

Address joshuali925's review on opensearch-project#12255: restructure the config path from queryEnhancements.pplLint.enabled to queryEnhancements.ppl.lint.enabled so future languages/features (ppl.autocomplete, sql.lint) extend the same shape. The wire capability stays a flat boolean queryEnhancements.pplLint, so capability consumers are unaffected; only the DynamicConfigService read in the switcher changes (config.ppl?.lint?.enabled === true).

Signed-off-by: Hanyu Wei <weihanyu@amazon.com>
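The mapping from the nested config path to the flat capability can be sketched as below. This is an illustration only: `QueryEnhancementsConfigLike` and `pplLintCapability` are hypothetical names, and the real switcher reads the config through DynamicConfigService rather than taking it as an argument. The `config.ppl?.lint?.enabled === true` check is the one quoted in the commit message.

```typescript
// Hypothetical view of the restructured plugin config: ppl: { lint: { enabled } }.
interface QueryEnhancementsConfigLike {
  ppl?: { lint?: { enabled?: boolean } };
}

// Returns the capability fragment a switcher would merge. The wire shape stays
// a flat boolean (queryEnhancements.pplLint), whatever the config nesting is.
function pplLintCapability(config: QueryEnhancementsConfigLike | undefined) {
  return {
    queryEnhancements: {
      // Strict comparison: a missing or partial config resolves to false.
      pplLint: config?.ppl?.lint?.enabled === true,
    },
  };
}
```

Keeping the capability flat while nesting the config means a later `sql.lint` or `ppl.autocomplete` setting would only add a sibling branch in the config, with no change for existing capability consumers.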

ps48

Thanks for the fixes

github-actions · 2026-06-23T22:23:18Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 10285e6.

| Path | Line | Severity | Description |
| --- | --- | --- | --- |
| src/plugins/query_enhancements/server/routes/ppl_calcite_settings.ts | 68 | medium | Deliberate fail-open on authentication failures (401/403): when the cluster-settings request is rejected for auth reasons, the route returns HTTP 200 with calciteEnabled:true rather than propagating the auth error. This means a permission failure silently enables the Calcite code path instead of blocking it, potentially allowing lint features to activate in environments where the operator explicitly denied access. |
| src/plugins/query_enhancements/server/routes/ppl_explain.ts | 44 | low | The route accepts an arbitrary PPL query string (up to 64 KB) from the request body and forwards it verbatim to the internal OpenSearch /_plugins/_ppl/_explain endpoint. While protected by the user's own credentials via resolveOpenSearchClient, this proxy pattern could be used to probe internal OpenSearch cluster topology or exercise OpenSearch parser edge cases. No sanitization beyond length bounds is applied. |

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 0 | Medium: 1 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-06-23T22:24:12Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Fail-open default may mislead clients

On any transport error (including network/transient failures and 401/403), the route returns `{ calciteEnabled: true, allJoinTypesAllowed: false }`. This conflates "no Calcite plugin / cluster down / unauthorized" with "Calcite enabled," which can cause clients (lint rules) to apply Calcite-specific behavior on a cluster that does not support it. Consider returning a distinct error/unknown indicator (or at least differentiating 401/403 from other failures) so the client can decide rather than silently assuming enabled.

```
} catch (err) {
  const status = (err as { statusCode?: number; meta?: { statusCode?: number } })?.statusCode;
  const metaStatus = (err as { meta?: { statusCode?: number } })?.meta?.statusCode;
  const message = err instanceof Error ? err.message : String(err);
  // Fail open: a missing/failed cluster-settings read must not block the
  // editor. Calcite is assumed enabled (the engine default) so lint rules
  // still run. Surface auth/permission failures at warn so an operator can
  // see them; everything else stays at debug.
  if (status === 401 || status === 403 || metaStatus === 401 || metaStatus === 403) {
    logger.warn(`PPL calcite settings unauthorized (${status ?? metaStatus}): ${message}`);
  } else {
    logger.debug(`PPL calcite settings error: ${message}`);
  }
  return res.ok({ body: { calciteEnabled: true, allJoinTypesAllowed: false } });
}
```

github-actions · 2026-06-23T22:24:43Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Security
Impact: Low

Avoid leaking backend error details to clients

Forwarding the raw error `message` directly to the client may leak internal/backend details (stack traces, cluster paths, indices). Consider returning a generic message to the client while logging the detailed message server-side, especially for 5xx errors.

src/plugins/query_enhancements/server/routes/ppl_explain.ts [56-61]

```
 const message = e.message ?? 'Failed to explain PPL query';
+const statusCode = coerceStatusCode(e.status ?? e.statusCode ?? e.meta?.statusCode);
 logger.debug(`PPL explain error: ${message}`);
 return res.custom({
-  statusCode: coerceStatusCode(e.status ?? e.statusCode ?? e.meta?.statusCode),
-  body: message,
+  statusCode,
+  body: statusCode >= 500 ? 'Failed to explain PPL query' : message,
 });
```

Suggestion importance [1-10]: 5

Why: Reasonable security-hygiene suggestion to avoid leaking backend error details on 5xx responses, though the existing `definePPLBundleRoute` follows the same pattern, so impact is moderate.

Category: General
Impact: Low

Make boolean string comparison case-insensitive

The `String(raw)` normalization is applied for typed booleans, but comparing against the lowercase string `'true'` will not match `String(true)` results correctly only if values arrive capitalized. More importantly, for the `false` boolean case, `String(false)` produces `'false'` which correctly fails the `=== 'true'` check, but consider also accepting case-insensitive matches to be robust against different transport serializations.

src/plugins/query_enhancements/server/routes/ppl_calcite_settings.ts [58-59]

```
-calciteEnabled: resolveValue('plugins.calcite.enabled') === 'true',
-allJoinTypesAllowed: resolveValue('plugins.calcite.all_join_types.allowed') === 'true',
+calciteEnabled: resolveValue('plugins.calcite.enabled')?.toLowerCase() === 'true',
+allJoinTypesAllowed: resolveValue('plugins.calcite.all_join_types.allowed')?.toLowerCase() === 'true',
```

Suggestion importance [1-10]: 2

Why: OpenSearch cluster settings return lowercase `'true'`/`'false'` strings, and `String(true)` also produces lowercase `'true'`, so case-insensitive matching is unnecessary. Marginal defensive improvement.

Hanyu-W requested review from FriedhelmWS and yubonluo as code owners June 18, 2026 23:10

github-actions Bot added the first-time-contributor label Jun 18, 2026

Swiddis approved these changes Jun 18, 2026

ps48 reviewed Jun 22, 2026

joshuali925 reviewed Jun 22, 2026

Hanyu Wei added 2 commits June 22, 2026 12:36

Hanyu-W requested review from joshuali925 and ps48 June 23, 2026 17:27

ps48 approved these changes Jun 23, 2026

mengweieric approved these changes Jun 23, 2026

Merge branch 'main' into ppl-lint-backend

10285e6


5 participants
