DX-118395: Make SearchTableAndViews resilient to broken catalog entries#103
Merged
Conversation
A single unresolvable view in the search results (e.g. a view with a disconnected source) caused the catalog-by-path lookup to return HTTP 400, which propagated through asyncio.gather and failed the entire SearchTableAndViews tool call. get_schemas now isolates per-path failures and returns the error message alongside each schema. SearchTableAndViews surfaces the affected paths in a `skipped` field plus a human-readable `skipped_note`, so the user sees the healthy results and a clear explanation for the ones that were dropped. Also adds CLAUDE.md with project context and development guidelines to help Claude Code sessions work more effectively in this repo. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
maxlepikhin
reviewed
Apr 22, 2026
maxlepikhin
previously approved these changes
Apr 22, 2026
aniket-s-kulkarni
left a comment
Contributor
There was a problem hiding this comment.
🟡 Verdict: Must-fix design gaps — 3 must fix, 1 should fix
Must fix:
get_schemasreturn type should be a PydanticBaseModel(e.g.SchemaResult) not a raw tuple —src/dremioai/api/dremio/catalog.pyget_descriptionsnow silently swallows catalog errors, changing its original fail-fast contract —src/dremioai/api/dremio/catalog.py- Missing server-side logging when a search hit's schema fetch fails —
src/dremioai/api/dremio/search.py
Should fix:
get_lineagestill callsget_schema(singular) and accessesresponse["id"]directly — coordinate now to avoid a future silent break ifget_schemais later changed to returnSchemaResult—src/dremioai/api/dremio/catalog.py:119
AI-assisted analysis
- Replace the untyped `Tuple[Dict, Optional[str]]` return of `get_schemas` with a `SchemaResult` Pydantic model (`data`, `error`). Gives callers named fields and a shape that can be extended without another signature change. - Fix `get_descriptions`: the original silent-drop-on-success was the actual root cause of DX-118395 (a broken VDS whose parent PDS was deleted tanked the entire call via asyncio.gather). It now tolerates per-path failures, logs each via structlog, and continues with the paths that resolved. - Log skipped paths in `get_search_results` via structlog (`search_schema_fetch_failed`) before recording them in `skipped`, so operators see the failures in logs during an incident. - Clean up imports: move `logger = log.logger(__name__)` below all `from ... import` lines in both `catalog.py` and `search.py`, and drop the now-unused `arbitrary_types_allowed` from `SchemaResult`. - Update tests to the new `SchemaResult` attribute access. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
Author
|
@aniket-s-kulkarni I have incorporated all of your review responses. Can you please review again? |
aniket-s-kulkarni
previously approved these changes
Apr 24, 2026
…r and DataFrame workflow
aniket-s-kulkarni
approved these changes
Apr 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
A single unresolvable view in the search results (e.g. a view with a disconnected source) caused the catalog-by-path lookup to return HTTP 400, which propagated through asyncio.gather and failed the entire
SearchTableAndViewstool call.In query engine catalogs, it's a common behavior not to delete VDS when dependent PDS is deleted. However, when the schema of such VDS is queried, catalog throws an error. In this fix, we are choosing to be defensive when catalog does not return schema of certain dataset.
get_schemasnow isolates per-path failures and returns the error message alongside each schema.SearchTableAndViewssurfaces the affected paths in askippedfield plus a human-readableskipped_note, so the user sees the healthy results and a clear explanation for the ones that were dropped.Also adds
CLAUDE.mdwith project context and development guidelines to help Claude Code sessions work more effectively in this repo.