
[SPARK-52729][SQL] Add MetadataOnlyTable and CREATE/ALTER VIEW support for DS v2 catalogs#51419

Open
cloud-fan wants to merge 40 commits into apache:master from cloud-fan:v1-v2

Conversation


@cloud-fan cloud-fan commented Jul 9, 2025

What changes were proposed in this pull request?

This PR exposes a DS v2 API for metadata-only tables (read side), CREATE VIEW, and ALTER VIEW ... AS (write side) so that third-party v2 catalogs can participate in Spark's resolution and creation flows without reimplementing read/write themselves.

1. Read path — MetadataOnlyTable:

  • New Table implementation that carries a TableInfo and delegates everything to it. Catalogs return it from loadTable to signal "Spark, interpret this via V1 paths" — data-source reads for file-source tables, view-text expansion for views (the latter via the perf opt-in described in section 4).
  • Analyzer.lookupTableOrView and RelationResolution.createRelation detect MetadataOnlyTable and route through a new V1Table.toCatalogTable adapter to the existing V1 data-source / view machinery.
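
The read-path routing above can be sketched as follows. The types and the `route` helper are simplified stand-ins invented for illustration, not the real Spark classes; the point is only the discrimination step: a `MetadataOnlyTable` is handed to the v1 machinery via its metadata, while any other `Table` stays on the normal v2 path.

```java
import java.util.Map;

public class ReadPathSketch {
    // Stand-in: a metadata-only table carries only a TableInfo, no read/write impl.
    record TableInfo(Map<String, String> properties) {}

    interface Table {}
    record MetadataOnlyTable(TableInfo info) implements Table {}
    record ConnectorTable(String name) implements Table {}

    // Sketch of the resolver decision: MetadataOnlyTable is routed to the
    // existing v1 data-source machinery via its provider; anything else
    // stays on the normal v2 scan path.
    static String route(Table table) {
        if (table instanceof MetadataOnlyTable m) {
            String provider = m.info().properties().getOrDefault("provider", "unknown");
            return "v1-data-source:" + provider;
        }
        return "v2-scan";
    }
}
```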

2. Shared DTO — TableInfo:

  • TableInfo.Builder gains convenience setters that write reserved keys into properties: withProvider, withLocation, withComment, withCollation, withOwner, withTableType, plus withSchema(StructType). The read side (MetadataOnlyTable) and the write side (createTable(ident, TableInfo)) use the same struct. View-specific fields live on a typed subclass (see section 3) so they are not encoded as string properties.
  • withProperties takes a defensive copy so convenience setters don't mutate the caller's map.
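
A minimal sketch of the builder semantics described above (the class below is a mock-up mirroring the PR's naming, not the real `TableInfo.Builder`): convenience setters write reserved keys into the properties bag, and `withProperties` copies the caller's map so later setters never mutate it.

```java
import java.util.HashMap;
import java.util.Map;

public class TableInfoBuilder {
    private final Map<String, String> props = new HashMap<>();

    public TableInfoBuilder withProperties(Map<String, String> p) {
        props.clear();
        props.putAll(p); // defensive copy: the caller's map is never touched again
        return this;
    }
    public TableInfoBuilder withProvider(String provider) {
        props.put("provider", provider); // reserved key written into properties
        return this;
    }
    public TableInfoBuilder withLocation(String location) {
        props.put("location", location); // reserved key written into properties
        return this;
    }
    public Map<String, String> build() {
        return Map.copyOf(props);
    }
}
```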

3. Typed view DTO — ViewInfo:

  • ViewInfo extends TableInfo and carries the view-specific fields that cannot be represented as string table properties: queryText, currentCatalog, currentNamespace (multi-part, never null; empty when no namespace was captured), sqlConfigs (unprefixed SQL config keys), schemaMode (BINDING / COMPENSATION / TYPE EVOLUTION / EVOLUTION), and queryColumnNames (mapping query output to the view's declared columns; empty in EVOLUTION mode).
  • ViewInfo.Builder extends TableInfo.BaseBuilder<Builder> and adds typed setters: withQueryText, withCurrentCatalog, withCurrentNamespace, withSqlConfigs, withSchemaMode, withQueryColumnNames. The inherited TableInfo.BaseBuilder setters (schema, properties, owner, comment, collation, etc.) are available on the same builder so view and table writes share one fluent API.
  • The ViewInfo constructor stamps PROP_TABLE_TYPE = TableSummary.VIEW_TABLE_TYPE into properties() so catalogs and generic viewers reading PROP_TABLE_TYPE from the properties bag (e.g. TableCatalog.listTableSummaries default impl, DESCRIBE) classify the entry as VIEW without requiring authors to remember withTableType(VIEW).
  • ViewInfo is the typed payload returned by ViewCatalog.loadView and accepted by createView / replaceView. It still extends TableInfo so a mixed catalog can opt into the perf path described in section 4 (returning MetadataOnlyTable(ViewInfo) from loadTable); pure view-only catalogs never see TableInfo directly because the typed builder covers everything they construct.
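
The constructor-stamping behavior can be sketched like this. The nested classes and the `"table-type"` key below are stand-ins I invented for illustration (the real code uses `TableCatalog.PROP_TABLE_TYPE` and `TableSummary.VIEW_TABLE_TYPE`); the shape to note is that the subclass constructor writes the VIEW type into the shared properties bag unconditionally.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ViewInfoSketch {
    static final String PROP_TABLE_TYPE = "table-type"; // stand-in key
    static final String VIEW_TABLE_TYPE = "VIEW";       // stand-in value

    static class TableInfo {
        final Map<String, String> properties;
        TableInfo(Map<String, String> properties) { this.properties = properties; }
    }

    static class ViewInfo extends TableInfo {
        final String queryText;
        final String currentCatalog;
        final List<String> currentNamespace; // multi-part; empty when none captured

        ViewInfo(String queryText, String currentCatalog,
                 List<String> currentNamespace, Map<String, String> props) {
            super(stamp(props)); // constructor stamps the VIEW table type
            this.queryText = queryText;
            this.currentCatalog = currentCatalog;
            this.currentNamespace = currentNamespace;
        }
        private static Map<String, String> stamp(Map<String, String> props) {
            Map<String, String> copy = new HashMap<>(props);
            copy.put(PROP_TABLE_TYPE, VIEW_TABLE_TYPE);
            return copy;
        }
    }
}
```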

4. View support — ViewCatalog interface:

  • A new ViewCatalog interface is the plugin-facing API for views. It is independent from TableCatalog: a connector implements just ViewCatalog (view-only catalog), just TableCatalog (table-only catalog), or both (mixed catalog like Hive / Iceberg / Unity Catalog). There is no capability flag — interface presence is the signal.
  • API: listViews(namespace), loadView(ident) returning ViewInfo, createView(ident, ViewInfo), replaceView(ident, ViewInfo), dropView(ident), default viewExists(ident), default invalidateView(ident). No staging variant — replaceView is a single atomic-swap call.
  • Mixed-catalog rule (documented on the class): tables and views share one identifier namespace. Each interface's methods behave as if the other kind didn't exist; the only cross-cutting invariant is that createTable rejects view-collisions and createView rejects table-collisions (one extra existence check the catalog already needs internally).
  • Mixed-catalog perf opt-in: loadTable may return a MetadataOnlyTable wrapping a ViewInfo for a view identifier; Spark's resolver discriminates by instanceof ViewInfo and routes through view resolution without a follow-up loadView RPC. If the catalog instead throws NoSuchTableException for a view identifier, Spark falls back to loadView (one extra RPC on cold cache).
  • Analyzer.lookupTableOrView (and RelationResolution.tryResolvePersistent): tries loadTable first only when the catalog is a TableCatalog (or the session catalog) — otherwise the underlying asTableCatalog cast would throw MISSING_CATALOG_ABILITY.TABLES for a pure ViewCatalog and mask the legitimate loadView fallback. On NoSuchTableException (or when loadTable is skipped), if the catalog is a ViewCatalog, calls loadView and synthesizes a ResolvedPersistentView from the resulting ViewInfo via V1Table.toCatalogTable(catalog, ident, viewInfo).
  • Catalogs that are not ViewCatalog get MISSING_CATALOG_ABILITY.VIEWS from the resolver gate (for UnresolvedView) and from CheckViewReferences (for CREATE / ALTER VIEW), matching the previous capability-flag rejection.
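
The lookup order above can be sketched with stand-in interfaces (not the real Spark `CatalogPlugin` hierarchy; the string return values and the demo fixture are invented for illustration): `loadTable` is attempted only when the catalog is a `TableCatalog`, and on a miss (or when the attempt is skipped for a pure view catalog) the resolver falls back to `loadView`.

```java
import java.util.Optional;

public class LookupSketch {
    static class NoSuchTableException extends RuntimeException {
        NoSuchTableException(String ident) { super(ident); }
    }
    interface CatalogPlugin {}
    interface TableCatalog extends CatalogPlugin { String loadTable(String ident); }
    interface ViewCatalog extends CatalogPlugin { String loadView(String ident); }

    static String resolve(CatalogPlugin catalog, String ident) {
        Optional<String> fromTable = Optional.empty();
        if (catalog instanceof TableCatalog tc) { // pure ViewCatalogs skip loadTable
            try {
                fromTable = Optional.of(tc.loadTable(ident));
            } catch (NoSuchTableException e) {
                // fall through to the loadView fallback below
            }
        }
        if (fromTable.isPresent()) return fromTable.get();
        if (catalog instanceof ViewCatalog vc) return vc.loadView(ident);
        throw new RuntimeException("not found: " + ident);
    }

    // Tiny mixed-catalog fixture: one table "t"; any other name resolves as a view.
    static class DemoMixedCatalog implements TableCatalog, ViewCatalog {
        public String loadTable(String ident) {
            if (ident.equals("t")) return "table:t";
            throw new NoSuchTableException(ident);
        }
        public String loadView(String ident) { return "view:" + ident; }
    }
}
```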

5. Write path — DS v2 CREATE VIEW:

  • DataSourceV2Strategy routes CreateView(ResolvedIdentifier(catalog, ident), …) to CreateV2ViewExec(catalog: ViewCatalog, …), which dispatches: createView for plain CREATE / IF NOT EXISTS; replaceView for CREATE OR REPLACE on an existing view (with a NoSuchViewException → createView fallback for the race where the view disappears between probe and replace); createView for CREATE OR REPLACE on a non-existent view. Cross-type collision (CREATE VIEW over a non-view table in a mixed catalog) is rejected up front with EXPECT_VIEW_NOT_TABLE.NO_ALTERNATIVE.
  • The exec builds the ViewInfo via a V2ViewPreparation trait reusing v1 ViewHelper helpers (aliasPlan, sqlConfigsToProps) to populate a ViewInfo.Builder with the current session's captured catalog/namespace and SQL configs. Cyclic-reference detection and auto-generated-alias rejection run once at analysis time in CheckViewReferences (see section 7).
  • CreateView logical plan extends AnalysisOnlyCommand (same shape as V2CreateTableAsSelectPlan) so HandleSpecialCommand.markAsAnalyzed captures referredTempFunctions from AnalysisContext. The v1 rewriting path (ResolveSessionCatalogCreateViewCommand) is unchanged.
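
The dispatch in `CreateV2ViewExec` can be sketched as follows. The interface and fixture below are stand-ins invented for illustration (existence errors for plain CREATE on an existing entry are elided to keep the sketch short); the interesting part is the OR REPLACE probe plus the `NoSuchViewException` → `createView` fallback for the race where the view disappears between probe and replace.

```java
import java.util.HashSet;
import java.util.Set;

public class CreateViewDispatch {
    static class NoSuchViewException extends RuntimeException {
        NoSuchViewException(String ident) { super(ident); }
    }
    interface Views {
        boolean viewExists(String ident);
        String createView(String ident);
        String replaceView(String ident); // throws NoSuchViewException if gone
    }

    static String run(Views catalog, String ident, boolean replace, boolean ifNotExists) {
        if (replace) {
            if (catalog.viewExists(ident)) {
                try {
                    return catalog.replaceView(ident);
                } catch (NoSuchViewException e) {
                    return catalog.createView(ident); // race: view vanished after probe
                }
            }
            return catalog.createView(ident); // OR REPLACE on a non-existent view
        }
        if (ifNotExists && catalog.viewExists(ident)) {
            return "no-op"; // CREATE VIEW IF NOT EXISTS on an existing view
        }
        return catalog.createView(ident);
    }

    // Tiny in-memory fixture for the sketch.
    static class InMemoryViews implements Views {
        private final Set<String> views = new HashSet<>();
        InMemoryViews(Set<String> initial) { views.addAll(initial); }
        public boolean viewExists(String ident) { return views.contains(ident); }
        public String createView(String ident) { views.add(ident); return "created:" + ident; }
        public String replaceView(String ident) {
            if (!views.contains(ident)) throw new NoSuchViewException(ident);
            return "replaced:" + ident;
        }
    }
}
```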

6. Write path — DS v2 ALTER VIEW ... AS:

  • AlterViewAs logical plan also extends AnalysisOnlyCommand so referredTempFunctions is captured for the non-session path.
  • DataSourceV2Strategy routes AlterViewAs(ResolvedPersistentView(catalog, ident, _), …) to AlterV2ViewExec(catalog: ViewCatalog, …), which calls replaceView (the single atomic-swap entry point — no separate staging variant, since view REPLACE writes only metadata).
  • A V2AlterViewPreparation trait (extends V2ViewPreparation) calls catalog.loadView(ident) once and uses the result to preserve user TBLPROPERTIES, comment, collation, owner, and schema-binding mode when constructing the replacement ViewInfo. Session-scoped fields (SQL configs, query column names) are re-emitted by buildViewInfo() from the active SparkSession, matching v1 AlterViewAsCommand.alterPermanentView. A racing DDL between analysis and exec (the view dropped, or replaced with a non-view table in a mixed catalog) surfaces NoSuchViewException / EXPECT_VIEW_NOT_TABLE rather than a stale-resolution error.
  • ResolvedViewIdentifier.unapply (in ResolveSessionCatalog) replaces its assert(isSessionCatalog) with an if isSessionCatalog guard so non-session ResolvedPersistentView plans fall through to the v2 strategy instead of tripping the assertion.
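
The preserve-vs-re-emit split in `V2AlterViewPreparation` can be sketched with a stand-in record (not the real `ViewInfo`; the field names are simplified): user properties (including owner) and the schema-binding mode carry over from the loaded view, while session-scoped SQL configs are re-captured from the active session.

```java
import java.util.Map;

public class AlterViewAsPrep {
    record ViewState(
        String queryText,
        Map<String, String> properties, // user TBLPROPERTIES, owner, comment...
        String schemaMode,              // e.g. BINDING / EVOLUTION
        Map<String, String> sqlConfigs  // session-scoped, captured at write time
    ) {}

    static ViewState replacement(ViewState existing, String newQuery,
                                 Map<String, String> sessionConfigs) {
        return new ViewState(
            newQuery,               // the only user-requested change
            existing.properties(),  // preserved from the loaded view
            existing.schemaMode(),  // preserved from the loaded view
            sessionConfigs);        // re-emitted from the active session
    }
}
```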

7. Post-analysis check — CheckViewReferences:

  • New rule wired into BaseSessionStateBuilder.extendedCheckRules. Rejects permanent views that reference temporary objects and rejects view bodies with auto-generated aliases for both CreateView and AlterViewAs (v2 paths). v1 CreateViewCommand / AlterViewAsCommand keep their existing exec-time safety net — Dataset-built commands can be constructed with isAnalyzed=true directly and bypass the analyzer's re-capture path.
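
The check can be sketched at its simplest: scan the view body for temporary references and report them. The reference shapes below are stand-ins invented for illustration, not the real analyzer plan nodes.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckViewReferencesSketch {
    interface Ref {}
    record TempViewRef(String name) implements Ref {}
    record TempFunctionRef(String name) implements Ref {}
    record PersistentRef(String name) implements Ref {}

    // Returns the offending temporary references; empty means the body is
    // safe to persist as a permanent view.
    static List<String> violations(List<Ref> body) {
        List<String> out = new ArrayList<>();
        for (Ref r : body) {
            if (r instanceof TempViewRef v) out.add("temp view " + v.name());
            else if (r instanceof TempFunctionRef f) out.add("temp function " + f.name());
        }
        return out;
    }
}
```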

8. Listing — SHOW TABLES / SHOW VIEWS:

  • TableCatalog.listTables returns table identifiers only — views (if the catalog also implements ViewCatalog) are listed separately via ViewCatalog.listViews. listTableSummaries's default impl enumerates via listTables + loadTable and returns one summary per table. This is an intentional v2 divergence from v1 SHOW TABLES, which includes both tables and views; restoring the v1-parity output for SHOW TABLES on a v2 catalog (i.e. routing it through both listTables and listViews) is left as a follow-up so this PR's API surface stays narrowly scoped.
  • SHOW VIEWS on a non-session ViewCatalog is routed through a new ShowViewsExec that enumerates via ViewCatalog.listViews(namespace). ResolveSessionCatalog.ShowViews skips (via guard) for ViewCatalog catalogs so they fall through to this strategy; non-session, non-ViewCatalog catalogs still hit the existing MISSING_CATALOG_ABILITY.VIEWS rejection. v2 catalogs have no temp views, so the isTemporary column is always false (mirroring v1, which only sets it true for local/global temp views).

Why are the changes needed?

A v2 Table is not always backed by a connector that implements read/write. Catalogs like HMS and Unity Catalog store only metadata and rely on Spark to interpret the table provider as a data source or to execute the view SQL. Previously the only way to achieve that was a hack around V1Table, which leaks private v1 types into v2 connectors (example: https://github.com/unitycatalog/unitycatalog/blob/main/connectors/spark/src/main/scala/io/unitycatalog/spark/UCSingleCatalog.scala).

Separately, v2 catalogs had no public way to handle CREATE VIEW or ALTER VIEW. ResolveSessionCatalog rejected CREATE VIEW on any non-session catalog with MISSING_CATALOG_ABILITY.VIEWS, so third-party catalogs could not own view lifecycle at all. The new ViewCatalog interface gives catalogs a clean view-shaped API (listViews / loadView / createView / replaceView / dropView) that is independent of TableCatalog: a view-only catalog implements just ViewCatalog (no TableCatalog boilerplate), a mixed catalog implements both, and the cross-type collision invariant is one extra existence check at createTable / createView time.

Does this PR introduce any user-facing change?

Yes to connector developers:

  • Third-party v2 TableCatalog implementations can now return a MetadataOnlyTable from loadTable to delegate reads to Spark.
  • Third-party v2 connectors can implement the new ViewCatalog interface to handle CREATE VIEW / CREATE OR REPLACE VIEW / CREATE VIEW IF NOT EXISTS / ALTER VIEW … AS / DROP VIEW / SHOW VIEWS, with view text, schema, captured current catalog+namespace, SQL configs, and temp-object-reference rejection handled the same way as for session-catalog views. View-only catalogs implement just ViewCatalog; mixed catalogs implement both TableCatalog and ViewCatalog.

No SQL-level or user-visible behavior change for existing deployments.

Remaining work (follow-up PRs)

This PR covers the core read path, CREATE VIEW (all shapes), ALTER VIEW ... AS, DROP VIEW, and SHOW VIEWS. The following view-scoped plans for DS v2 catalogs are not yet supported and are tracked for follow-ups. Until the follow-ups land, each currently surfaces a clean UNSUPPORTED_FEATURE.TABLE_OPERATION error (wired up in DataSourceV2Strategy and pinned by tests in DataSourceV2MetadataOnlyViewSuite), so users get a meaningful message rather than a generic planner failure:

  • ALTER VIEW ... SET/UNSET TBLPROPERTIES — separate logical plans (SetViewProperties, UnsetViewProperties); need their own DataSourceV2Strategy cases backed by new TableChange routing.
  • ALTER VIEW ... RENAME TO — RenameTable at the logical level; needs v2 view awareness and the catalog-side rename semantics.
  • ALTER VIEW ... WITH SCHEMA BINDING — AlterViewSchemaBinding logical plan; needs the same treatment as ALTER VIEW AS (AnalysisOnlyCommand shape + v2 exec).
  • DESCRIBE / SHOW CREATE TABLE / SHOW TBLPROPERTIES / SHOW COLUMNS on v2 views — currently route through ResolvedViewIdentifier which only matches session-catalog views; the v2 equivalents need dedicated handling.
  • SHOW TABLES v1-parity output (include views) on a v2 catalog — TableCatalog.listTables now intentionally returns tables only; restoring v1-parity in the SQL layer (route SHOW TABLES through both listTables and listViews) is a separate piece of work.

How was this patch tested?

New DataSourceV2MetadataOnlyTableSuite covering:

  • Reads: Metadata-only file-source reads (SELECT / INSERT / INSERT OVERWRITE, partitioned and non-partitioned); views with fully-qualified references + view SQL configs; views with unqualified references that expand via the stored current catalog+namespace.
  • CREATE VIEW (plain TableCatalog): end-to-end CREATE, CREATE VIEW IF NOT EXISTS (no-op on existing), CREATE on existing (failure), CREATE OR REPLACE VIEW (replacement); user-specified columns (too-few / too-many); DEFAULT COLLATION propagation into ViewInfo.
  • Pure-ViewCatalog end-to-end (no TableCatalog mixin): dedicated TestingViewOnlyCatalog fixture exercises view-text expansion on read + ALTER VIEW … AS against a pure ViewCatalog, ensuring the resolver's loadView fallback fires correctly when loadTable is skipped.
  • CREATE VIEW over a non-view table (v1-parity): rejects CREATE OR REPLACE VIEW against a non-view table entry with EXPECT_VIEW_NOT_TABLE.NO_ALTERNATIVE; rejects plain CREATE VIEW against a non-view table entry with TABLE_OR_VIEW_ALREADY_EXISTS; CREATE VIEW IF NOT EXISTS over a table is a no-op (matches v1 SQLViewSuite "existing a table with the duplicate name when CREATE VIEW IF NOT EXISTS").
  • CREATE / ALTER / DROP / SHOW VIEW on a catalog that doesn't implement ViewCatalog: a dedicated TestingTableOnlyCatalog exercises the rejection on each path (expected MISSING_CATALOG_ABILITY.VIEWS).
  • CheckViewReferences: permanent view referencing a temp function rejected; permanent view referencing a temp view rejected.
  • ALTER VIEW ... AS: end-to-end body replacement; rejects temp-function reference; preserves user-set TBLPROPERTIES; preserves PROP_OWNER and SCHEMA EVOLUTION binding mode across the ALTER (v1-parity); missing view surfaces as AnalysisException.
  • Multi-part captured namespace round-trip: unit-level test that ViewInfo.Builder.withCurrentCatalog(cat).withCurrentNamespace([db1, db2]) -> V1Table.toCatalogTable -> CatalogTable.viewCatalogAndNamespace preserves the full multi-part form, including namespace parts containing dots (which flow through structurally, not via any string encoding). A companion test pins the absent-branch: a ViewInfo built without withCurrentCatalog yields an empty viewCatalogAndNamespace.
  • Unsupported v2 view DDL / inspection pinning: SET TBLPROPERTIES, UNSET TBLPROPERTIES, WITH SCHEMA, RENAME TO, SHOW CREATE TABLE, SHOW TBLPROPERTIES, SHOW COLUMNS, DESCRIBE TABLE against a v2 view all surface UNSUPPORTED_FEATURE.TABLE_OPERATION; DESCRIBE TABLE ... COLUMN surfaces a clean AnalysisException. Pins the current failure mode so a future regression to a generic planner error is caught in the diff.
  • SHOW TABLES / SHOW VIEWS on a v2 catalog: SHOW TABLES returns tables only; SHOW VIEWS returns views only (isTemporary=false throughout for v2 catalogs); SHOW VIEWS ... LIKE filters on the view name; SHOW VIEWS against a non-ViewCatalog is rejected with MISSING_CATALOG_ABILITY.VIEWS.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude (Anthropic)

Contributor Author

I think the current view implementation which stores the original SQL text and a bunch of context is too convoluted to put into the public DS v2 API. It's better if the view text is context-independent. We are going to improve it in #51410

Before the improvement is done, we only allow reading DS v2 views that have context-independent SQL text.

@aokolnychyi aokolnychyi Jul 15, 2025

Don't we store the current catalog and namespace in the view metadata? Do we expect the connectors to modify the view SQL text?

Contributor Author

My proposal is to let Spark modify the view text before saving it into the catalog, so that the catalog does not need to store the current catalog/namespace.

Contributor

So all identifiers in the view text will always include the catalog name as the first name part and the table name as the last name part? How hard will it be to modify the original SQL text? Will it cause any surprises to the users if the original and the persisted SQL text differ?

Contributor Author

Hive already did so and its view has both "view_text" and "original_view_text" fields. All identifiers should be fully qualified (with catalog name and namespace) in the view text.

@cloud-fan

cc @aokolnychyi @gengliangwang

Member

how about SPARK_TABLE_OR_VIEW

Comment thread on sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/V1Table.scala (Outdated)
Member

shall we explicitly mention what operations will be affected? (read/write/DDL/...)

Contributor Author

The capability has since been reshaped: TableCapability is gone in favor of the concrete MetadataOnlyTable for the read side, and a TableCatalogCapability.SUPPORTS_VIEW gate for the write side. The SUPPORTS_VIEW javadoc now spells out the affected operations explicitly (CREATE VIEW / CREATE OR REPLACE VIEW / CREATE VIEW IF NOT EXISTS via createTable, ALTER VIEW ... AS via dropTable+createTable or stageReplace, and the read-path round-trip through MetadataOnlyTable). PTAL and let me know if anything is still unclear.

@aokolnychyi

I'd love to take a look on Monday.

Contributor

I understand we may want to use the new API to expose views, but what about the case with Spark table? When would this be helpful?

Contributor Author

For example, UC does not want to rely on the Spark file source, but just set the table provider to the file source name and leave read/write to Spark. Today this is done by a hack with V1Table: https://github.com/unitycatalog/unitycatalog/blob/main/connectors/spark/src/main/scala/io/unitycatalog/spark/UCSingleCatalog.scala#L303

@aokolnychyi aokolnychyi Jul 28, 2025

Ah, got it. The use case is catalogs that govern tables but not necessarily implement read/write logic.

Have we considered offering a generic V2 table implementation for built-in formats that would be accessible to external connectors? Essentially, a public version of V1Table that doesn't need to expose CatalogTable. If we go with the table capability approach, then each connector will have to implement a custom V2 table for it to be simply replaced as a Parquet table or view. Each connector would have to be aware of how the translation will be done in Spark. For instance, it seems like we assume that serdeProps will be prefixed with option..

Just curious to know your thinking, I don't have a strong opinion here.

@cloud-fan cloud-fan Jul 29, 2025

I see, this makes sense. A concrete class is easier to use than a table capability.

@cloud-fan cloud-fan force-pushed the v1-v2 branch 2 times, most recently from aaa56bc to b5c909e Compare July 29, 2025 16:10
@cloud-fan cloud-fan changed the title [SPARK-52729][SQL] Add GENERAL_TABLE v2 table capacity [SPARK-52729][SQL] Add DataSourceTableOrView in DS v2 API Jul 29, 2025
@cloud-fan cloud-fan force-pushed the v1-v2 branch 2 times, most recently from 5d5a508 to 5517dee Compare July 29, 2025 16:25
@cloud-fan cloud-fan changed the title [SPARK-52729][SQL] Add DataSourceTableOrView in DS v2 API [SPARK-52729][SQL] Add MetadataOnlyTable in DS v2 API Jul 31, 2025
}

// TODO: move the v2 data source table handling from V2SessionCatalog to the analyzer
ignore("v2 data source table") {
Member

Shall we add this test case later?

Contributor Author

This will be fixed shortly after the PR is merged.

* implementing read/write directly. It represents a general Spark data source table or
* a Spark view, and relies on Spark to interpret the table metadata, resolve the table
* provider into a data source, or read it as a view.
* This affects the table read/write operations but not DDL operations.
Member

This affects the table read/write operations but not DDL operations.

It seems a bit unclear. Before the change, DDL operations of DSV2 tables relies on Spark too.

Contributor Author

maybe we should just remove this line? DDL operations do not call read/write APIs of Table anyway.

@szehon-ho szehon-ho left a comment

Makes sense, just wondering if some of it can be less bound to HMS concepts

return this;
}

public Builder withSerdeProps(Map<String, String> serdeProps) {
@szehon-ho szehon-ho Jul 31, 2025

Just an opinion: serde is quite bound to HMS (?), and the newer DS v2 catalogs don't have that. Would it make sense to abstract this (maybe something like 'storageProperties')? Is this just for HMS table support?

Contributor Author

good point. I was trying to save the work of splitting serde properties from table properties, but API simplicity is more important.

database = Some(ident.namespace().lastOption.getOrElse("root")),
catalog = Some(catalog.name())),
tableType = tableType,
storage = CatalogStorageFormat.empty.copy(
Member

curious, do we not need to set inputFormat/outputFormat?

Contributor Author

V1Table itself does not access the input/output format when translating a v1 table to a v2 table. I think Hive tables were never supported by DS v2 and we will not handle them here.

@szehon-ho szehon-ho left a comment

Not entirely familiar with the end use case, but the API looks good now, thanks

@dongjoon-hyun dongjoon-hyun left a comment

Hi, @cloud-fan . Do you have a plan to proceed this forward?


github-actions Bot commented Feb 1, 2026

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions Bot added the Stale label Feb 1, 2026
@github-actions github-actions Bot closed this Feb 2, 2026
@cloud-fan cloud-fan reopened this Apr 22, 2026
The capability now gates both CREATE VIEW and ALTER VIEW, so the create-only
name misrepresents the feature set. "SUPPORTS_VIEW" reads like the other
TableCatalogCapability entries (SUPPORTS_CREATE_TABLE_* are about creation
only; view support is the full lifecycle).

Co-authored-by: Isaac
@cloud-fan cloud-fan changed the title [SPARK-52729][SQL] Add MetadataOnlyTable and CREATE VIEW support for DS v2 catalogs [SPARK-52729][SQL] Add MetadataOnlyTable and CREATE/ALTER VIEW support for DS v2 catalogs Apr 22, 2026
… tests

- uncacheTableOrView now uses ResolvedIdentifier overload so multi-level
  namespaces aren't narrowed to a single database part
- V2AlterViewPreparation.viewSchemaMode delegates to CatalogTable.viewSchemaMode
  to match v1 defaults and honor viewSchemaBindingEnabled
- drop unused referredTempFunctions field from V2 view execs
- gate on TableCatalog + SUPPORTS_VIEW together in DataSourceV2Strategy so
  non-TableCatalog plugins still see MISSING_CATALOG_ABILITY.VIEWS
- add tests for temp variable rejection and cyclic v2 view references
- split DataSourceV2MetadataOnlyTableSuite into table-read and view suites
- doc polish: PROP_VIEW_TEXT, MetadataOnlyTable javadoc, stale comments
…alyzer

- Add CatalogTable.fullIdentOpt / fullIdent so v2 catalogs with multi-level
  namespaces (via MetadataOnlyTable) can carry the real [catalog, ns..., name]
  that v1 TableIdentifier can't represent.
- V1Table.toCatalogTable populates fullIdentOpt from the v2 identifier.
- SessionCatalog.getRelation uses fullIdentOpt for the SubqueryAlias qualifier,
  falling back to qualifyIdentifier for v1 session-catalog tables. Fixes
  fully-qualified column references against non-session v2 catalogs (qualifier
  was hardcoded to spark_catalog).
- checkCyclicViewReference and recursiveViewDetectedError now take Seq[String]
  and compare via CatalogTable.fullIdent, so views in multi-level namespaces
  sharing the last segment (cat.ns1.a.v vs cat.ns2.a.v) no longer collide.
- Move the v2-path cyclic-view check from the four exec sites into the new
  CheckViewReferences analyzer rule, gated on replace for CreateView. v1 keeps
  its exec-time check as the Dataset API safety net.
- Replace two non-ASCII em-dashes in V1Table.scala comments with ASCII.
- Tests: fully-qualified column reference on v2 catalog (TableSuite), cyclic
  detection across multi-level namespaces for both CREATE OR REPLACE and
  ALTER paths (ViewSuite).

Co-authored-by: Isaac
- CreateV2ViewExec / AtomicCreateV2ViewExec: replace the separate
  `tableExists` + implicit-assume-view flow with a single `loadTable`
  round-trip and a `MetadataOnlyTable` + PROP_TABLE_TYPE=VIEW check.
  REPLACE'ing a non-view table as a view is rejected with
  EXPECT_VIEW_NOT_TABLE.NO_ALTERNATIVE; plain CREATE surfaces
  TABLE_OR_VIEW_ALREADY_EXISTS; IF NOT EXISTS remains a no-op. Matches
  v1 CreateViewCommand semantics.
- V2AlterViewPreparation: stop stripping TABLE_RESERVED_PROPERTIES from
  the existing view's properties. PROP_OWNER (and other non-transient
  reserved fields) now flow through unchanged, matching v1
  AlterViewAsCommand.alterPermanentView's viewMeta.copy semantics. Keys
  the ALTER actually changes are overwritten downstream.
- CheckViewReferences: collapse duplicated legacyNameFor/fullIdentFor
  extractors onto a shared `catalogAndIdent` helper.
- Tests: add three new cases - CREATE/REPLACE-over-non-view-table
  rejection on both plain and staging catalogs, PROP_OWNER preservation
  across ALTER VIEW AS, and SCHEMA EVOLUTION mode preservation across
  ALTER VIEW AS.
@gengliangwang gengliangwang left a comment

Re-review of the updated PR.

def unapply(resolved: LogicalPlan): Option[TableIdentifier] = resolved match {
-  case ResolvedPersistentView(catalog, ident, _) =>
-    assert(isSessionCatalog(catalog))
+  case ResolvedPersistentView(catalog, ident, _) if isSessionCatalog(catalog) =>
Member

The comment above says non-session views "fall through so they can be picked up by v2 strategies." Only AlterViewAs is actually picked up. ResolvedViewIdentifier is also used by SetViewProperties (line 176), UnsetViewProperties (line 179), AlterViewSchemaBinding (line 520), RenameTable-on-view (line 194), and DescribeRelation-on-view (lines 198-199), none of which have a v2 strategy case. After this PR a user can run CREATE VIEW view_catalog.ns.v ... and then ALTER VIEW view_catalog.ns.v SET TBLPROPERTIES('k'='v'), which produces a generic planner "no plan for..." failure. Pre-PR, CreateView was rejected on non-session catalogs, so this orphan state was structurally unreachable; the pre-PR error path (MISSING_CATALOG_ABILITY.VIEWS) no longer applies.

There is no test coverage for any of these five plans on a v2 view: DataSourceV2MetadataOnlyViewSuite has no test for SET TBLPROPERTIES, UNSET TBLPROPERTIES, WITH SCHEMA BINDING, RENAME TO, DESCRIBE, SHOW TBLPROPERTIES, or SHOW CREATE VIEW against a v2-catalog view (the only WITH SCHEMA EVOLUTION reference is on the CREATE VIEW path, line 462). The TODO at lines 36-37 of that suite acknowledges these as follow-ups, but nothing pins the current failure mode, so any future change (e.g., a reshape of planner error classes) can silently regress the UX further.

Two options:

  1. Pin the current behavior — for each of the five plan types, add a test that runs the statement against a v2 view and asserts the error it throws today. Future changes then surface in the diff.
  2. Close the gap up front — add explicit DataSourceV2Strategy cases (or a fall-through in ResolveSessionCatalog) that throw a clean UNSUPPORTED_FEATURE / FEATURE_NOT_YET_SUPPORTED naming the statement. Tests become one-per-plan and the UX doesn't regress between this PR and the follow-ups.

Option 2 is the cleaner closure given this PR is already landing the architectural change that enables the orphaning. Option 1 is the minimum safety net.
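The shape option 2 suggests can be sketched standalone. Everything below is illustrative: the plan nodes and error type are stand-ins for Spark's actual logical plans and UNSUPPORTED_FEATURE error classes, and the real cases would live in DataSourceV2Strategy.

```scala
// Stand-in plan nodes and error type; in Spark these would be the actual
// logical plans (SetViewProperties, RenameTable, ...) and a structured
// UNSUPPORTED_FEATURE / FEATURE_NOT_YET_SUPPORTED error class.
sealed trait LogicalPlan
case class SetViewProperties(view: String) extends LogicalPlan
case class RenameView(view: String, newName: String) extends LogicalPlan

class UnsupportedViewOperation(msg: String) extends RuntimeException(msg)

// One explicit case per orphaned plan: each statement gets a clean, named
// rejection instead of a generic "no plan for ..." planner failure.
def planV2View(p: LogicalPlan): Nothing = p match {
  case SetViewProperties(v) =>
    throw new UnsupportedViewOperation(
      s"ALTER VIEW ... SET TBLPROPERTIES is not yet supported for v2 view $v")
  case RenameView(v, _) =>
    throw new UnsupportedViewOperation(
      s"ALTER VIEW ... RENAME TO is not yet supported for v2 view $v")
}

val err =
  try { planV2View(SetViewProperties("view_catalog.ns.v")); None }
  catch { case e: UnsupportedViewOperation => Some(e.getMessage) }
```

With a case like this per plan type, the per-plan test reduces to asserting the named error rather than pinning a planner internal.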

partitionColumnNames = partCols,
bucketSpec = bucketSpec,
owner = props.getOrElse(TableCatalog.PROP_OWNER, "unknown"),
viewText = viewText,
Member

viewText is read and assigned unconditionally. If a catalog returns a MetadataOnlyTable with PROP_VIEW_TEXT set but PROP_TABLE_TYPE is EXTERNAL/MANAGED (misconfiguration, or a future capability-composition catalog), the synthesized CatalogTable ends up with non-None viewText on a non-view — confusing downstream code that uses viewText.isDefined as an "is-view" proxy. Scoping the read to VIEW costs nothing:

Suggested change
viewText = viewText,
val viewText = if (tableType == CatalogTableType.VIEW) {
  props.get(TableCatalog.PROP_VIEW_TEXT)
} else {
  None
}

// asTableCatalog would throw).
val tableCatalog = catalog match {
  case tc: TableCatalog
      if tc.capabilities().contains(TableCatalogCapability.SUPPORTS_VIEW) => tc
Member

Nice cleanup on the CheckViewReferences side with the new catalogAndIdent helper. The same case tc: TableCatalog if tc.capabilities().contains(TableCatalogCapability.SUPPORTS_VIEW) => tc; case _ => throw missingCatalogViewsAbilityError(catalog) pattern is still duplicated at lines 310-314 (CREATE VIEW) and 330-334 (ALTER VIEW AS). Similarly the TableIdentifier(ident.name, ident.namespace.lastOption, Some(catalog.name)) idiom for error-rendering is repeated in CreateV2ViewExec:60-64 and V1Table.toCatalogTable:153-156. Small helpers — e.g. CatalogV2Util.requireViewSupport(catalog) and an asLegacyTableIdentifier(catalogName) on IdentifierHelper — would eliminate the remaining drift risk. Non-blocking.
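A sketch of what the suggested helper could look like, built against stand-in connector types (requireViewSupport is the reviewer's hypothetical name, not an existing Spark method, and the types here only mimic the real API's shape):

```scala
// Stand-ins for the connector API; only the shape of the check matters here.
object TableCatalogCapability extends Enumeration { val SUPPORTS_VIEW = Value }
trait CatalogPlugin { def name: String }
trait TableCatalog extends CatalogPlugin {
  def capabilities: Set[TableCatalogCapability.Value]
}

class MissingCatalogViewsAbility(catalog: String)
  extends RuntimeException(s"Catalog $catalog does not support views")

// Single shared gate replacing the duplicated match in the CREATE VIEW and
// ALTER VIEW AS branches.
def requireViewSupport(catalog: CatalogPlugin): TableCatalog = catalog match {
  case tc: TableCatalog
      if tc.capabilities.contains(TableCatalogCapability.SUPPORTS_VIEW) => tc
  case _ => throw new MissingCatalogViewsAbility(catalog.name)
}

val viewCapable = new TableCatalog {
  def name = "view_catalog"
  def capabilities = Set(TableCatalogCapability.SUPPORTS_VIEW)
}
val tableOnly = new CatalogPlugin { def name = "table_catalog" }

val gated = requireViewSupport(viewCapable)   // passes through unchanged
val rejected =
  try { requireViewSupport(tableOnly); false }
  catch { case _: MissingCatalogViewsAbility => true }
```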

}
}

test("ALTER VIEW on a catalog without SUPPORTS_VIEW fails") {
Member

TestingTableOnlyCatalog.loadTable always throws NoSuchTableException, so the ALTER fails at view resolution; the capability gate in DataSourceV2Strategy (lines 330-333) is never reached. The test body's own comment acknowledges this. As a result the capability-gate rejection on the ALTER path has no real coverage.

Two fix options:

  1. Rename to "ALTER VIEW on a missing view fails" — matches what it actually tests.
  2. Better: extend TestingTableOnlyCatalog to store a MetadataOnlyTable view so the gate rejection is genuinely exercised. This would also catch regressions if SUPPORTS_VIEW is inadvertently added to the default capability set.

}
}

test("read view resolves unqualified refs via captured current catalog/namespace") {
Member

The PR's multi-level-namespace correctness hinges on the QuotingUtils.quoted / parseMultipartIdentifier round-trip and the v1 unqualified-reference expansion both preserving the captured namespace. Today that's covered by (a) a builder-level serialization test (line 83) and (b) cycle-detection tests using ns1.inner.v-style identifiers. But there is no end-to-end read test that exercises the round-trip by actually resolving an unqualified reference inside a view whose captured namespace has >1 part. Add one: .withCurrentCatalogAndNamespace("spark_catalog", Array("db1", "db2")), create a table in that namespace, reference it unqualified in the view body, and check the result.
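The round-trip being relied on can be illustrated standalone. The quoting and parsing below use a simplified backtick scheme, not Spark's actual QuotingUtils or SQL parser; the point is only that multi-part namespaces survive serialize-then-parse, including dots inside a quoted part.

```scala
// Quote a single identifier part, backtick-escaping when needed.
def quote(part: String): String =
  if (part.nonEmpty && part.forall(c => c.isLetterOrDigit || c == '_')) part
  else "`" + part.replace("`", "``") + "`"

def quoted(parts: Seq[String]): String = parts.map(quote).mkString(".")

// Split a dotted identifier back into parts, honoring backtick quoting.
def parseMultipart(s: String): Seq[String] = {
  val parts = scala.collection.mutable.Buffer.empty[String]
  val cur = new StringBuilder
  var i = 0
  var inQuote = false
  while (i < s.length) {
    val c = s(i)
    if (inQuote) {
      if (c == '`') {
        if (i + 1 < s.length && s(i + 1) == '`') { cur += '`'; i += 1 } // escaped backtick
        else inQuote = false
      } else cur += c
    } else c match {
      case '`' => inQuote = true
      case '.' => parts += cur.toString; cur.clear()
      case _   => cur += c
    }
    i += 1
  }
  parts += cur.toString
  parts.toSeq
}

// A captured catalog + 2-part namespace, one part needing quoting.
val captured = Seq("spark_catalog", "db1", "db.2")
val roundTripped = parseMultipart(quoted(captured))
```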

@github-actions github-actions Bot removed the Stale label Apr 23, 2026
… errors, new tests

- drop unused `viewOnly` parameter on `Analyzer.lookupTableOrView`
- reorder `CreateV2ViewExec`/`AtomicCreateV2ViewExec` to short-circuit IF NOT EXISTS
  before building the TableInfo, matching v1 `CreateViewCommand.run`
- extract `CatalogTable.viewSchemaModeFromProperties` so `V2AlterViewPreparation` no
  longer round-trips through `V1Table.toCatalogTable` just to read the mode
- cross-reference v1/v2 view-check locations in `CreateViewCommand` and
  `AlterViewAsCommand` Scaladoc
- document `TableInfo.Builder.withProperties` / convenience-setter ordering on
  `withProperties` itself and add brief docs to the convenience setters
- require a view-typed `MetadataOnlyTable` at ALTER VIEW exec time (tightens the
  race-between-analysis-and-exec surface)
- rename `CatalogTable.fullIdentOpt` to `multipartIdentifier`
- widen `viewDepthExceedsMaxResolutionDepthError` to take `Seq[String]` so v2
  multi-level namespaces are reflected in the error message
- move the `SUPPORTS_VIEW` gate from `DataSourceV2Strategy` into
  `CheckViewReferences`; strategy cases now cast directly since analysis verifies
  the capability first
- add regression tests: ALTER VIEW re-captures current session SQL configs;
  CREATE OR REPLACE VIEW whose new body references a nonexistent table fails at
  analysis

Co-authored-by: Isaac
… doc reconciliation

Read path:
- V1Table.toCatalogTable: gate viewText read on tableType == VIEW so a
  non-view MetadataOnlyTable with a stray PROP_VIEW_TEXT doesn't synthesize
  a non-view CatalogTable with non-None viewText.

ALTER VIEW execs:
- Replace `val _ = existingTable` (obscure lazy-val side effect) with a
  named `requireExistingView()` helper in V2AlterViewPreparation.
- Race between analysis and exec (target dropped or replaced as a non-view
  between lookup and run) now surfaces as EXPECT_VIEW_NOT_TABLE instead of
  SparkException.internalError.
- AtomicCreateV2ViewExec: reject plain CREATE on an existing view up front
  with viewAlreadyExists(), matching the non-atomic exec (non-atomic path
  relied on catalog-side TableAlreadyExistsException, which StagingTableCatalog
  doesn't formally require).

Orphan-plan pinning:
- Add DataSourceV2Strategy cases for v2-catalog plans that ResolveSessionCatalog
  no longer rewrites: SetViewProperties, UnsetViewProperties, AlterViewSchemaBinding,
  RenameTable, ShowCreateTable, ShowTableProperties, ShowColumns, DescribeRelation,
  DescribeColumn on ResolvedPersistentView. Each throws UNSUPPORTED_FEATURE.TABLE_OPERATION
  naming the statement, pinning the current UX until the follow-up PRs land.

SHOW VIEWS for v2:
- New ShowViewsExec enumerates via TableCatalog.listTableSummaries(namespace)
  and filters to TableSummary.VIEW_TABLE_TYPE; wired in DataSourceV2Strategy.
- ResolveSessionCatalog's ShowViews handler now skips (via guard) for
  SUPPORTS_VIEW catalogs so they reach the v2 strategy; non-session, non-
  SUPPORTS_VIEW catalogs still get the MISSING_CATALOG_ABILITY.VIEWS rejection.

API contract reconciliation:
- Javadocs on TableCatalog.loadTable / dropTable / tableExists / alterTable /
  renameTable / listTables / purgeTable and StagingTableCatalog.stageCreate /
  stageReplace / stageCreateOrReplace now spell out the SUPPORTS_VIEW split:
  loadTable returns views as MetadataOnlyTable, dropTable/tableExists/listTables
  include views (listTables also includes views for v1 parity with SHOW TABLES),
  while alterTable / renameTable / purgeTable / versioned+timestamped loadTable
  remain table-only.
- Add IdentifierHelper.asLegacyTableIdentifier(catalogName) to share the
  lossy multi-part -> v1 TableIdentifier idiom; use in V1Table.toCatalogTable,
  V2ViewPreparation.legacyName, CheckViewReferences.legacyNameFor.

Misc:
- ResolveSessionCatalog: rename local var `child` -> `query` in CreateView
  pattern to match the case-class field name; update the stale
  ResolvedViewIdentifier comment to describe the new v2-strategy behavior.

Tests:
- New multi-part namespace round-trip unit test in DataSourceV2MetadataOnlyViewSuite
  (Builder -> V1Table.toCatalogTable -> viewCatalogAndNamespace preserves [cat, db1, db2]).
- Orphan-plan pinning tests: UNSUPPORTED_FEATURE.TABLE_OPERATION for SET/UNSET
  TBLPROPERTIES, WITH SCHEMA, RENAME TO, SHOW CREATE TABLE, SHOW TBLPROPERTIES,
  SHOW COLUMNS, DESCRIBE TABLE; clean AnalysisException for DESCRIBE COLUMN
  (fails at column resolution before reaching the strategy).
- SHOW TABLES on a v2 catalog includes views (v1 parity); SHOW VIEWS returns
  only views; SHOW VIEWS with LIKE filter; SHOW VIEWS on non-SUPPORTS_VIEW
  rejected with MISSING_CATALOG_ABILITY.VIEWS.
- TestingTableOnlyCatalog now round-trips a view-typed MetadataOnlyTable so
  the ALTER VIEW capability-gate test actually reaches the gate (expected
  MISSING_CATALOG_ABILITY.VIEWS), closing a coverage hole.

Co-authored-by: Isaac
…comment, add multi-part captured-namespace read test

- CatalogV2Util.supportsView: shared predicate replacing duplicated TableCatalog+SUPPORTS_VIEW check in CheckViewReferences and ResolveSessionCatalog.
- DataSourceV2MetadataOnlyViewSuite: correct the misleading "body is validated first" comment around CREATE VIEW IF NOT EXISTS on the atomic exec (tryLoadTable short-circuits before buildTableInfo), and add an end-to-end SQL test exercising multi-part captured catalog/namespace round-trip for an unqualified view-body reference.

Co-authored-by: Isaac
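The exec ordering the corrected comment describes (existence probe short-circuiting before the potentially failing info build) reduces to a small pattern. Names and types below are stand-ins, not the exec's real signature:

```scala
// Under IF NOT EXISTS, the existence probe runs first, so a body that would
// fail to build (e.g. a bad view definition) is never built at all.
def createIfNotExists[A](exists: Boolean, allowExisting: Boolean)(buildInfo: () => A): Option[A] =
  if (allowExisting && exists) None // no-op; buildInfo is never invoked
  else Some(buildInfo())

var built = false
val skipped = createIfNotExists(exists = true, allowExisting = true) { () =>
  built = true
  "info"
}
val created = createIfNotExists(exists = false, allowExisting = true) { () => "info" }
```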
Pre-PR, Analyzer.lookupTableOrView had a viewOnly gate that rejected all
UnresolvedView lookups on non-session catalogs up front with
UNSUPPORTED_FEATURE.CATALOG_OPERATION. That gate was removed earlier in this
PR. For non-SUPPORTS_VIEW catalogs the ALTER VIEW path now falls through to
CheckAnalysis, which surfaces TABLE_OR_VIEW_NOT_FOUND when the view does not
exist. Either error is acceptable; this aligns the test with the simpler
no-gate behavior.

Co-authored-by: Isaac
…jection on MISSING_CATALOG_ABILITY.VIEWS

Without the gate, ALTER VIEW variants on a non-SUPPORTS_VIEW v2 catalog fell
through to TABLE_OR_VIEW_NOT_FOUND when the view did not exist -- misleading,
since the catalog cannot host views at all. Bring back `lookupTableOrView`'s
`viewOnly` flag and reject non-session non-SUPPORTS_VIEW catalogs upfront.
Switch DROP VIEW's existing rejection path and the restored gate to use the
same MISSING_CATALOG_ABILITY.VIEWS error class CheckViewReferences already
uses for CREATE/ALTER VIEW AS, so users see one consistent error for the
"catalog does not support views" condition across all view DDL.

Co-authored-by: Isaac
…, drop defensive AlterViewAs gate, misc cleanups

- MetadataOnlyTable: drop the no-arg constructor (and the "data_source_table_or_view" placeholder it defaulted to); require callers to pass a name, typically ident.toString. Before this, DESCRIBE TABLE EXTENDED on a MetadataOnlyTable-backed table showed "Name: data_source_table_or_view" instead of the real identifier. Updated all (test-only) callsites and added a DESCRIBE pin.
- V2AlterViewPreparation.existingTable: fold through the parent trait's tryLoadTable helper so the load/view-check lives in one place.
- CheckViewReferences: remove the redundant requireSupportsView call on the AlterViewAs branch. The analyzer's lookupTableOrView(viewOnly=true) already rejects non-SUPPORTS_VIEW catalogs before we get here.
- V1Table.toCatalogTable: default owner to "" (matches v1 CatalogTable default) instead of "unknown".
- Tests: add ALTER VIEW rejections for temp views and temp variables to mirror the CREATE VIEW matrix; fix the stale ALTER-capability-gate test comment; add a DESCRIBE-extended pin for the MetadataOnlyTable name surface.
- Doc: fix a hardcoded line-number reference in DataSourceV2Strategy and a split Scaladoc link in v2Commands.ShowViews.

Co-authored-by: Isaac
…serve on ALTER; minor dash nit

Co-authored-by: Isaac
…ti-part error rendering

- Introduce ViewInfo extends TableInfo carrying typed fields (queryText,
  currentCatalog, currentNamespace, sqlConfigs, schemaMode,
  queryColumnNames). SUPPORTS_VIEW catalogs branch on `instanceof ViewInfo`
  inside createTable and the StagingTableCatalog staging variants;
  loadTable returns MetadataOnlyTable wrapping a ViewInfo for views.
  ViewInfo's ctor auto-sets PROP_TABLE_TYPE=VIEW so generic viewers
  (listTableSummaries default impl, DESCRIBE) classify correctly.

- Remove the property-bag encoding: PROP_VIEW_TEXT,
  PROP_VIEW_CURRENT_CATALOG_AND_NAMESPACE, VIEW_CONF_PREFIX gone from
  TableCatalog; the corresponding TABLE_RESERVED_PROPERTIES entries
  gone from CatalogV2Util; CatalogTable.VIEW_SQL_CONFIG_PREFIX reverted
  to its pre-PR form.

- Delete the dormant ViewCatalog / View / ViewChange / old ViewInfo
  (@DeveloperAPI but never wired into analyzer/planner) since
  TableCatalog + SUPPORTS_VIEW subsumes it.

- Fix multi-level-namespace rendering in four view error constructors:
  viewAlreadyExistsError, unsupportedCreateOrReplaceViewOnTableError,
  and the two CREATE_VIEW_COLUMN_ARITY_MISMATCH errors now take
  Seq[String] instead of a lossy TableIdentifier (asLegacyTableIdentifier
  collapsed cat.ns1.ns2.v to cat.ns2.v). v2 callers pass
  catalog.name +: ident.asMultipartIdentifier; v1 callers pass
  name.nameParts.

- MetadataOnlyTable.constraints() now delegates to info.constraints()
  instead of returning an empty array.
…es in temp-object errors; restore 3-part v1 session-catalog SubqueryAlias

Co-authored-by: Isaac
- PlanResolutionSuite drop-view v2: expect MISSING_CATALOG_ABILITY.VIEWS
  (Analyzer.lookupTableOrView now routes non-SUPPORTS_VIEW v2 catalogs
  through that error instead of UNSUPPORTED_FEATURE.CATALOG_OPERATION).
- explain golden files: regenerate to include CreateView.isAnalyzed in
  argString (new field from this PR's AnalysisOnlyCommand conversion).

Co-authored-by: Isaac
@gengliangwang (Member) left a comment

LGTM

# Conflicts:
#	sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
SPARK-39660 split v2 DESCRIBE TABLE PARTITION off into its own
DescribeTablePartition plan and dropped `partitionSpec` from
DescribeRelation. Our v2-view pin case had 4 wildcards; reduce to 3 to
match the new 3-field case class.

Co-authored-by: Isaac
Javadoc died mid-stream while generating
CatalogV2Implicits.IdentifierHelper.html (the failing PR's log stops
exactly there; the succeeding PRs continue past to MultipartIdentifierHelper
and CatalogV2Util). The only diff in IdentifierHelper on this branch was
the new asLegacyTableIdentifier method, whose scaladoc used `[[TableIdentifier]]`
/ `[[toQualifiedNameParts]]` / backtick-inlined code refs. Something in
that doc tripped javadoc into a hard exit (not a warning) instead of a
broken-link warning.

Fix: downgrade both new scaladoc blocks in the exposed-to-javadoc
connector/catalog package to plain `//` comments so genjavadoc doesn't
emit them into the Java stub at all:
- CatalogV2Implicits.IdentifierHelper.asLegacyTableIdentifier
- CatalogV2Util.supportsView (same risky pattern, hasn't been reached
  yet because javadoc died earlier, but would break next)

The method names are self-documenting; internal callers don't need the
scaladoc.

Co-authored-by: Isaac
Restore ViewCatalog as the plugin-facing API for view-only catalogs and
view DDL operations, instead of routing views through TableCatalog under
a SUPPORTS_VIEW capability flag. Catalog-implementer ergonomics:

  * Pure view-only: implement ViewCatalog. 5 methods (listViews,
    loadView/createView/replaceView/dropView), default viewExists. No
    instanceof, no capability declaration, no TableCatalog stubs.
  * Pure tables: implement TableCatalog. Same as today.
  * Mixed (Iceberg/UC): implement both interfaces independently. Single
    cross-cutting invariant -- one identifier namespace; createTable
    rejects view-collisions, createView rejects table-collisions.

ViewCatalog API:
  Identifier[] listViews(String[] namespace);
  ViewInfo loadView(Identifier);
  ViewInfo createView(Identifier, ViewInfo);
  ViewInfo replaceView(Identifier, ViewInfo);   // atomic per-call
  boolean dropView(Identifier);
  default boolean viewExists(Identifier);

No StagingViewCatalog -- view REPLACE writes only metadata, so a single
transactional metastore call (or equivalent) is sufficient. CREATE OR
REPLACE VIEW probes viewExists then dispatches createView/replaceView.
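The probe-then-dispatch described above can be sketched against a stubbed catalog. The trait and in-memory store below are illustrative stand-ins with simplified signatures, not the actual connector interfaces:

```scala
// Minimal stand-ins mirroring the proposed ViewCatalog shape.
case class ViewInfo(name: String, queryText: String)

trait ViewCatalog {
  def viewExists(ident: String): Boolean
  def createView(ident: String, info: ViewInfo): ViewInfo
  def replaceView(ident: String, info: ViewInfo): ViewInfo // atomic per-call
}

class InMemoryViewCatalog extends ViewCatalog {
  private val views = scala.collection.mutable.Map.empty[String, ViewInfo]
  def viewExists(ident: String): Boolean = views.contains(ident)
  def createView(ident: String, info: ViewInfo): ViewInfo = {
    require(!views.contains(ident), s"view $ident already exists")
    views(ident) = info
    info
  }
  def replaceView(ident: String, info: ViewInfo): ViewInfo = {
    require(views.contains(ident), s"no view $ident to replace")
    views(ident) = info
    info
  }
}

// CREATE OR REPLACE VIEW: probe viewExists, then dispatch. No staging needed,
// because a view write is metadata-only.
def createOrReplace(catalog: ViewCatalog, ident: String, info: ViewInfo): ViewInfo =
  if (catalog.viewExists(ident)) catalog.replaceView(ident, info)
  else catalog.createView(ident, info)

val cat = new InMemoryViewCatalog
createOrReplace(cat, "ns.v", ViewInfo("v", "SELECT 1"))
val replaced = createOrReplace(cat, "ns.v", ViewInfo("v", "SELECT 2"))
```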

Spark-side dispatch:
  * Analyzer.lookupTableOrView: try TableCatalog.loadTable first; on
    NoSuchTableException, if catalog is ViewCatalog, fall back to
    loadView and synthesize ResolvedPersistentView.
  * Mixed-catalog perf opt-in: loadTable may return
    MetadataOnlyTable(ViewInfo) for view idents, short-circuiting the
    second RPC. Documented on TableCatalog#loadTable.
  * DataSourceV2Strategy: routes CREATE/ALTER/DROP/SHOW VIEWS to
    ViewCatalog only; staging branches removed.
  * ResolveSessionCatalog: SUPPORTS_VIEW guards replaced with
    instanceof ViewCatalog.

Internal: V1Table.toCatalogTable for ViewInfo is now public so the
analyzer can synthesize CatalogTable from a loadView result for the
session-catalog v1 view-resolution path.

Out of scope for this commit:
  * Test suite rewrite (DataSourceV2MetadataOnlyViewSuite still uses
    SUPPORTS_VIEW and TestingStagingCatalog) -- broken until the
    follow-up commit.
  * Lifting the session-catalog gate on DESCRIBE/SHOW CREATE TABLE/SHOW
    COLUMNS/SHOW TBLPROPERTIES for v2 views -- still pinned with
    UNSUPPORTED_FEATURE.TABLE_OPERATION; tracked as follow-up.

Co-authored-by: Isaac
…ewCatalog

The structural rework removed TableCatalogCapability.SUPPORTS_VIEW and
introduced ViewCatalog as the plugin-facing API for views. The existing
test catalogs (TestingViewCatalog, TestingStagingCatalog) now implement
both TableCatalog and ViewCatalog, sharing one identifier-keyed map per
the mixed-catalog contract. The stored value's runtime type (ViewInfo vs
TableInfo) distinguishes views from tables on each lookup; tableExists /
listTables exclude view entries, viewExists / listViews include only
views, and createTable / createView each reject cross-type collisions.
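The shared-map scheme the test catalogs use can be sketched as follows; the Info types and method signatures below are simplified stand-ins, not the real fixtures:

```scala
// One identifier-keyed map; the stored value's runtime type distinguishes
// views from tables, per the mixed-catalog single-namespace contract.
sealed trait Info
case class TableInfo(name: String) extends Info
case class ViewInfo(name: String, queryText: String) extends Info

class MixedCatalog {
  private val store = scala.collection.mutable.Map.empty[String, Info]
  def tableExists(id: String): Boolean = store.get(id).exists(_.isInstanceOf[TableInfo])
  def viewExists(id: String): Boolean = store.get(id).exists(_.isInstanceOf[ViewInfo])
  def createTable(id: String, t: TableInfo): Unit = {
    require(!viewExists(id), s"$id already exists as a view") // cross-type collision
    store(id) = t
  }
  def createView(id: String, v: ViewInfo): Unit = {
    require(!tableExists(id), s"$id already exists as a table")
    store(id) = v
  }
  // listTables excludes view entries; listViews includes only views.
  def listTables: Seq[String] = store.collect { case (id, _: TableInfo) => id }.toSeq
  def listViews: Seq[String] = store.collect { case (id, _: ViewInfo) => id }.toSeq
}

val cat = new MixedCatalog
cat.createTable("ns.t", TableInfo("t"))
cat.createView("ns.v", ViewInfo("v", "SELECT 1"))
val viewCollision =
  try { cat.createTable("ns.v", TableInfo("v")); false }
  catch { case _: IllegalArgumentException => true }
```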

Test-name renames replace "without SUPPORTS_VIEW" with "without
ViewCatalog" to track the new API. The rest of the test bodies are
unchanged.

Co-authored-by: Isaac
…ec test names

- Analyzer.lookupTableOrView and RelationResolution.tryResolvePersistent skip
  CatalogV2Util.loadTable for pure ViewCatalogs (no TableCatalog mixin), so
  asTableCatalog no longer throws MISSING_CATALOG_ABILITY.TABLES and masks the
  legitimate loadView fallback. SELECT and ALTER VIEW now work end-to-end on a
  pure ViewCatalog.
- Add a TestingViewOnlyCatalog fixture (no TableCatalog mixin) plus read and
  ALTER VIEW tests that exercise the loadView fallback.
- DataSourceV2MetadataOnlyViewSuite: rename "uses the atomic exec" tests to
  reflect that view DDL routes through ViewCatalog.createView / replaceView
  (no separate staging variant); drop now-dead RecordingStagedTable; replace
  TestingStagingCatalog's stage* method bodies with explicit "must not be
  invoked by view DDL" throws so any future regression that misroutes through
  the staging API surfaces immediately.

Co-authored-by: Isaac
… over a non-view table; align SHOW TABLES test with new listTables contract; defensive null check on MetadataOnlyTable

- CreateV2ViewExec.run: probe both viewExists and tableExists up front.
  CREATE VIEW IF NOT EXISTS over a non-view table is now a no-op (v1 parity:
  see SQLViewSuite "existing a table with the duplicate name when CREATE VIEW
  IF NOT EXISTS"); the previous code called rejectIfTable() unconditionally
  before the allowExisting check and threw TABLE_OR_VIEW_ALREADY_EXISTS for
  what should be a no-op. Non-IF-NOT-EXISTS CREATE / OR REPLACE still surfaces
  the dedicated EXPECT_VIEW_NOT_TABLE / TABLE_OR_VIEW_ALREADY_EXISTS error.
  Drop the now-unused rejectIfTable / replaceArg trait helpers (and the
  AlterV2ViewExec override).
- DataSourceV2MetadataOnlyViewSuite: rename "SHOW TABLES on a v2 catalog
  includes views (v1 parity)" to "SHOW TABLES on a v2 catalog returns only
  tables" and flip the assertion. The new TableCatalog.listTables contract
  excludes views (per the file Javadoc); the previous test name + body asserted
  v1-parity which the implementation does not provide and ShowTablesExec is
  not changed by this PR. Documents the intentional v2 divergence.
- MetadataOnlyTable: Objects.requireNonNull on `info` and `name` so a
  connector that constructs the wrapper with nulls fails fast at
  construction time rather than producing cryptic NPEs in downstream
  consumers (DescribeTableExec's Name row, DataSourceV2Relation logging).

Co-authored-by: Isaac
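The probe-both-then-branch behavior in the first bullet condenses to a small decision function; the outcome and error-class names below are stand-ins (the real exec throws Spark's structured errors):

```scala
sealed trait Outcome
case object NoOp extends Outcome    // IF NOT EXISTS over any existing object
case object Created extends Outcome
case class Failed(errorClass: String) extends Outcome

// Probe both viewExists and tableExists up front, then branch. Simplified
// relative to CreateV2ViewExec, but matching the cases the commit describes.
def createView(viewExists: Boolean, tableExists: Boolean,
               allowExisting: Boolean, replace: Boolean): Outcome =
  if (allowExisting && (viewExists || tableExists)) NoOp // v1 parity: no-op
  else if (tableExists) Failed("EXPECT_VIEW_NOT_TABLE")
  else if (viewExists && !replace) Failed("TABLE_OR_VIEW_ALREADY_EXISTS")
  else Created

val noOp = createView(viewExists = false, tableExists = true,
  allowExisting = true, replace = false)
val overTable = createView(viewExists = false, tableExists = true,
  allowExisting = false, replace = false)
val fresh = createView(viewExists = false, tableExists = false,
  allowExisting = false, replace = false)
```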
- ViewInfo class doc: complete the dangling "construct." sentence with its
  direct object ("construct a ViewInfo") so the line reads as a complete
  thought.
- TableInfo Builder: replace the awkward use of "write" as a noun
  ("discards the convenience setter's write") with verb form ("discards
  the value the convenience setter wrote").

Co-authored-by: Isaac
…talog gating

Commit 66fa409 added `catalog.isInstanceOf[TableCatalog]` to
RelationResolution.tryResolvePersistent's gating but didn't add
TableCatalog to the explicit-list import block; CI failed at
catalyst compile with `not found: type TableCatalog`. Add the import.

Co-authored-by: Isaac
5 participants