Skip to content

Commit e56c1cf

Browse files
dennishuosfc-gh-dhuosfc-gh-mparmarmanisinnastra
authored
Add new SnowflakeCatalog implementation to enable directly using Snowflake-managed Iceberg tables (apache#6428)
* Initial read-only Snowflake Catalog implementation by @sfc-gh-mparmar (#1) Initial read-only Snowflake Catalog implementation built on top of the Snowflake JDBC driver, providing support for basic listing of namespaces, listing of tables, and loading/reads of tables. Auth options are passthrough to the JDBC driver. Co-authored-by: Maninder Parmar <maninder.parmar@snowflake.com> Co-authored-by: Maninder Parmar <maninder.parmar+oss@snowflake.com> Co-authored-by: Dennis Huo <dennis.huo+oss@snowflake.com> * Add JdbcSnowflakeClientTest using mocks (#2) Add JdbcSnowflakeClientTest using mocks; provides full coverage of JdbcSnowflakeClient and entities' ResultSetHandler logic. Also update target Spark runtime versions to be included. * Add test { useJUnitPlatform() } tuple to iceberg-snowflake for consistency and future interoperability with inheriting from abstact unittest base classes. * Extract versions into versions.props per PR review * Misc test-related refactors per review suggestions -Convert unittests to all use assertj/Assertions for "fluent assertions" -Refactor test injection into overloaded initialize() method -Add test cases for close() propagation -Use CloseableGroup. * Fix unsupported behaviors of loadNamedpaceMetadata and defaultWarehouseLocation * Move TableIdentifier checks out of newTableOps into the SnowflakTableOperations class itself, add test case. * Refactor out any Namespace-related business logic from the lower SnowflakeClient/JdbcSnowflakeClient layers and merge SnowflakeTable and SnowflakeSchema into a single SnowflakeIdentifier that also encompasses ROOT and DATABASE level identifiers. A SnowflakeIdentifier thus functions like a type-checked/constrained Iceberg TableIdentifier, and eliminates any tight coupling between a SnowflakeClient and Catalog business logic. Parsing of Namespace numerical levels into a SnowflakeIdentifier is now fully encapsulated in NamespaceHelpers so that callsites don't duplicate namespace-handling/validation logic. * Finish migrating JdbcSnowflakeClientTest off any usage of org.junit.Assert in favor of assertj's Assertions. * Style refactorings from review comments, expanded and moved InMemoryFileIO into core with its own unittest. * Fix behavior of getNamespaceMetadata to throw when the namespace doesn't exist. Refactor for naming conventions and consolidating identifier handling into NamespaceHandlers. Make FileIO instantiated fresh for each newTableOps call. * Move private constructor to top, add assertion to test case. * Define minimal ResultSetParser/QueryHarness classes to fully replace any use of commons-dbutils; refactor ResultSet handling fully into JdbcSnowflakeClient.java. * Update snowflake/src/main/java/org/apache/iceberg/snowflake/SnowflakeTableOperations.java Co-authored-by: Eduard Tudenhöfner <etudenhoefner@gmail.com> * Refactor style suggestions; remove debug-level logging, arguments in exceptions, private members if not accessed outside, move precondition checks, add test for NamespaceHelpers. * Fix precondition messages, remove getConf() * Clean up varargs. * Make data members final, include rawJsonVal in toString for debuggability. * Combine some small test cases into roundtrip test cases, misc cleanup * Add comment for why a factory class is exposed for testing purposes. Co-authored-by: Dennis Huo <dennis.huo@snowflake.com> Co-authored-by: Maninder Parmar <maninder.parmar@snowflake.com> Co-authored-by: Maninder Parmar <maninder.parmar+oss@snowflake.com> Co-authored-by: Eduard Tudenhöfner <etudenhoefner@gmail.com>
1 parent 7943e94 commit e56c1cf

File tree

19 files changed

+2415
-4
lines changed

19 files changed

+2415
-4
lines changed

.github/labeler.yml

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -81,4 +81,6 @@ ALIYUN:
8181
GCP:
8282
- gcp/**/*
8383
DELL:
84-
- dell/**/*
84+
- dell/**/*
85+
SNOWFLAKE:
86+
- snowflake/**/*

build.gradle

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -697,6 +697,24 @@ project(':iceberg-dell') {
697697
}
698698
}
699699

700+
project(':iceberg-snowflake') {
701+
test {
702+
useJUnitPlatform()
703+
}
704+
705+
dependencies {
706+
implementation project(':iceberg-core')
707+
implementation project(':iceberg-common')
708+
implementation project(path: ':iceberg-bundled-guava', configuration: 'shadow')
709+
implementation "com.fasterxml.jackson.core:jackson-databind"
710+
implementation "com.fasterxml.jackson.core:jackson-core"
711+
712+
runtimeOnly("net.snowflake:snowflake-jdbc")
713+
714+
testImplementation project(path: ':iceberg-core', configuration: 'testArtifacts')
715+
}
716+
}
717+
700718
@Memoized
701719
boolean versionFileExists() {
702720
return file('version.txt').exists()

core/src/main/java/org/apache/iceberg/jdbc/JdbcClientPool.java

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -27,12 +27,12 @@
2727
import org.apache.iceberg.CatalogProperties;
2828
import org.apache.iceberg.ClientPoolImpl;
2929

30-
class JdbcClientPool extends ClientPoolImpl<Connection, SQLException> {
30+
public class JdbcClientPool extends ClientPoolImpl<Connection, SQLException> {
3131

3232
private final String dbUrl;
3333
private final Map<String, String> properties;
3434

35-
JdbcClientPool(String dbUrl, Map<String, String> props) {
35+
public JdbcClientPool(String dbUrl, Map<String, String> props) {
3636
this(
3737
Integer.parseInt(
3838
props.getOrDefault(
@@ -42,7 +42,7 @@ class JdbcClientPool extends ClientPoolImpl<Connection, SQLException> {
4242
props);
4343
}
4444

45-
JdbcClientPool(int poolSize, String dbUrl, Map<String, String> props) {
45+
public JdbcClientPool(int poolSize, String dbUrl, Map<String, String> props) {
4646
super(poolSize, SQLNonTransientConnectionException.class, true);
4747
properties = props;
4848
this.dbUrl = dbUrl;

settings.gradle

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,7 @@ include 'hive-metastore'
3434
include 'nessie'
3535
include 'gcp'
3636
include 'dell'
37+
include 'snowflake'
3738

3839
project(':api').name = 'iceberg-api'
3940
project(':common').name = 'iceberg-common'
@@ -51,6 +52,7 @@ project(':hive-metastore').name = 'iceberg-hive-metastore'
5152
project(':nessie').name = 'iceberg-nessie'
5253
project(':gcp').name = 'iceberg-gcp'
5354
project(':dell').name = 'iceberg-dell'
55+
project(':snowflake').name = 'iceberg-snowflake'
5456

5557
if (null != System.getProperty("allVersions")) {
5658
System.setProperty("flinkVersions", System.getProperty("knownFlinkVersions"))

0 commit comments

Comments
 (0)