[MAINTENANCE] Extract type comparison logic into dedicated module by wookasz · Pull Request #11798 · fivetran/great_expectations

wookasz · 2026-04-09T18:27:17Z

Summary

Extracts dialect-aware type comparison logic from ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList into a new type_comparison module at great_expectations/expectations/type_comparison.py
Public API: compare_column_type(), compare_column_type_list(), native_type_type_map(), and CASE_INSENSITIVE_DIALECTS constant
Both expectations now delegate SQLAlchemy validation to the module, reducing _validate_sqlalchemy() to 4 lines each
All consumers (metric modules, metrics/util.py) import directly from type_comparison — no re-exports
Eliminates the duplicated case_insensitive_dialects set in metrics/util.py
846 unit tests covering all dialect-specific type comparisons for PostgreSQL, Snowflake, Databricks, SQL Server, Trino, SQLite, and MySQL — both scalar and list functions tested independently per dialect

Context

This is Phase 1 of a two-phase effort to reduce Databricks/Snowflake CI runtime. By extracting and unit-testing the type comparison logic, Phase 2 can safely narrow most expectation integration tests to Postgres-only, since dialect-specific behavior is now covered by unit tests and metric-level integration tests.

Test plan

New unit tests pass: pytest tests/expectations/test_type_comparison.py
Existing unit tests unchanged: pytest tests/expectations/core/test_expect_column_values_to_be_of_type.py tests/expectations/core/test_expect_column_values_to_be_in_type_list.py
Ruff passes on all changed files
CI unit tests pass across all Python versions (3.10-3.13)
CI static-analysis (mypy) passes

Move dialect-aware type comparison logic from the two type expectations into a new `type_comparison` module with a clean public API: - `compare_column_type()` and `compare_column_type_list()` handle dialect dispatch internally (case-insensitive string comparison for 5 dialects, isinstance-based class matching for all others) - `native_type_type_map()` for pandas type resolution - `CASE_INSENSITIVE_DIALECTS` constant (eliminates duplicate in metrics/util.py) Both expectations now delegate to the module, and all consumers import directly from type_comparison rather than from the expectation files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

netlify · 2026-04-09T18:27:24Z

✅ Deploy Preview for niobium-lead-7998 canceled.

Name	Link
🔨 Latest commit	`a74796e`
🔍 Latest deploy log	https://app.netlify.com/projects/niobium-lead-7998/deploys/69d964ee2a02e0000852db27

for more information, see https://pre-commit.ci

codecov · 2026-04-09T18:29:42Z

Codecov Report

❌ Patch coverage is 82.11921% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.68%. Comparing base (160f568) to head (a74796e).
⚠️ Report is 67 commits behind head on develop.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
great_expectations/expectations/type_comparison.py	82.57%	23 Missing ⚠️
...ns/core/expect_column_values_to_be_in_type_list.py	66.66%	2 Missing ⚠️
...tations/core/expect_column_values_to_be_of_type.py	71.42%	2 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop   #11798      +/-   ##
===========================================
+ Coverage    84.67%   84.68%   +0.01%     
===========================================
  Files          471      472       +1     
  Lines        39179    39199      +20     
===========================================
+ Hits         33174    33197      +23     
+ Misses        6005     6002       -3

Flag	Coverage Δ
3.10	`73.72% <72.84%> (+0.13%)`	⬆️
3.10 athena	`?`
3.10 aws_deps	`?`
3.10 big	`?`
3.10 clickhouse	`?`
3.10 filesystem	`?`
3.10 mysql	`?`
3.10 openpyxl or pyarrow or project or sqlite or aws_creds	`?`
3.10 postgresql	`?`
3.10 spark	`?`
3.10 spark_connect	`?`
3.10 sql_server	`?`
3.10 trino	`?`
3.11	`73.76% <72.84%> (+0.13%)`	⬆️
3.11 athena	`?`
3.11 aws_deps	`?`
3.11 big	`?`
3.11 clickhouse	`?`
3.11 filesystem	`?`
3.11 mysql	`?`
3.11 openpyxl or pyarrow or project or sqlite or aws_creds	`?`
3.11 postgresql	`?`
3.11 spark_connect	`?`
3.11 sql_server	`?`
3.11 trino	`?`
3.12	`73.77% <72.84%> (+0.14%)`	⬆️
3.12 athena	`?`
3.12 aws_deps	`?`
3.12 big	`?`
3.12 filesystem	`?`
3.12 mysql	`?`
3.12 openpyxl or pyarrow or project or sqlite or aws_creds	`?`
3.12 postgresql	`?`
3.12 spark	`?`
3.12 spark_connect	`?`
3.12 sql_server	`?`
3.12 trino	`?`
3.13	`73.77% <72.84%> (+0.13%)`	⬆️
3.13 athena	`41.93% <23.17%> (+<0.01%)`	⬆️
3.13 aws_deps	`45.18% <23.17%> (+<0.01%)`	⬆️
3.13 big	`55.26% <32.45%> (-0.01%)`	⬇️
3.13 bigquery	`51.25% <52.31%> (-0.01%)`	⬇️
3.13 clickhouse	`41.94% <21.85%> (+<0.01%)`	⬆️
3.13 databricks	`53.07% <33.77%> (+0.01%)`	⬆️
3.13 filesystem	`64.36% <33.77%> (-0.01%)`	⬇️
3.13 gx-redshift	`51.40% <52.98%> (-0.01%)`	⬇️
3.13 mysql	`51.80% <50.99%> (-0.01%)`	⬇️
3.13 openpyxl or pyarrow or project or sqlite or aws_creds	`59.96% <50.99%> (-0.01%)`	⬇️
3.13 postgresql	`55.23% <33.77%> (+0.01%)`	⬆️
3.13 snowflake	`53.90% <33.77%> (+<0.01%)`	⬆️
3.13 spark	`55.91% <24.50%> (-0.01%)`	⬇️
3.13 spark_connect	`46.85% <23.17%> (+<0.01%)`	⬆️
3.13 sql_server	`53.24% <33.77%> (+0.01%)`	⬆️
3.13 trino	`48.74% <24.50%> (-0.01%)`	⬇️
cloud	`0.00% <0.00%> (ø)`
docs-basic	`59.52% <33.77%> (+<0.01%)`	⬆️
docs-creds-needed	`58.10% <24.50%> (-0.01%)`	⬇️
docs-spark	`57.56% <24.50%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copilot

Pull request overview

This PR extracts dialect-aware SQLAlchemy column-type comparison logic (used by ExpectColumnValuesToBeOfType and ExpectColumnValuesToBeInTypeList) into a new type_comparison module, and updates expectations/metrics to delegate to that shared implementation.

Changes:

Added great_expectations.expectations.core.type_comparison with shared helpers (compare_column_type, compare_column_type_list, native_type_type_map, CASE_INSENSITIVE_DIALECTS).
Updated both type expectations and relevant column-map metrics to use the new module (removing duplicated/embedded logic).
Updated metrics column metadata handling to reuse the shared CASE_INSENSITIVE_DIALECTS constant and added a new unit test suite for the module.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
tests/expectations/core/test_type_comparison.py	Adds unit tests for the new type comparison module.
great_expectations/expectations/metrics/util.py	Reuses shared `CASE_INSENSITIVE_DIALECTS` instead of a local duplicate.
great_expectations/expectations/metrics/column_map_metrics/column_values_of_type.py	Switches pandas native type mapping to `native_type_type_map()`.
great_expectations/expectations/metrics/column_map_metrics/column_values_in_type_list.py	Switches pandas native type mapping to `native_type_type_map()`.
great_expectations/expectations/core/type_comparison.py	New consolidated module implementing dialect-aware type comparisons.
great_expectations/expectations/core/expect_column_values_to_be_of_type.py	Delegates SQLAlchemy type comparison and native type mapping to `type_comparison`.
great_expectations/expectations/core/expect_column_values_to_be_in_type_list.py	Delegates SQLAlchemy type-list comparison and native type mapping to `type_comparison`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-09T18:31:12Z

+        # CaseInsensitiveString objects will automatically do case-insensitive comparison
+        return actual_column_type == expected_type
+    return str(actual_column_type).lower() == expected_type.lower()


_compare_type_string() claims to do case-insensitive comparison, but for plain str it currently uses actual_column_type == expected_type (case-sensitive). This is inconsistent with compare_column_type_list() (which lowercases) and with the module/docstring intent. Consider normalizing both sides (e.g., casefold()/lower()) for plain strings while still allowing CaseInsensitiveString.__eq__ to handle its own semantics.

Suggested change

# CaseInsensitiveString objects will automatically do case-insensitive comparison

return actual_column_type == expected_type

return str(actual_column_type).lower() == expected_type.lower()

# Preserve custom __eq__ behavior for str subclasses such as

# CaseInsensitiveString, but normalize plain str values explicitly.

if type(actual_column_type) is str:

return actual_column_type.casefold() == expected_type.casefold()

return actual_column_type == expected_type

return str(actual_column_type).casefold() == expected_type.casefold()

Copilot · 2026-04-09T18:31:13Z

+    execution_engine: SqlAlchemyExecutionEngine,
+) -> ModuleType:
+    if execution_engine.dialect_module is None:
+        logger.warning("No sqlalchemy dialect found; relying in top-level sqlalchemy types.")


Log message has a grammatical typo: "relying in top-level sqlalchemy types" should be "relying on top-level sqlalchemy types".

Suggested change

logger.warning("No sqlalchemy dialect found; relying in top-level sqlalchemy types.")

logger.warning("No sqlalchemy dialect found; relying on top-level sqlalchemy types.")

Copilot · 2026-04-09T18:31:13Z

+def _mock_engine(mocker, dialect_name):
+    """Create a mock SqlAlchemyExecutionEngine with the given dialect_name."""
+    engine = mocker.MagicMock()
+    type(engine).dialect_name = mocker.PropertyMock(return_value=dialect_name)


_mock_engine() sets type(engine).dialect_name using PropertyMock, which mutates the shared MagicMock class and can leak between tests. Prefer setting engine.dialect_name = <value> on the instance, or use a small stub class/object with a dialect_name attribute to avoid global side effects.

Suggested change

type(engine).dialect_name = mocker.PropertyMock(return_value=dialect_name)

engine.dialect_name = dialect_name

Copilot · 2026-04-09T18:31:13Z

+        mock_type = mocker.MagicMock()
+        mock_type.__str__ = lambda self: "INTEGER"
+        type(mock_type).__name__ = "INTEGER"


These tests mutate the MagicMock class by assigning type(mock_type).__name__ = "INTEGER", which can affect other mocks and make the suite order-dependent. Use a dedicated dummy type/class instance (e.g., a small local class with __name__ and __str__) instead of changing the global mock class metadata.

Suggested change

mock_type = mocker.MagicMock()

mock_type.__str__ = lambda self: "INTEGER"

type(mock_type).__name__ = "INTEGER"

class INTEGER:

def __str__(self):

return "INTEGER"

mock_type = INTEGER()

Copilot · 2026-04-09T18:31:14Z

+    @pytest.mark.parametrize("dialect", list(CASE_INSENSITIVE_DIALECTS))
+    def test_case_insensitive_match(self, mocker, dialect):
+        engine = _mock_engine(mocker, dialect)
+        success, _observed = compare_column_type(engine, "integer", "integer")


test_case_insensitive_match doesn't currently assert mixed-case behavior (it compares "integer" to "integer"). To verify the case-insensitive path, use differing cases (e.g., actual "INTEGER" vs expected "integer") and/or use CaseInsensitiveString as the observed value since that's what production code wraps types with for these dialects.

Suggested change

success, _observed = compare_column_type(engine, "integer", "integer")

success, _observed = compare_column_type(engine, "INTEGER", "integer")

Copilot · 2026-04-09T18:31:14Z

+
+# ---------------------------------------------------------------------------
+# compare_column_type_list — isinstance path
+# ---------------------------------------------------------------------------
+
+
+class TestCompareColumnTypeListByClass:
+    def test_match_in_list(self, mocker):
+        engine = _mock_engine(mocker, "sqlite")


Same issue as above: this test mutates type(mock_type).__name__, which can leak globally and cause flaky failures. Prefer a purpose-built dummy class/instance for the non-string type case.

- Fix _compare_type_string to use casefold() for plain str while preserving CaseInsensitiveString.__eq__ for subclasses - Fix typo: "relying in" -> "relying on" - Replace MagicMock with lightweight stub classes to avoid mutating shared mock class metadata between tests - Fix test_case_insensitive_match to actually test mixed case Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

for more information, see https://pre-commit.ci

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T15:06:51Z

+def native_type_type_map(type_: str) -> tuple | None:  # noqa: C901, PLR0911
+    """Map a string type name to a tuple of native Python types.
+
+    Used by pandas validation paths to resolve type names like "int", "str", etc.
+    Returns None for unrecognized types.
+    """
+    if type_.lower() == "none":
+        return (type(None),)
+    elif type_.lower() == "bool":
+        return (bool,)
+    elif type_.lower() in ["int", "long"]:
+        return (int,)
+    elif type_.lower() == "float":
+        return (float,)
+    elif type_.lower() == "bytes":
+        return (bytes,)
+    elif type_.lower() == "complex":
+        return (complex,)
+    elif type_.lower() in ["str", "string_types"]:
+        return (str,)
+    elif type_.lower() == "list":
+        return (list,)
+    elif type_.lower() == "dict":
+        return (dict,)
+    elif type_.lower() == "unicode":
+        return None
+


native_type_type_map() has an imprecise return type (tuple | None) and falls off the end implicitly for unknown types. Since this is part of the new public API, tighten the signature (e.g., tuple[type, ...] | None) and add an explicit return None for unrecognized values to make behavior and typing unambiguous.

Copilot · 2026-04-10T15:06:51Z

+                actual_column_type.lower() == expected_type.lower()
+                for expected_type in expected_types_list
+            )
+            return success, actual_column_type
+        else:
+            ret_type = type(actual_column_type).__name__
+            success = any(
+                ret_type.lower() == expected_type.lower() for expected_type in expected_types_list


Case-insensitive comparisons are implemented with .casefold() in _compare_type_string(), but compare_column_type_list() uses .lower(). Since this module is meant to centralize dialect-aware comparisons (and already notes Unicode correctness), it would be more consistent/robust to use .casefold() in the list path as well.

Suggested change

actual_column_type.lower() == expected_type.lower()

for expected_type in expected_types_list

)

return success, actual_column_type

else:

ret_type = type(actual_column_type).__name__

success = any(

ret_type.lower() == expected_type.lower() for expected_type in expected_types_list

actual_column_type.casefold() == expected_type.casefold()

for expected_type in expected_types_list

)

return success, actual_column_type

else:

ret_type = type(actual_column_type).__name__

success = any(

ret_type.casefold() == expected_type.casefold()

for expected_type in expected_types_list

Add comprehensive unit tests covering actual type names for each dialect: - PostgreSQL: CHAR, TEXT, VARCHAR, INTEGER, SMALLINT, BIGINT, TIMESTAMP, DATE, DOUBLE PRECISION, BOOLEAN, NUMERIC + case insensitivity - Snowflake: STRING, TEXT, CHARACTER, BYTEINT, TINYINT, INTEGER, BIGINT, FLOAT, DOUBLE, DECIMAL(38,0), FIXED, DEC, NUMBER, DATE, TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ, TIME, VARIANT, VARBINARY, GEOGRAPHY, GEOMETRY - Databricks: STRING, INT, BIGINT, SMALLINT, TINYINT, BOOLEAN, FLOAT, DOUBLE, DECIMAL, DATE, TIMESTAMP, TIMESTAMP_NTZ - SQL Server: INTEGER, BIGINT, SMALLINT, FLOAT, REAL, NUMERIC, DECIMAL, VARCHAR, NVARCHAR, CHAR, BIT, DATE, DATETIME, DATETIME2 - Trino: INTEGER, BIGINT, SMALLINT, TINYINT, DOUBLE, DECIMAL, VARCHAR, CHAR, BOOLEAN, DATE, TIMESTAMP - SQLite: isinstance path with sa.types for all common types + list tests - MySQL: isinstance path with sa.types for all common types - CaseInsensitiveString: quoted vs unquoted behavior in both scalar and list comparison paths Total: 177 unit tests (up from 63). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

for more information, see https://pre-commit.ci

…ar and list Rewrite test suite to independently test both compare_column_type (scalar) and compare_column_type_list (list) for every dialect and type — no assumption that they share implementation. Dialect coverage (scalar + list for each type): - PostgreSQL: 19 types (CHAR through INTERVAL) - Snowflake: 29 types (STRING through BOOLEAN, incl DECIMAL variants) - Databricks: 17 types (STRING through STRUCT) - SQL Server: 27 types (INTEGER through XML) - Trino: 20 types (INTEGER through ROW) - SQLite: 14 types via sa.types isinstance path - MySQL: 14 types via sa.types isinstance path - CaseInsensitiveString: quoted/unquoted behavior tested independently for both scalar and list paths - Non-string type fallback: tested for all case-insensitive dialects Total: 831 unit tests (up from 177). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

for more information, see https://pre-commit.ci

The module serves both expectations/core/ and expectations/metrics/ equally — it's shared infrastructure, not specific to core. Moving it to expectations/type_comparison.py gives a cleaner import path and avoids metrics importing from expectations.core. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Test file now lives at tests/expectations/test_type_comparison.py, mirroring the module at expectations/type_comparison.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Sort CASE_INSENSITIVE_DIALECTS parametrize values for deterministic test ID ordering across pytest-xdist workers - Add explicit return None to native_type_type_map for mypy - Relax _get_redshift_sqlalchemy_types return type annotation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

for more information, see https://pre-commit.ci

…stently - Tighten native_type_type_map return type to tuple[type, ...] | None - Use casefold() instead of lower() in compare_column_type_list for consistency with the scalar path and better Unicode handling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…g dialect tests - Add comment explaining why CASE_INSENSITIVE_DIALECTS uses GXSqlDialect enum members (GXSqlDialect.__eq__ handles cross-type str comparison) - Add test_string_membership_via_gxsqldialect_eq to prove str 'in' checks work against the frozenset - Add TestStringDialectName class exercising both compare_column_type and compare_column_type_list with plain string dialect names (as returned by SqlAlchemyExecutionEngine.dialect_name in production) Total: 846 tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

for more information, see https://pre-commit.ci

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T16:11:57Z

+    import teradatasqlalchemy.dialect
+    import teradatasqlalchemy.types as teradatatypes
+except ImportError:
+    teradatasqlalchemy = None


In the teradatasqlalchemy import fallback, teradatatypes is not set when the import fails. _get_dialect_type_module() later references teradatatypes, which will raise NameError in environments without teradatasqlalchemy installed. Set teradatatypes = None in the except ImportError block (mirrors patterns used elsewhere in the repo).

Suggested change

teradatasqlalchemy = None

teradatasqlalchemy = None

teradatatypes = None

Copilot · 2026-04-10T16:11:57Z

+    def test_list_uses_lower_not_eq(self):
+        """The type-list path uses .lower() comparison, so quoted semantics


This test/docstring says the type-list path uses .lower() comparison, but the implementation in compare_column_type_list() now uses .casefold() for comparison. Please update the wording to match the current behavior (or adjust implementation if .lower() is intended).

Suggested change

def test_list_uses_lower_not_eq(self):

"""The type-list path uses .lower() comparison, so quoted semantics

def test_list_uses_casefold_not_eq(self):

"""The type-list path uses .casefold() comparison, so quoted semantics

Copilot · 2026-04-10T16:11:57Z

+"""Unit tests for great_expectations.expectations.type_comparison.
+
+Every dialect tests BOTH compare_column_type (scalar) and
+compare_column_type_list (list) independently — no assumption is made
+that they share an implementation.
+"""


PR description/test plan references pytest tests/expectations/core/test_type_comparison.py, but the added unit tests are in tests/expectations/test_type_comparison.py. Please update the PR description (or move/rename the test file) so the documented command matches the repo layout.

Guard teradatasqlalchemy and clickhouse_sqlalchemy with explicit 'is not None' checks before issubclass() calls so mypy doesn't see attribute access on potentially-None modules. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Set teradatatypes = None in import fallback to match clickhouse pattern and prevent potential NameError - Update test docstring from .lower() to .casefold() after consistency fix Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dialect_module is typed as Module but at runtime can be a dialect class for Teradata and ClickHouse. The try/except TypeError handles the case where it's not a class. This matches the original code's behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T16:30:46Z

+"""Unit tests for great_expectations.expectations.type_comparison.
+
+Every dialect tests BOTH compare_column_type (scalar) and
+compare_column_type_list (list) independently — no assumption is made
+that they share an implementation.
+"""


PR description/test plan references running pytest tests/expectations/core/test_type_comparison.py, but the new unit tests are added at tests/expectations/test_type_comparison.py. This mismatch will likely confuse CI/local runs; please either update the PR description/test plan or move the test file into the documented location (and adjust any conventions for this repo’s expectations tests accordingly).

Already fixed — the PR description was updated in a prior commit to reference the correct path (tests/expectations/test_type_comparison.py).

Copilot · 2026-04-10T16:30:46Z

    # Apply case-insensitive formatting for specific dialects
+    from great_expectations.expectations.type_comparison import CASE_INSENSITIVE_DIALECTS
+
    dialect_name = execution_engine.dialect.name
-    case_insensitive_dialects = {
-        GXSqlDialect.DATABRICKS,
-        GXSqlDialect.POSTGRESQL,
-        GXSqlDialect.SNOWFLAKE,
-        GXSqlDialect.SQL_SERVER,
-        GXSqlDialect.TRINO,
-    }
-    if dialect_name in case_insensitive_dialects:
+    if dialect_name in CASE_INSENSITIVE_DIALECTS:


Importing CASE_INSENSITIVE_DIALECTS from great_expectations.expectations.type_comparison here pulls in a module that performs optional third-party imports at import time (e.g., teradatasqlalchemy/clickhouse). Since this helper is on a hot path for column metadata reflection, consider relocating CASE_INSENSITIVE_DIALECTS to a lightweight module (e.g., sqlalchemy_dialect) or making those optional imports lazy inside the helper functions that require them, to avoid unnecessary import overhead/side effects.

The import in metrics/util.py is already deferred (inside the function body, not at module level), so it doesn't affect the hot path. The try/except imports of teradatasqlalchemy and clickhouse in type_comparison.py are cheap no-ops when packages aren't installed — they just set variables to None. Moving CASE_INSENSITIVE_DIALECTS to sqlalchemy_dialect.py would reduce coupling but would also separate the constant from the comparison logic it governs. Not worth the trade-off for this PR.

…tion Verify that both type expectations correctly delegate to the type_comparison module through their public validator API: ExpectColumnValuesToBeOfType: - isinstance path (SQLite native): success + failure - case-insensitive path (all 5 dialects): success + failure ExpectColumnValuesToBeInTypeList: - isinstance path (SQLite native): success + failure - case-insensitive path (all 5 dialects): success + failure These tests ensure a refactoring mistake that breaks the wiring between the expectations and compare_column_type/compare_column_type_list would be caught, without depending on the module's internal unit tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

for more information, see https://pre-commit.ci

…ed delegation tests The previous wiring tests proved end-to-end results were correct but didn't prove the expectations actually delegate to compare_column_type / compare_column_type_list. Someone could reimplement the logic inline and the tests would still pass. New tests monkeypatch the compare functions with controlled return values and assert: - The function was called exactly once - It received the correct execution_engine and expected_type arguments - The expectation result reflects the patched return value (success and observed_value), not independently computed logic Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T19:40:06Z

+        # bigquery geography requires installing an extra package
+        if (
+            expected_type.lower() == "geography"
+            and execution_engine.engine.dialect.name.lower() == GXSqlDialect.BIGQUERY


The BigQuery dialect check compares a lowercase dialect name string to GXSqlDialect.BIGQUERY (an Enum). Since the string is on the left-hand side, this comparison will always be False and the GEOGRAPHY support warning branch will never run. Compare against GXSqlDialect.BIGQUERY.value (or invert the comparison / use execution_engine.dialect_name == GXSqlDialect.BIGQUERY) so the condition can be reached.

Suggested change

and execution_engine.engine.dialect.name.lower() == GXSqlDialect.BIGQUERY

and execution_engine.engine.dialect.name.lower() == GXSqlDialect.BIGQUERY.value

This comparison works correctly — Python's str.__eq__ returns NotImplemented for non-str types, which causes Python to fall back to GXSqlDialect.__eq__ on the right-hand side. Verified locally: 'bigquery' == GXSqlDialect.BIGQUERY evaluates to True. This is inherited code with the same behavior as the original.

Copilot · 2026-04-10T19:40:07Z

+        (success, observed_value) where observed_value is the raw type string
+        for case-insensitive dialects or type(actual_column_type).__name__ otherwise.
+    """
+    if execution_engine.dialect_name in CASE_INSENSITIVE_DIALECTS:
+        success = _compare_type_string(actual_column_type, expected_type)
+        return success, actual_column_type


For case-insensitive dialects, compare_column_type() always returns actual_column_type as observed_value even when it's not a string (the comparison falls back to str(actual_column_type) but the return value does not). This contradicts the docstring (“raw type string”) and can yield a non-serializable object in the expectation result. Consider returning str(actual_column_type) (or the computed normalized string) when actual_column_type is not a str/CaseInsensitiveString.

Suggested change

(success, observed_value) where observed_value is the raw type string

for case-insensitive dialects or type(actual_column_type).__name__ otherwise.

"""

if execution_engine.dialect_name in CASE_INSENSITIVE_DIALECTS:

success = _compare_type_string(actual_column_type, expected_type)

return success, actual_column_type

(success, observed_value) where observed_value is the type string

for case-insensitive dialects or type(actual_column_type).__name__ otherwise.

"""

if execution_engine.dialect_name in CASE_INSENSITIVE_DIALECTS:

success = _compare_type_string(actual_column_type, expected_type)

observed_value = (

actual_column_type

if isinstance(actual_column_type, str)

else str(actual_column_type)

)

return success, observed_value

Good catch on the docstring — fixed. The return value is actual_column_type as-is (typically a CaseInsensitiveString, which is a str subclass). The non-string fallback in _compare_type_string is a defensive path that shouldn't trigger in production since case-insensitive dialects always receive CaseInsensitiveString from get_sqlalchemy_column_metadata. Updated the docstring to be precise about what's returned.

…e return value Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace manual call tracking with mocker.patch and assert_called_once() for both expectations. Both success and failure tests now consistently: - Assert the compare function was called exactly once - Assert execution_engine and expected_type(s) arguments - Assert the expectation result reflects the mock's return value Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replace assert_called_once + manual call_args inspection with assert_called_once_with(engine, mocker.ANY, expected_type) for cleaner, more idiomatic mock assertions. mocker.ANY matches the actual_column_type argument which is an internal SQLAlchemy type instance we don't control. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-10T21:06:07Z

+        if BigQueryDialect and (
+            isinstance(
+                execution_engine.dialect_module,
+                BigQueryDialect,
+            )
+            and bigquery_types_tuple is not None
+        ):
+            return bigquery_types_tuple
+    except (TypeError, AttributeError):


In _get_dialect_type_module(), this BigQuery branch can never be taken as written: BigQueryDialect is imported as the sqlalchemy_bigquery module (see great_expectations/compatibility/bigquery.py), so passing it as the 2nd arg to isinstance() raises TypeError and the condition always falls through. If you still need this patching logic, compare modules directly (e.g., execution_engine.dialect_module.__name__ == "sqlalchemy_bigquery") or import/check against the actual BigQuery dialect class instead of the module.

Suggested change

if BigQueryDialect and (

isinstance(

execution_engine.dialect_module,

BigQueryDialect,

)

and bigquery_types_tuple is not None

):

return bigquery_types_tuple

except (TypeError, AttributeError):

if (

BigQueryDialect

and bigquery_types_tuple is not None

and (

execution_engine.dialect_module is BigQueryDialect

or getattr(execution_engine.dialect_module, "__name__", None)

== getattr(BigQueryDialect, "__name__", None)

)

):

return bigquery_types_tuple

except AttributeError:

This is inherited code moved verbatim from expect_column_values_to_be_of_type.py. The try/except (TypeError, AttributeError) around each check is intentional — it handles the cases where dialect_module is a module rather than a class. Cleaning up these branches to use isinstance(dialect_module, type) guards or restructuring the dispatch is valid follow-up work but outside the scope of this extraction PR.

Copilot · 2026-04-10T21:06:07Z

+            and issubclass(
+                execution_engine.dialect_module,  # type: ignore[arg-type] # dialect_module can be a class


These Teradata checks appear to be dead code: SqlAlchemyExecutionEngine.dialect_module is set via import_library_module(...) and is therefore a ModuleType, but issubclass(execution_engine.dialect_module, TeradataDialect) will always raise TypeError and be ignored. If the intent is to switch to teradatatypes when the active dialect is Teradata, consider checking execution_engine.engine.dialect (instance) or execution_engine.dialect_module.__name__ instead of using issubclass() on the imported module.

Suggested change

and issubclass(

execution_engine.dialect_module, # type: ignore[arg-type] # dialect_module can be a class

and isinstance(

execution_engine.dialect,

This is inherited code moved verbatim from expect_column_values_to_be_of_type.py. The try/except (TypeError, AttributeError) around each check is intentional — it handles the cases where dialect_module is a module rather than a class. Cleaning up these branches to use isinstance(dialect_module, type) guards or restructuring the dispatch is valid follow-up work but outside the scope of this extraction PR.

Copilot · 2026-04-10T21:06:08Z

+            and issubclass(
+                execution_engine.dialect_module,  # type: ignore[arg-type] # dialect_module can be a class


Similarly, this ClickHouse branch uses issubclass(execution_engine.dialect_module, ClickHouseDialect) but dialect_module is imported as a module, not a class, so this will always TypeError and the ch_types module will never be selected. If you want to return clickhouse_sqlalchemy.types here, consider switching to a module-name check (or check execution_engine.engine.dialect against the dialect class).

Suggested change

and issubclass(

execution_engine.dialect_module, # type: ignore[arg-type] # dialect_module can be a class

and isinstance(

execution_engine.dialect,

This is inherited code moved verbatim from expect_column_values_to_be_of_type.py. The try/except (TypeError, AttributeError) around each check is intentional — it handles the cases where dialect_module is a module rather than a class. Cleaning up these branches to use isinstance(dialect_module, type) guards or restructuring the dispatch is valid follow-up work but outside the scope of this extraction PR.

github-actions · 2026-06-09T01:06:39Z

Is this PR still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity.

It will be closed if no further activity occurs. Thank you for your contributions 🙇

Copilot AI review requested due to automatic review settings April 9, 2026 18:27

[pre-commit.ci] auto fixes from pre-commit.com hooks

4cf6d34

for more information, see https://pre-commit.ci

Copilot started reviewing on behalf of wookasz April 9, 2026 18:27 View session

Copilot AI reviewed Apr 9, 2026

View reviewed changes

wookasz and others added 2 commits April 9, 2026 20:25

[pre-commit.ci] auto fixes from pre-commit.com hooks

522c2c7

for more information, see https://pre-commit.ci

Copilot AI review requested due to automatic review settings April 10, 2026 03:26

wookasz removed the request for review from Copilot April 10, 2026 03:26

Merge branch 'develop' into m/extract-type-comparison-module

eb4c5d6

Copilot AI review requested due to automatic review settings April 10, 2026 14:59

Copilot started reviewing on behalf of wookasz April 10, 2026 15:00 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

wookasz and others added 2 commits April 10, 2026 08:14

[pre-commit.ci] auto fixes from pre-commit.com hooks

ee91779

for more information, see https://pre-commit.ci

Copilot AI review requested due to automatic review settings April 10, 2026 15:15

wookasz removed the request for review from Copilot April 10, 2026 15:15

Copilot AI review requested due to automatic review settings April 10, 2026 15:24

[pre-commit.ci] auto fixes from pre-commit.com hooks

2cc59e9

for more information, see https://pre-commit.ci

Copilot started reviewing on behalf of wookasz April 10, 2026 15:25 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

wookasz and others added 2 commits April 10, 2026 08:40

[MAINTENANCE] Move test_type_comparison.py to match module location

5a39b29

Test file now lives at tests/expectations/test_type_comparison.py, mirroring the module at expectations/type_comparison.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 10, 2026 15:49

[pre-commit.ci] auto fixes from pre-commit.com hooks

52edbcf

for more information, see https://pre-commit.ci

Copilot started reviewing on behalf of wookasz April 10, 2026 15:49 View session

wookasz and others added 2 commits April 10, 2026 08:57

Copilot AI review requested due to automatic review settings April 10, 2026 16:04

[pre-commit.ci] auto fixes from pre-commit.com hooks

745f6dd

for more information, see https://pre-commit.ci

Copilot started reviewing on behalf of wookasz April 10, 2026 16:05 View session

Trigger CI

45b796e

Copilot AI reviewed Apr 10, 2026

View reviewed changes

wookasz and others added 2 commits April 10, 2026 09:20

Copilot AI review requested due to automatic review settings April 10, 2026 16:23

Copilot started reviewing on behalf of wookasz April 10, 2026 16:24 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

wookasz and others added 2 commits April 10, 2026 09:47

[pre-commit.ci] auto fixes from pre-commit.com hooks

d260d8c

for more information, see https://pre-commit.ci

Copilot AI review requested due to automatic review settings April 10, 2026 16:47

wookasz removed the request for review from Copilot April 10, 2026 16:47

wookasz and others added 2 commits April 10, 2026 11:12

Trigger CI

4c6fb76

Copilot AI review requested due to automatic review settings April 10, 2026 19:35

Copilot started reviewing on behalf of wookasz April 10, 2026 19:35 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

wookasz and others added 2 commits April 10, 2026 13:54

[MAINTENANCE] Fix compare_column_type docstring to accurately describ…

af838d5

…e return value Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 10, 2026 20:58

Copilot started reviewing on behalf of wookasz April 10, 2026 20:59 View session

Copilot AI reviewed Apr 10, 2026

View reviewed changes

github-actions Bot added the stale Stale issues and PRs label Jun 9, 2026

-        # CaseInsensitiveString objects will automatically do case-insensitive comparison
-        return actual_column_type == expected_type
-    return str(actual_column_type).lower() == expected_type.lower()
+        # Preserve custom __eq__ behavior for str subclasses such as
+        # CaseInsensitiveString, but normalize plain str values explicitly.
+        if type(actual_column_type) is str:
+            return actual_column_type.casefold() == expected_type.casefold()
+        return actual_column_type == expected_type
+    return str(actual_column_type).casefold() == expected_type.casefold()

	logger.warning("No sqlalchemy dialect found; relying in top-level sqlalchemy types.")
	logger.warning("No sqlalchemy dialect found; relying on top-level sqlalchemy types.")

	type(engine).dialect_name = mocker.PropertyMock(return_value=dialect_name)
	engine.dialect_name = dialect_name

-        mock_type = mocker.MagicMock()
-        mock_type.__str__ = lambda self: "INTEGER"
-        type(mock_type).__name__ = "INTEGER"
+        class INTEGER:
+            def __str__(self):
+                return "INTEGER"
+        mock_type = INTEGER()

	success, _observed = compare_column_type(engine, "integer", "integer")
	success, _observed = compare_column_type(engine, "INTEGER", "integer")

	teradatasqlalchemy = None
	teradatasqlalchemy = None
	teradatatypes = None

		def test_list_uses_lower_not_eq(self):
		"""The type-list path uses .lower() comparison, so quoted semantics

	and execution_engine.engine.dialect.name.lower() == GXSqlDialect.BIGQUERY
	and execution_engine.engine.dialect.name.lower() == GXSqlDialect.BIGQUERY.value

		and issubclass(
		execution_engine.dialect_module, # type: ignore[arg-type] # dialect_module can be a class

Conversation

wookasz commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Context

Test plan

Uh oh!

netlify Bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for niobium-lead-7998 canceled.

Uh oh!

codecov Bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

wookasz Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

wookasz Apr 10, 2026

Choose a reason for hiding this comment

wookasz commented Apr 9, 2026 •

edited

Loading

netlify Bot commented Apr 9, 2026 •

edited

Loading

codecov Bot commented Apr 9, 2026 •

edited

Loading