[BUGFIX] Fix docs-snippets CI broken by sqlalchemy-redshift 1.0.0 #11857
The docs-snippets (docs-creds-needed) job started failing on every PR after sqlalchemy-redshift 1.0.0 hit PyPI on 2026-04-28. The previous release (0.8.14, from 2023) transitively pinned SQLAlchemy<2, which silently triggered a "skip under SQLA<2 + pandas>=2.2" block in test_script_runner and hid two latent bugs in the redshift/snowflake docs tests for months. With SQLA now resolving to 2.0.49, the tests run and surface those bugs:

1. aws_redshift_deployment_patterns.py referenced TupleS3StoreBackend (removed in #11675). Drop the three S3-store config sections from the snippet: the class no longer exists and the snippet is not referenced from any current docs MDX.
2. The two snowflake docs tests called sa.create_engine(connection_string) directly. CI moved Snowflake auth to key-pair via SNOWFLAKE_PRIVATE_KEY but the docs-snippets path didn't wire it through, producing "251006: Password is empty". Add get_snowflake_connection_kwargs() in tests/test_utils.py and route load_data_into_test_database and clean_up_tables_with_prefix through it. Update add_datasource() to use the explicit-fields add_snowflake() overload when key-pair auth is in use, since the connection_string overload can't carry a private key.

Also rename the misleading IS_RUNNING_SQLA_2_0_AND_PANDAS_2_2 variable to IS_RUNNING_SQLA_LT_2_AND_PANDAS_GTE_2_2 (the condition is sqla<2 AND pandas>=2.2) and audit the skip list. With the fixes above, the redshift and snowflake entries pass under SQLA 2; only expect_column_max_to_be_between_custom remains, since the to_sql/read_sql_table path it exercises is exactly what pandas 2.2 dropped SQLA<2 support for (#11417).
Codecov Report
✅ All modified and coverable lines are covered by tests.

| Coverage diff | develop | #11857 | +/- |
|---|---|---|---|
| Coverage | 83.90% | 84.79% | +0.89% |
| Files | 471 | 471 | |
| Lines | 39168 | 39168 | |
| Hits | 32864 | 33213 | +349 |
| Misses | 6304 | 5955 | -349 |
Pull request overview
Fixes docs-snippets (docs-creds-needed) CI failures that surfaced after sqlalchemy-redshift==1.0.0 caused installs to resolve to SQLAlchemy 2.x, unmasking previously skipped Redshift + Snowflake docs-snippet integration tests.
Changes:
- Updates Snowflake docs-snippet test utilities to support key-pair auth by forwarding Snowflake-specific connect_args into SQLAlchemy engine creation paths.
- Cleans up and renames the SQLAlchemy<2 + pandas>=2.2 skip gate, removing now-unnecessary skipped docs tests.
- Removes deprecated S3 store configuration blocks from the Redshift deployment-pattern snippet that referenced deleted store backends.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_utils.py | Adds Snowflake per-dialect engine kwargs to support key-pair auth in docs-snippet test database helpers and datasource creation. |
| tests/integration/test_script_runner.py | Renames/clarifies the SQLA<2 + pandas>=2.2 skip gate and narrows the skipped test list. |
| docs/docusaurus/docs/snippets/aws_redshift_deployment_patterns.py | Deletes snippet sections referencing removed TupleS3StoreBackend store config. |
    IS_RUNNING_SQLA_LT_2_AND_PANDAS_GTE_2_2 = (
        sqlalchemy.__version__ < "2.0" and pandas.__version__ >= "2.2"
    )
Version gating uses plain string comparisons (e.g., sqlalchemy.__version__ < "2.0"), which can produce incorrect results for versions like 2.10.0 (lexicographic vs semantic ordering). Use packaging.version.parse / packaging.version.Version (or sqlalchemy.util.compat helpers) to compare parsed versions instead of raw strings.
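To make the concern above concrete, here is a minimal sketch of the same gate written with parsed versions. It assumes the `packaging` library is importable in the test environment; the function name `is_sqla_lt_2_and_pandas_gte_2_2` is hypothetical and is not part of this PR.

```python
from packaging.version import Version


def is_sqla_lt_2_and_pandas_gte_2_2(sqlalchemy_version: str, pandas_version: str) -> bool:
    """Hypothetical helper: evaluate the skip gate on parsed versions.

    Version objects compare release components numerically, so a
    release such as 2.10.0 sorts after 2.2 as intended.
    """
    return Version(sqlalchemy_version) < Version("2.0") and Version(
        pandas_version
    ) >= Version("2.2")


# Plain string comparison is lexicographic: "1" sorts before "2",
# so the string "2.10.0" is considered smaller than "2.2".
string_result = "2.10.0" >= "2.2"
parsed_result = Version("2.10.0") >= Version("2.2")
```

In the test module this could be called as `is_sqla_lt_2_and_pandas_gte_2_2(sqlalchemy.__version__, pandas.__version__)`, which keeps the existing constant name while swapping only the comparison.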
Summary
`docs-snippets (docs-creds-needed)` has been failing on every PR opened against `develop` since 2026-04-28. Three integration tests fail consistently:
- `test_docs[deployment_patterns_redshift]`: `ClassInstantiationError: ExpectationsStore` (real cause: `AttributeError: module 'great_expectations.data_context.store' has no attribute 'TupleS3StoreBackend'`)
- `test_docs[partition_data_on_whole_table_snowflake]`: `snowflake.connector.errors.ProgrammingError: 251006: Password is empty`
- `test_docs[partition_data_on_datetime_snowflake]`: same as above

This PR fixes both root causes and cleans up the misleading skip block that was masking them.
Why
The trigger was the `sqlalchemy-redshift==1.0.0` release on 2026-04-28 (the previous release, `0.8.14`, was from 2023-04-07). Until then, the `>=0.8.8` floor in `reqs/requirements-dev-redshift.txt` resolved to `0.8.14`, which transitively pinned SQLAlchemy `<2` in the docs-creds-needed install. With the new `1.0.0` release, the install resolves to `sqlalchemy-2.0.49`.

That mattered because `tests/integration/test_script_runner.py::_check_for_skipped_tests` had a skip block for the (broken) combination of SQLA `<2` + pandas `>=2.2`. With SQLA pinned to 1.4 by the redshift transitive dependency, the skip silently fired and the redshift + snowflake docs tests didn't run for months. With SQLA back at 2, the skip stops firing and the tests run for the first time, exposing two latent bugs that had accumulated underneath:
- #11675 removed `TupleS3StoreBackend` but did not update `docs/docusaurus/docs/snippets/aws_redshift_deployment_patterns.py`, which still referenced it.
- CI moved Snowflake auth to key-pair (`SNOWFLAKE_PRIVATE_KEY`) but the docs-snippets code path was never updated to pass the key into `connect_args`, so the connector saw an empty password.

Changes
1. Drop the S3-store config sections from the redshift snippet.
`docs/docusaurus/docs/snippets/aws_redshift_deployment_patterns.py` had three sections that demonstrated configuring an S3-backed Expectations store, Validation Results store, and Data Docs site. All three reference `TupleS3StoreBackend`, which was deleted along with the rest of `tuple_store_backend.py` in #11675; the class no longer exists in OSS GX.

Verified that the only references to these snippet IDs (`new_expectations_store`, `new_validation_results_store`, `set_new_validation_results_store`, `add_data_docs_store`) are inside `docs/docusaurus/versioned_docs/version-0.18/`, which is a frozen snapshot and is unaffected. No current MDX file embeds them, so removing them is safe.

2. Wire Snowflake key-pair auth through the docs-snippets test path.
In `tests/test_utils.py`:
- Add `get_snowflake_connection_kwargs()`, which returns `{"connect_args": {"private_key": <base64 string>}}` when `SNOWFLAKE_PRIVATE_KEY` is set, else `{}`. Designed to be unpacked into `sa.create_engine(connection_string, **kwargs)`.
- Add an `_engine_kwargs_for(connection_string)` helper that picks the right kwargs based on the URL dialect.
- Route `load_data_into_test_database` (the function the failing tests trip over first) and `clean_up_tables_with_prefix` through it.
- Update `add_datasource()` so that when `SNOWFLAKE_PRIVATE_KEY` is set it switches to the explicit-fields overload of `add_snowflake()` (passing `account`, `user`, `private_key`, `database`, `schema`, `warehouse`, `role` from env). The `connection_string` overload of `add_snowflake()` can't carry a private key, so the connection-string path can't be used for key-pair auth.

3. Rename and audit the skip block in `tests/integration/test_script_runner.py`.
- `IS_RUNNING_SQLA_2_0_AND_PANDAS_2_2` → `IS_RUNNING_SQLA_LT_2_AND_PANDAS_GTE_2_2` (and the matching list constant). The condition is `sqla < "2.0" and pandas >= "2.2"`; the old name suggested the opposite.
- `partition_data_on_whole_table_snowflake`, `partition_data_on_datetime_snowflake`: fixed by (2). Removed from skip list.
- `partition_data_on_whole_table_redshift`, `partition_data_on_datetime_redshift`: already pass under SQLA 2 (visible in the latest CI run); no longer need the skip. Removed.
- `deployment_patterns_redshift`: fixed by (1). Removed.
- `expect_column_max_to_be_between_custom`: kept. It exercises `SqlAlchemyExecutionEngine` via pandas `to_sql`/`read_sql_table`, which is the exact code path pandas 2.2 dropped SQLA<2 compat for. The original rationale from [MAINTENANCE] Skip tests for SQLA < 2 and Pandas >= 2.2 #11417 still applies if anyone runs that combo.

Note: under current CI dependency resolution, no environment actually hits the SQLA<2 + pandas>=2.2 combination (`py3{10,11}-min-versions` pin pandas==1.4 alongside SQLA<2; everything else is SQLA 2 + pandas 2.3). The skip block is effectively dead today, but leaving it in keeps the historical guard-rail at zero cost. Removing it entirely is a separate cleanup.

User impact
None to end users. The redshift snippet was a contributor-facing integration-test fixture; the version of the page rendered on docs.greatexpectations.io is the frozen `version-0.18` copy and is unchanged.

How to review
- In the redshift snippet, the removed sections are `expectations_S3_store`, `validation_results_S3_store`, and the `S3_site` data-docs entry. What remains is the original "default-store sanity check" plus the redshift datasource setup.
- Check that the explicit-fields `add_snowflake()` overload accepts the env vars we read (`account`/`user`/`private_key`/`database`/`schema`/`warehouse`/`role`). The `private_key` overload signature is at `sources.pyi:696-712`.
- `docs-snippets (docs-creds-needed)` should go from failing to passing. The other two job variants (`docs-basic`, `docs-spark`) were already green and shouldn't be affected.

Closes the breakage on PRs #11854, #11855, #11856 and any other open PR currently red on `docs-snippets (docs-creds-needed)`.