
[BUGFIX] Fix docs-snippets CI broken by sqlalchemy-redshift 1.0.0 #11857

Merged
joshua-stauffer merged 2 commits into develop from m/docs-snippets/fix-redshift-snowflake-after-skip-broke
Apr 30, 2026

Conversation

@joshua-stauffer
Member

Summary

docs-snippets (docs-creds-needed) has been failing on every PR opened against develop since 2026-04-28 — three integration tests fail consistently:

  • test_docs[deployment_patterns_redshift]: ClassInstantiationError: ExpectationsStore (real cause: AttributeError: module 'great_expectations.data_context.store' has no attribute 'TupleS3StoreBackend')
  • test_docs[partition_data_on_whole_table_snowflake]: snowflake.connector.errors.ProgrammingError: 251006: Password is empty
  • test_docs[partition_data_on_datetime_snowflake]: same as above

This PR fixes both root causes and cleans up the misleading skip block that was masking them.

Why

The trigger was the sqlalchemy-redshift==1.0.0 release on 2026-04-28 (the previous release, 0.8.14, was from 2023-04-07). Until then, the >=0.8.8 floor in reqs/requirements-dev-redshift.txt resolved to 0.8.14, which transitively pinned SQLAlchemy <2 in the docs-creds-needed install. With the new 1.0.0 release, the install resolves to sqlalchemy-2.0.49.

That mattered because tests/integration/test_script_runner.py::_check_for_skipped_tests had a skip block for the (broken) combination of SQLA <2 + pandas >=2.2. With SQLA held at 1.4 by the redshift transitive dependency, the skip silently fired and the redshift + snowflake docs tests didn't run for months. With SQLA back at 2, the skip stops firing and the tests run for the first time in months, exposing two latent bugs that had accumulated underneath:

  1. #11675 ("[MAINTENANCE] remove deprecated store backends", merged 2026-02-26) deleted TupleS3StoreBackend but did not update docs/docusaurus/docs/snippets/aws_redshift_deployment_patterns.py, which still referenced it.
  2. CI rotated Snowflake auth to key-pair (SNOWFLAKE_PRIVATE_KEY) but the docs-snippets code path was never updated to pass the key into connect_args, so the connector saw an empty password.

Changes

1. Drop the S3-store config sections from the redshift snippet.

docs/docusaurus/docs/snippets/aws_redshift_deployment_patterns.py had three sections that demonstrated configuring an S3-backed Expectations store, Validation Results store, and Data Docs site. All three reference TupleS3StoreBackend, which was deleted along with the rest of tuple_store_backend.py in #11675 — the class no longer exists in OSS GX.

Verified that the only references to these snippet IDs (new_expectations_store, new_validation_results_store, set_new_validation_results_store, add_data_docs_store) are inside docs/docusaurus/versioned_docs/version-0.18/, which is a frozen snapshot and is unaffected. No current MDX file embeds them, so removing them is safe.
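
A reference check like the one described above could be scripted roughly as follows. This is a hypothetical sketch, not the command actually used for this PR: the function name `find_snippet_references`, the file suffixes, and the whole-identifier matching rule are all assumptions.

```python
import re
from pathlib import Path

# Snippet IDs listed in the PR description.
SNIPPET_IDS = (
    "new_expectations_store",
    "new_validation_results_store",
    "set_new_validation_results_store",
    "add_data_docs_store",
)


def find_snippet_references(docs_root, snippet_ids=SNIPPET_IDS, suffixes=(".md", ".mdx")):
    """Return {relative_path: [ids found]} for docs files that mention any ID.

    Hypothetical helper: IDs are matched as whole identifiers so that
    "new_validation_results_store" is not reported for a file that only
    contains "set_new_validation_results_store".
    """
    root = Path(docs_root)
    patterns = {
        sid: re.compile(r"(?<![A-Za-z0-9_])" + re.escape(sid) + r"(?![A-Za-z0-9_])")
        for sid in snippet_ids
    }
    hits = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = [sid for sid, pattern in patterns.items() if pattern.search(text)]
        if found:
            hits[path.relative_to(root).as_posix()] = found
    return hits
```

If every key in the returned mapping starts with versioned_docs/version-0.18/, the IDs are only used by the frozen snapshot.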

2. Wire Snowflake key-pair auth through the docs-snippets test path.

In tests/test_utils.py:

  • Added get_snowflake_connection_kwargs() that returns {"connect_args": {"private_key": <base64 string>}} when SNOWFLAKE_PRIVATE_KEY is set, else {}. Designed to be unpacked into sa.create_engine(connection_string, **kwargs).
  • Added a small _engine_kwargs_for(connection_string) helper that picks the right kwargs based on the URL dialect.
  • Routed load_data_into_test_database (the function the failing tests trip over first) and clean_up_tables_with_prefix through it.
  • Updated add_datasource() so that when SNOWFLAKE_PRIVATE_KEY is set it switches to the explicit-fields overload of add_snowflake() (passing account, user, private_key, database, schema, warehouse, role from env). The connection_string overload of add_snowflake() can't carry a private key, so the connection-string path can't be used for key-pair auth.

3. Rename and audit the skip block in tests/integration/test_script_runner.py.

  • Renamed IS_RUNNING_SQLA_2_0_AND_PANDAS_2_2 to IS_RUNNING_SQLA_LT_2_AND_PANDAS_GTE_2_2 (and the matching list constant). The condition is sqla < "2.0" and pandas >= "2.2"; the old name suggested the opposite.
  • Updated the skip message to "requires sqlalchemy >= 2 when running pandas >= 2.2", which matches the actual gate.
  • Audited the six entries that were being skipped:
    • partition_data_on_whole_table_snowflake, partition_data_on_datetime_snowflake — fixed by (2). Removed from skip list.
    • partition_data_on_whole_table_redshift, partition_data_on_datetime_redshift — already pass under SQLA 2 (visible in the latest CI run); no longer need the skip. Removed.
    • deployment_patterns_redshift — fixed by (1). Removed.
    • expect_column_max_to_be_between_custom — kept. It exercises SqlAlchemyExecutionEngine via pandas to_sql/read_sql_table, which is the exact code path pandas 2.2 dropped SQLA<2 compat for. The original rationale from [MAINTENANCE] Skip tests for SQLA < 2 and Pandas >= 2.2 #11417 still applies if anyone runs that combo.

Note: under current CI dependency resolution, no environment actually hits the SQLA<2 + pandas>=2.2 combination (py3{10,11}-min-versions pin pandas==1.4 alongside SQLA<2; everything else is SQLA 2 + pandas 2.3). The skip block is effectively dead today, but leaving it in keeps the historical guard-rail at zero cost. Removing it entirely is a separate cleanup.
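
The gate after the audit can be pictured with the small model below. It is a simplified sketch, not the code in test_script_runner.py: the constant name `TESTS_SKIPPED_UNDER_SQLA_LT_2_AND_PANDAS_GTE_2_2`, the function `skip_reason_for`, and passing versions as pre-parsed numbers are all assumptions.

```python
SKIP_REASON = "requires sqlalchemy >= 2 when running pandas >= 2.2"

# Hypothetical name for the list constant; after this PR only one entry remains.
TESTS_SKIPPED_UNDER_SQLA_LT_2_AND_PANDAS_GTE_2_2 = frozenset(
    {"expect_column_max_to_be_between_custom"}
)


def skip_reason_for(test_name, sqla_major, pandas_major_minor):
    """Return the skip message if the test must be skipped, else None.

    sqla_major is an int such as 1 or 2; pandas_major_minor is a tuple
    such as (2, 3). Both are assumed to be parsed by the caller.
    """
    gate_active = sqla_major < 2 and pandas_major_minor >= (2, 2)
    if gate_active and test_name in TESTS_SKIPPED_UNDER_SQLA_LT_2_AND_PANDAS_GTE_2_2:
        return SKIP_REASON
    return None
```

Under the CI combinations listed in the note (SQLA<2 with pandas 1.4, or SQLA 2 with pandas 2.3) this function returns None for every test, which is why the block is effectively dead today.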

User impact

None to end users. The redshift snippet was a contributor-facing integration-test fixture; the version of the page rendered on docs.greatexpectations.io is the frozen version-0.18 copy and is unchanged.

How to review

  • The most surprising change is the redshift snippet diff (~150 lines deleted). The deletion is mechanical: the three blocks that built expectations_S3_store, validation_results_S3_store, and the S3_site data-docs entry. What remains is the original "default-store sanity check" plus the redshift datasource setup.
  • For the snowflake change, the main thing to confirm is that the explicit-fields add_snowflake() overload accepts the env vars we read (account/user/private_key/database/schema/warehouse/role). The private_key overload signature is sources.pyi:696-712.
  • After this lands, docs-snippets (docs-creds-needed) should go from failing to passing. The other two job variants (docs-basic, docs-spark) were already green and shouldn't be affected.

Closes the breakage on PRs #11854, #11855, #11856 and any other open PR currently red on docs-snippets (docs-creds-needed).

The docs-snippets (docs-creds-needed) job started failing on every PR after
sqlalchemy-redshift 1.0.0 hit PyPI on 2026-04-28. The previous release
(0.8.14, from 2023) transitively pinned SQLAlchemy<2, which silently
triggered a "skip under SQLA<2 + pandas>=2.2" block in test_script_runner
and hid two latent bugs in the redshift/snowflake docs tests for months.
With SQLA staying at 2.0.49, the tests now run and surface those bugs:

1. aws_redshift_deployment_patterns.py referenced TupleS3StoreBackend
   (removed in #11675). Drop the three S3-store config sections from
   the snippet — the class no longer exists and the snippet is not
   referenced from any current docs MDX.

2. The two snowflake docs tests called sa.create_engine(connection_string)
   directly. CI moved Snowflake auth to key-pair via SNOWFLAKE_PRIVATE_KEY
   but the docs-snippets path didn't wire it through, producing
   "251006: Password is empty". Add get_snowflake_connection_kwargs() in
   tests/test_utils.py and route load_data_into_test_database and
   clean_up_tables_with_prefix through it. Update add_datasource() to use
   the explicit-fields add_snowflake() overload when key-pair auth is in
   use, since the connection_string overload can't carry a private key.

Also rename the misleading IS_RUNNING_SQLA_2_0_AND_PANDAS_2_2 variable to
IS_RUNNING_SQLA_LT_2_AND_PANDAS_GTE_2_2 (the condition is sqla<2 AND
pandas>=2.2) and audit the skip list. With the fixes above, the redshift
and snowflake entries pass under SQLA 2; only expect_column_max_to_be_between_custom
remains, since the to_sql/read_sql_table path it exercises is exactly
what pandas 2.2 dropped SQLA<2 support for (#11417).
Copilot AI review requested due to automatic review settings April 30, 2026 12:07
@netlify

netlify Bot commented Apr 30, 2026

Deploy Preview for niobium-lead-7998 canceled.

Name Link
🔨 Latest commit f7d9582
🔍 Latest deploy log https://app.netlify.com/projects/niobium-lead-7998/deploys/69f34c45a34444000863f940

@codecov

codecov Bot commented Apr 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.79%. Comparing base (a294ae6) to head (f7d9582).
⚠️ Report is 1 commit behind head on develop.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #11857      +/-   ##
===========================================
+ Coverage    83.90%   84.79%   +0.89%     
===========================================
  Files          471      471              
  Lines        39168    39168              
===========================================
+ Hits         32864    33213     +349     
+ Misses        6304     5955     -349     
Flag Coverage Δ
3.10 73.56% <ø> (ø)
3.10 athena ?
3.10 aws_deps ?
3.10 big ?
3.10 bigquery ?
3.10 clickhouse ?
3.10 databricks ?
3.10 filesystem ?
3.10 mysql ?
3.10 openpyxl or pyarrow or project or sqlite or aws_creds ?
3.10 postgresql ?
3.10 singlestore ?
3.10 snowflake ?
3.10 spark ?
3.10 spark_connect ?
3.10 sql_server ?
3.10 trino ?
3.11 73.60% <ø> (ø)
3.11 athena ?
3.11 aws_deps ?
3.11 big ?
3.11 bigquery ?
3.11 clickhouse ?
3.11 databricks ?
3.11 filesystem ?
3.11 mysql ?
3.11 openpyxl or pyarrow or project or sqlite or aws_creds ?
3.11 postgresql ?
3.11 singlestore ?
3.11 snowflake ?
3.11 spark ?
3.11 spark_connect ?
3.11 sql_server ?
3.11 trino ?
3.12 73.61% <ø> (+0.01%) ⬆️
3.12 athena ?
3.12 aws_deps ?
3.12 big ?
3.12 bigquery ?
3.12 databricks ?
3.12 filesystem ?
3.12 mysql ?
3.12 openpyxl or pyarrow or project or sqlite or aws_creds ?
3.12 postgresql ?
3.12 singlestore ?
3.12 snowflake ?
3.12 spark ?
3.12 spark_connect ?
3.12 sql_server ?
3.12 trino ?
3.13 73.61% <ø> (+0.01%) ⬆️
3.13 athena 41.85% <ø> (ø)
3.13 aws_deps 45.11% <ø> (ø)
3.13 big 55.19% <ø> (ø)
3.13 bigquery 51.19% <ø> (ø)
3.13 clickhouse 41.86% <ø> (ø)
3.13 databricks 52.99% <ø> (ø)
3.13 filesystem 64.30% <ø> (ø)
3.13 gx-redshift 51.34% <ø> (ø)
3.13 mysql 51.73% <ø> (ø)
3.13 openpyxl or pyarrow or project or sqlite or aws_creds 59.89% <ø> (ø)
3.13 postgresql 55.15% <ø> (ø)
3.13 singlestore 47.01% <ø> (ø)
3.13 snowflake 53.83% <ø> (-0.01%) ⬇️
3.13 spark 55.97% <ø> (ø)
3.13 spark_connect 46.77% <ø> (ø)
3.13 sql_server 53.15% <ø> (ø)
3.13 trino 48.66% <ø> (ø)
cloud 0.00% <ø> (ø)
docs-basic 59.43% <ø> (?)
docs-creds-needed 60.61% <ø> (?)
docs-spark 57.49% <ø> (?)


Contributor

Copilot AI left a comment


Pull request overview

Fixes docs-snippets (docs-creds-needed) CI failures that surfaced after sqlalchemy-redshift==1.0.0 caused installs to resolve to SQLAlchemy 2.x, unmasking previously skipped Redshift + Snowflake docs-snippet integration tests.

Changes:

  • Updates Snowflake docs-snippet test utilities to support key-pair auth by forwarding Snowflake-specific connect_args into SQLAlchemy engine creation paths.
  • Cleans up and renames the SQLAlchemy<2 + pandas>=2.2 skip gate, removing now-unnecessary skipped docs tests.
  • Removes deprecated S3 store configuration blocks from the Redshift deployment-pattern snippet that referenced deleted store backends.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_utils.py | Adds Snowflake per-dialect engine kwargs to support key-pair auth in docs-snippet test database helpers and datasource creation. |
| tests/integration/test_script_runner.py | Renames/clarifies the SQLA<2 + pandas>=2.2 skip gate and narrows the skipped test list. |
| docs/docusaurus/docs/snippets/aws_redshift_deployment_patterns.py | Deletes snippet sections referencing removed TupleS3StoreBackend store config. |


Comment on lines +508 to 510
IS_RUNNING_SQLA_LT_2_AND_PANDAS_GTE_2_2 = (
    sqlalchemy.__version__ < "2.0" and pandas.__version__ >= "2.2"
)

Copilot AI Apr 30, 2026


Version gating uses plain string comparisons (e.g., sqlalchemy.__version__ < "2.0"), which can produce incorrect results for versions like 2.10.0 (lexicographic vs semantic ordering). Use packaging.version.parse / packaging.version.Version (or sqlalchemy.util.compat helpers) to compare parsed versions instead of raw strings.
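
The pitfall is easiest to see on the pandas side of the gate. The sketch below uses a hypothetical stdlib-only helper, `release_tuple`, to compare numeric components; it is not what the PR or the reviewer wrote, and packaging.version.Version is probably the sturdier choice in real code because it also understands pre-release and dev tags.

```python
def release_tuple(version):
    """Parse the leading numeric components of a version string.

    Hypothetical helper for illustration: "2.0.49" -> (2, 0, 49).
    Parsing stops at the first component that does not start with a digit,
    so suffixes such as "rc1" or "dev0" are ignored rather than ordered.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_sqla_lt_2_and_pandas_gte_2_2(sqla_version, pandas_version):
    """Numeric version of the skip gate condition."""
    return release_tuple(sqla_version) < (2,) and release_tuple(pandas_version) >= (2, 2)


# String comparison is character by character, so "2.10.0" sorts before "2.2".
lexicographic = "2.10.0" >= "2.2"
numeric = release_tuple("2.10.0") >= (2, 2)
```

Here `lexicographic` is False while `numeric` is True, so a future pandas 2.10 would be treated as older than 2.2 by the string-based gate.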

@joshua-stauffer joshua-stauffer merged commit c004117 into develop Apr 30, 2026
64 checks passed
@joshua-stauffer joshua-stauffer deleted the m/docs-snippets/fix-redshift-snowflake-after-skip-broke branch April 30, 2026 14:21