Skip to content

add-option-to-continue-on-script-error#339

Open
luisggc wants to merge 3 commits intoSnowflake-Labs:masterfrom
luisggc:add-option-to-continue-on-script-error
Open

add-option-to-continue-on-script-error#339
luisggc wants to merge 3 commits intoSnowflake-Labs:masterfrom
luisggc:add-option-to-continue-on-script-error

Conversation

@luisggc
Copy link

@luisggc luisggc commented Aug 2, 2025

Continue-on-Error Enhancements

Overview

  • Introduced optional flags to allow deployments to continue after script failures.
  • Full error messages are recorded and failed scripts are listed upon completion.
  • New flags are documented in the README so users can enable them via the CLI when needed.

Configuration

  • Added configuration fields to capture continue-on-error behavior.
  • Default behavior remains fail-fast unless explicitly enabled.
  • Exposed CLI options for:
    • repeatable scripts
    • always scripts
    • all scripts
  • --continue-all-on-error acts as shorthand for enabling all script-type options (except versioned).
  • If -- schemachange-no-jinja is present in the sql, skip the jinja parsing
Parameter Environment Variable Description
--continue-all-on-error Continue executing remaining repeatable and always scripts even if one fails. Versioned scripts always stop on failure (default: false)
--continue-repeatable-on-error Continue executing remaining repeatable scripts after an error (default: false)
--continue-always-on-error Continue executing remaining always scripts after an error (default: false)

Deployment Behavior

  • Deployment loop now:
    • Tallies failed scripts
    • Continues when configured
    • Reports a summary of failures before exiting

Testing

  • Deployment continues past a failing script when continue_versioned_on_error is set.
  • Failures are surfaced at the end of deployment.
  • Session logic:
    • Retries inserts
    • Ensures full error details are persisted

@zroytman
Copy link

zroytman commented Aug 3, 2025

Oh, I'm waiting for this enhancement so long time! It's great, thank you very much! How to speed it up to be moved to production (master)?

@zroytman
Copy link

zroytman commented Aug 3, 2025

@sfc-gh-tmathew Hi, who can approve the merge to master and when can it be done? Thanks a lot.

@luisggc
Copy link
Author

luisggc commented Aug 3, 2025

@zroytman , I don't think It will be reviewed anytime soon, apparently. There are a couple of PRs with no answer for many months. In the meantime I am using:
pip install "git+https://github.com/luisggc/schemachange.git@add-option-to-continue-on-script-error"

@zroytman
Copy link

zroytman commented Aug 4, 2025

@zroytman , I don't think It will be reviewed anytime soon, apparently. There are a couple of PRs with no answer for many months. In the meantime I am using: pip install "git+https://github.com/luisggc/schemachange.git@add-option-to-continue-on-script-error"

@luisggc do you know who should approve it? Who is the maintainer?

@luisggc
Copy link
Author

luisggc commented Aug 4, 2025

I think we have a few @sfc-gh-twhite merged a PR today, tough.
Please, help me understand what I can do in the PR to make It better.
I thought perhaps not adding the error msg to the table would make the code simpler.

@zroytman
Copy link

zroytman commented Aug 4, 2025

I think we have a few @sfc-gh-twhite merged a PR today, tough. Please, help me understand what I can do in the PR to make It better. I thought perhaps not adding the error msg to the table would make the code simpler.

Actually I've reviewed your list of changes and the explanation - it looks great for me. The writing of the error into the db table is not so mandatory for me in case you show the error on the screen during the execution - should be enough for me. But this is a great feature. Thank you so much! Hopefully it will be merged soon.

@sfc-gh-twhite
Copy link
Collaborator

Thanks for submitting this PR and for the review. I wanted to acknowledge that we will review this sometime this week. We're currently trying to close out a few other lingering things.

I agree that this is a useful feature and would like @sfc-gh-jhansen to review it as well to determine if it is something we should implement.

@luisggc luisggc force-pushed the add-option-to-continue-on-script-error branch from 356017e to 9cdb95b Compare August 5, 2025 01:52
@luisggc luisggc force-pushed the add-option-to-continue-on-script-error branch 2 times, most recently from 9acc4eb to 3d82ceb Compare August 5, 2025 13:08
@MACKAT05
Copy link
Contributor

MACKAT05 commented Aug 8, 2025

I wrote an adapter for the sqlfluff linter #342 that might preclude the need for this option by routing out syntax errors before deploy... also have you looked into dbt... parralelism and sequencing are handled there already...

@luisggc luisggc force-pushed the add-option-to-continue-on-script-error branch from 3d82ceb to 6487d71 Compare August 26, 2025 20:14
@zroytman
Copy link

Hi ALL. Any news regarding getting this awesome enhancement? Thanks.

@sfc-gh-tmathew sfc-gh-tmathew added the enhancement New feature or request label Aug 29, 2025
@sfc-gh-tmathew sfc-gh-tmathew self-assigned this Aug 29, 2025
@sfc-gh-tmathew
Copy link
Collaborator

@zroytman will take a look at this option and make the call by first week of september.

@zroytman
Copy link

@zroytman will take a look at this option and make the call by first week of september.

Hi @sfc-gh-tmathew , any updates? Thanks.

@zroytman
Copy link

@zroytman will take a look at this option and make the call by first week of september.

Hello @sfc-gh-tmathew , have you had a chance to look at this? Thanks.

@zroytman
Copy link

zroytman commented Nov 3, 2025

Hi All! Any updates regarding the approval? Thanks.

@sfc-gh-tmathew sfc-gh-tmathew added Under Review This is being discussed without planned changes community-contribution Submitted by community labels Nov 18, 2025
@zroytman
Copy link

I'm hopeless... any news regarding the implementation of the feature?

@sfc-gh-twhite
Copy link
Collaborator

@sfc-gh-jhansen @sfc-gh-tmathew - Is this functionality something we want to support for the general project?

@sfc-gh-jhansen
Copy link
Collaborator

Hey there @luisggc and @zroytman, thanks for your interest in this. Can you please help us understand the use case for this feature? In general, the point of imperative tools with dependency ordering (versioned scripts in schemachange) is to make sure that the scripts don't run out of order. So if a script failed, it seems like we would not want to continue. But I'm curious what the use case you have in mind is here.

@zroytman
Copy link

Hey there @luisggc and @zroytman, thanks for your interest in this. Can you please help us understand the use case for this feature? In general, the point of imperative tools with dependency ordering (versioned scripts in schemachange) is to make sure that the scripts don't run out of order. So if a script failed, it seems like we would not want to continue. But I'm curious what the use case you have in mind is here.

Hi @sfc-gh-jhansen . As I've described in #230 and #239, I'm using SchemaChange as part of Azure DevOps Pipeline deployment, where I'm trying to deploy several sql scripts. In most of the cases the scripts have no dependencies between each other, but SchemaChange stops deployment on the first failure and doesn't proceed with the other scripts.
Adding new parameter "Skip Failures" Y/N (True/False, default False as of today) will be helpful in that case, so the deployment process will deploy all the possible scripts without stopping after the first failure.
More than that, in case, for example, script A fails on missing object which should be created in script B (which runs after A), in current state A will fail, B will not run at all. Using this new parameter with skip failures script A will fail, script B will run and create the object, so we will be able to rerun the deployment again - next cycle of running will successfully deploy A, while B will not be deployed as it was deployed previously.

Thanks a lot.

@zroytman
Copy link

Hi @sfc-gh-jhansen , any news? Thanks.

@sfc-gh-tmathew
Copy link
Collaborator

sfc-gh-tmathew commented Feb 9, 2026

Hi @luisggc,

Thank you for this contribution! We've just released 4.3.0 which includes some related functionality, so I wanted to clarify the current state and what's still valuable from your PR.

What 4.3.0 Already Includes

  1. --out-of-order: Allows versioned scripts to run regardless of version ordering (useful for parallel development branches)
  2. Failed Script Logging: Script failures are now recorded in the change history table with STATUS = 'Failed'

What's Still Valuable from This PR

The continue-on-error for R and A scripts is still a requested feature (issues #239, #230) that we'd like to include. However, we have design requirements:

Design Constraints (from our spec)

Script Type Continue-on-Error Allowed?
V (Versioned) ❌ No - must always fail-fast
R (Repeatable) ✅ Yes
A (Always) ✅ Yes

Rationale: Versioned scripts have ordering dependencies - V2 may depend on V1 succeeding. The --out-of-order flag handles the "run older versions" use case, but skipping a failed V script would leave the database in an inconsistent state.

Required Changes

  1. Remove --continue-versioned-on-error - V scripts must always stop on failure. Users who want flexible V script handling should use --out-of-order instead.
  2. Keep --continue-repeatable-on-error and --continue-always-on-error - These are safe and useful.
  3. Rename --continue-all-on-error to only affect R and A scripts (or remove it and let users set both R and A flags explicitly).
  4. Exit code: Ensure non-zero exit code if ANY script failed (for CI/CD detection).
  5. Summary report: The failure summary at the end is great - please keep that.

Regarding --schemachange-no-jinja

This is a separate feature. Could you split it into its own PR? It's useful but unrelated to continue-on-error.

Next Steps

  1. This PR has merge conflicts - please rebase against master
  2. Remove/adjust the versioned script continue-on-error option per above
  3. Split out the --schemachange-no-jinja feature

Once updated, we'll target this for 4.4.0. Happy to discuss if you have questions!

@zroytman
Copy link

@sfc-gh-tmathew Thanks for the update. Although you said that you've released version 4.3.0, I see we're using version 4.0.1 (every run process we're pulling the latest version, so looks like 4.0.1 is the latest pullable version, correct me if I'm wrong).
Anyway will be glad if it's possible to include my requested change since we're using R scripts only, so the option "continue-on-script-error" is going to be VERY useful for us.
Thanks a lot.

@sfc-gh-tmathew
Copy link
Collaborator

You may have pinned your version to 4.0.1. Currently the latest is 4.3.1

image

Could you consider the changes requested and rebase to the latest commit on master branch so that we can review the changes to merge in 4.4.0?

@zroytman
Copy link

zroytman commented Feb 11, 2026

@sfc-gh-tmathew My fault... my Python version was too old. Just upgraded it and got the latest version of SchemaChange.
Regarding the requested changes (add-option-to-continue-on-script-error) - @luisggc could you please respond?
Thanks a lot.

@zroytman
Copy link

zroytman commented Feb 11, 2026

@sfc-gh-tmathew My fault... my Python version was too old. Just upgraded it and got the latest version of SchemaChange. Regarding the requested changes (add-option-to-continue-on-script-error) - @luisggc could you please respond? Thanks a lot.

@sfc-gh-tmathew But now i have another issue: for some reason upgrading of SchemaChange from 4.0.1 to 4.3.1 causes all the scripts to be re-deployed (tried it in dry-run mode). If i use back the old version 4.0.1 (again in dry-run mode), the scripts aren't being re-deployed. Why is it happening? Why upgrading of the SchemaChange is redeploying all the already deployed (existing in schemachange history table) scripts? Probably the reason is the difference between the checksum calculations of 4.0.1 and 4.3.1 - but how could it be?

@sfc-gh-tmathew
Copy link
Collaborator

#414 was reported after 4.3.0 was released and I thought we addressed this with 4.3.1 @zroytman . Will take a look again.

@sfc-gh-tmathew
Copy link
Collaborator

@zroytman Thanks for the detailed report! I've investigated and confirmed a new issue - this is separate from the #414 fix.

Opened issue #417 to track this:

Root Cause Identified

The checksum difference between 4.0.1 and 4.3.1 is caused by trailing semicolon handling changes, NOT the trailing comment fix:

Version Checksum Behavior
4.0.1 822b41a8... Trailing semicolon stripped before checksum
4.1.0 822b41a8... Same as 4.0.1
4.2.0 822b41a8... Same as 4.0.1
4.3.1 6000a78c... Trailing semicolon preserved

Test Script

-- Test R-script without trailing comment
CREATE OR REPLACE VIEW test_view2 AS
SELECT 1 AS col1, 'test' AS col2;

Render Output Comparison

4.0.1 content ends with: ...col2 (semicolon stripped)
4.3.1 content ends with: ...col2; (semicolon preserved)

This is a different regression from #414 (which addressed trailing comments).

Impact

All scripts ending with a semicolon (most scripts) will have different checksums in 4.3.1 compared to earlier versions, causing:

  • V-scripts: "Checksum has drifted" warnings
  • R-scripts: Unexpected re-execution

Workaround

For now, stay on 4.2.0:

pip install schemachange==4.2.0

Next Steps

We're investigating the code changes and will determine whether this requires a 4.3.2 hotfix or can be addressed in 4.4.0. Will update #417 with our findings and timeline.

@sfc-gh-jhansen - This appears to be related to the trailing comment fix changes that may have inadvertently affected how statements are trimmed before checksum computation.


Regarding PR #339

@luisggc - This PR still has merge conflicts with master. Before we can proceed with accepting the continue-on-error changes:

  1. Rebase against master: Please rebase your branch against the latest master
  2. Resolve merge conflicts: Address any conflicts that arise during the rebase
  3. Address previous feedback: Ensure the changes align with the design constraints outlined in my earlier comment (continue-on-error for R and A scripts only, not V scripts)

If you need help resolving the merge conflicts, feel free to reach out and we can assist.

@zroytman
Copy link

Thanks, @sfc-gh-tmathew , for opening a new issue #417 .
Just tried to use your workaround suggestion - it's improving the situation, but still is trying to run more than 30 scripts (out of 4K scripts) - this should be avoided too, since all those scripts were already deployed, and when i return back to 4.0.1 this doesn't happen, so even between 4.0.1 and 4.2.0 checksums there is some difference, so currently i'm going back to 4.0.1 till we have the solution.

@zroytman
Copy link

Thanks, @sfc-gh-tmathew , for opening a new issue #417 . Just tried to use your workaround suggestion - it's improving the situation, but still is trying to run more than 30 scripts (out of 4K scripts) - this should be avoided too, since all those scripts were already deployed, and when i return back to 4.0.1 this doesn't happen, so even between 4.0.1 and 4.2.0 checksums there is some difference, so currently i'm going back to 4.0.1 till we have the solution.

Thanks, @sfc-gh-tmathew . As i mentioned in #417 , the version 4.3.2 is working great now, thanks!

@zroytman
Copy link

Regarding PR #339

@luisggc - This PR still has merge conflicts with master. Before we can proceed with accepting the continue-on-error changes:

  1. Rebase against master: Please rebase your branch against the latest master
  2. Resolve merge conflicts: Address any conflicts that arise during the rebase
  3. Address previous feedback: Ensure the changes align with the design constraints outlined in my earlier comment (continue-on-error for R and A scripts only, not V scripts)

If you need help resolving the merge conflicts, feel free to reach out and we can assist.

Hi @luisggc , any updates or ETA regarding such a great enhancement you've developed? Thanks.

@luisggc
Copy link
Author

luisggc commented Feb 18, 2026

Hi all! i had my notifications off. I'll resolve the conflict and deal with the feedback.
And, yes. It makes sense your comment about the V scripts.

@luisggc
Copy link
Author

luisggc commented Feb 18, 2026

@zroytman changes done. Let me know if I have to undo some version change i have done in this PR likeI have done in setup.cfg

@zroytman
Copy link

@zroytman changes done. Let me know if I have to undo some version change i have done in this PR likeI have done in setup.cfg

Thank you, @luisggc . Was it already merged to master so I can try it? And what's the new flag you've added and how should i use it?

Thanks.

@luisggc
Copy link
Author

luisggc commented Feb 18, 2026

Not in master. They still need to review, but I have made the necessary changes.
The change is in the PR description. It adds the flags:

Parameter Environment Variable Description
--continue-all-on-error Continue executing remaining repeatable and always scripts even if one fails. Versioned scripts always stop on failure (default: false)
--continue-repeatable-on-error Continue executing remaining repeatable scripts after an error (default: false)
--continue-always-on-error Continue executing remaining always scripts after an error (default: false)

@zroytman
Copy link

Thank you again, @luisggc .

Hi @sfc-gh-tmathew , who can approve the PR?

Thanks.

@zroytman
Copy link

zroytman commented Mar 3, 2026

Hi @sfc-gh-tmathew , any updates regarding reviewing the PR and pushing to master?
Thanks.

@sfc-gh-tmathew
Copy link
Collaborator

sfc-gh-tmathew commented Mar 9, 2026

Hey @luisggc,

Thanks for putting this together — appreciate the contribution.

The continue-on-error logic itself makes sense and the approach is sound. I do want to flag a few things before we can merge.

Move the ERROR_MESSAGE column handling to initialization with schema validation

Right now, the ERROR_MESSAGE column gets added via auto-ALTER when an INSERT fails at runtime. This is risky because it introduces DDL in the middle of a deploy, and if the role doesn't have ALTER privileges, the deploy breaks — even on successful scripts (since error_message='' is passed on success too).

Instead, this should be handled upfront during initialization as a schema validation step. Before any scripts run, schemachange should check that the change history table has all required columns (including ERROR_MESSAGE). If it doesn't:

  • If the role has privileges, ALTER the table to add the missing column
  • If the role doesn't have privileges, fail immediately with a clear message telling the user exactly what column needs to be added manually

This applies regardless of how the table got there — whether it was created by --create-change-history-table, created manually by a DBA, or created by an older version of schemachange. The deployment should not proceed until the change history table matches the expected schema. This keeps the deploy loop clean and gives users a predictable, upfront failure rather than something blowing up mid-deploy.

Suggested split into 3 PRs

  1. Change history table schema validation + ERROR_MESSAGE column — Upfront validation at initialization, with migration support. Can land as a patch release on its own.
  2. Continue-on-error flags — The core feature: config, CLI args, deploy loop. Builds on bullet 1, ships as a minor release.
  3. schemachange-no-jinja marker — Requested earlier too. This changes the rendering pipeline for all scripts and is unrelated to continue-on-error. Suggest this change be reviewed it as its own PR.

Smaller items

  • The setup.cfg and uv.lock version bumps look like rebase artifacts — those get handled at release time so you can drop them.
  • The outer try/except in deploy.py is the right structure for this feature. Just worth verifying that log output in the default fail-fast path hasn't changed in a way that surprises users.

The work here is solid — splitting it up just makes each piece safer to ship. Let me know if you want to discuss this further.

@zroytman
Copy link

Dear @luisggc & @sfc-gh-tmathew , I'm hopeless. Looks like we will never see such a great enhancement in production.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Submitted by community enhancement New feature or request Under Review This is being discussed without planned changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants