Skip to content

Surface script execution#351

Open
MACKAT05 wants to merge 6 commits intoSnowflake-Labs:masterfrom
MACKAT05:surface-script-execution
Open

Surface script execution#351
MACKAT05 wants to merge 6 commits intoSnowflake-Labs:masterfrom
MACKAT05:surface-script-execution

Conversation

@MACKAT05
Copy link
Contributor

This pull request introduces a major architectural refactor to the project, focusing on improving script execution, session and history management, and error handling. The execution logic has been moved from the Snowflake connector directly into this package, enabling more robust control and separation of concerns. A new Script class hierarchy now handles parsing and execution of individual SQL statements, and dedicated session classes manage state and history. SQL parsing and error reporting are also significantly enhanced.

Architectural Refactor and Session Management

  • Moved execution logic from the Snowflake connector to this package, centralizing control and improving separation of concerns. All script execution now routes through the session layer for consistent state management. (CHANGELOG.md, CHANGELOG.mdR6-R15)

Script Execution and Parsing

  • Added a dedicated Script class hierarchy (Script.py) that parses scripts into individual statements with context, executes each statement, and provides detailed execution reports with statement-level success/failure and error tracking. (Script.py, [1] [2]
  • Improved SQL statement parsing to support both commented and non-commented versions, ensuring accurate execution and reporting. (Script.py, schemachange/session/Script.pyR183-R413)

Session and History Management

  • Introduced a new HistorySession class for managing change history table operations, including schema/table creation, metadata fetching, and script history retrieval. (HistorySession.py, schemachange/session/HistorySession.pyR1-R271)
  • Enhanced the SnowflakeSession class (referenced in new code) to manage session state, query tags, and execution routing. (CHANGELOG.md, CHANGELOG.mdR6-R15)

Error Handling and Reporting

Miscellaneous

note: the change in the session pattern does create a second login call which may annoy or frustrate users who do not have MFA caching enabled properly...

…actor and enhanced session management

- Moved execution logic from Snowflake connector to improve control and separation of concerns.
- Introduced a dedicated  class hierarchy for SQL statement handling.
- Enhanced  for better session state management and query tagging.
- Created  class for managing change history operations.
- Improved SQL statement parsing and error handling.
- Updated version in setup.cfg and CHANGELOG.md to reflect upcoming release.
… retrieval

- Added logic to skip execution of empty SQL statements in the Script class.
- Improved session state and query tag retrieval by switching from async to simple execution in the SnowflakeSession class.
- Consolidated reset query execution into a single anonymous block for efficiency.
…xecution

- Enhanced the Script class to manage transactions by rolling back on statement failures or exceptions.
- Updated the execute_snowflake_query method in SnowflakeSession to support explicit auto-commit control.
- Improved logging for transaction outcomes and error reporting during script execution.
def sql(self) -> str:
"""Return the SQL to execute (without comments for non-empty, with comments for empty)"""
return (
self.sql_without_comments if not self.is_empty else self.sql_with_comments
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this should return the sql with comments.

logger: structlog.BoundLogger,
) -> ExecutionResult:
"""Execute a single statement and return the result"""
import time
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to top...

… creation

- Removed direct instantiation of HistorySession in SnowflakeSession, replacing it with a lazy initialization approach.
- Added a new property to fetch change history metadata, enhancing the separation of concerns.
- Updated tests to mock HistorySession behavior for improved isolation and reliability.
…ed sql. tests adjusted to reflect. consolidated imports.
@sfc-gh-tmathew
Copy link
Collaborator

@MACKAT05 Could we revisit these changes in a future release. There will be a release 4.1.0 that has been focused on supporting the cadence of arguments from CLI > ENV > Config YAML > connections.toml that will be released soon.

Could we meet to discuss this change and see where we can table this in the roadmap to 4.2.0, 4.3.0, 4.4.0, 4.5.0 or 5.0.0?

@sfc-gh-tmathew sfc-gh-tmathew added Under Review This is being discussed without planned changes community-contribution Submitted by community labels Nov 18, 2025
@sfc-gh-tmathew
Copy link
Collaborator

sfc-gh-tmathew commented Feb 9, 2026

Hi @MACKAT05,

Thank you for this architectural refactor! This is a significant change that moves execution logic into the package and introduces new Script and HistorySession class hierarchies.

A few questions before we proceed:

  1. MFA Concern: You mentioned this creates a second login call - can you elaborate on the impact? Is there a way to avoid this?
  2. Scope: This overlaps with some internal plans we have for 5.0.0. Could you help us understand the primary problem this solves? Is it:
    • Better error reporting per statement?
    • Transaction control?
    • Something else?
  3. Testing: Has this been tested with large script sets? The SQL parsing changes are substantial. Given the scope, we'd like to discuss whether to:
    • Include parts of this in 4.4.0/4.5.0
    • Coordinate with our 5.0.0 architecture plans
    • Or proceed as-is

This PR has conflicts - please hold off on rebasing until we align on direction.

Thanks for your patience!

@MACKAT05
Copy link
Contributor Author

Hi @MACKAT05,

Thank you for this architectural refactor! This is a significant change that moves execution logic into the package and introduces new Script and HistorySession class hierarchies.

A few questions before we proceed:

  1. MFA Concern: You mentioned this creates a second login call - can you elaborate on the impact? Is there a way to avoid this?
    Yes one of the features here is to open a second session with snowflake to run the logging activity. This was motivated by seeing the extra activity required to switch scope between running each script and logging each script and reseting the conneciton state after one of the earlier changes we made to just use one connection.
    It seems like the python connector archetectually permit more than one session per authorization or an option to "reuse" the authorization... a session is encapsualted by a connection class. and those are instantiated by the .connect method...
    Its unclear if the "ALLOW_ID_TOKEN" parameter would bypass resubmitting authentication.
  1. Scope: This overlaps with some internal plans we have for 5.0.0. Could you help us understand the primary problem this solves? Is it:

    • Better error reporting per statement?
    • Transaction control?
    • Something else?

This removes a big capability issue. The connector method that was used to submit scripts handled parseing and splitting statements and was the source of a number of the "weird" behaviors that have been complained about. reading the code on the python connector it is also marked as not for production or some such verbiage as it was provided as a convience method. Moving the parsing responsibility to this tool ensures we can fix issues in the future ( like correctly submitting code when there were trailing comments or SQL stored proceduredures where there are semi colons in series.)

  1. Testing: Has this been tested with large script sets? The SQL parsing changes are substantial.

Large scripts or large sets. No to both. From what I recall the parsing was mostly the same flow and method wise. With additional steps currently to resolve send the data over the wire. I think it ends up loading the data twice and compares it but i suspect that parsing may not be impacted much since we are reading from memory instead of disk. agree we need to test this though.

Given the scope, we'd like to discuss whether to:

  • Include parts of this in 4.4.0/4.5.0
  • Coordinate with our 5.0.0 architecture plans
  • Or proceed as-is

This PR has conflicts - please hold off on rebasing until we align on direction.

Thanks for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution Submitted by community Under Review This is being discussed without planned changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants