Refactor Jinja environment variable handling #353

Open
MACKAT05 wants to merge 5 commits into Snowflake-Labs:master from MACKAT05:enhance-data-injection

@MACKAT05
Contributor

This pull request introduces a major refactor and enhancement to how local data files are injected into Jinja templates, replacing the legacy JinjaEnvVar approach with the new LocalDataInjection class. It adds new functions for loading CSV, JSON, and YAML files directly into templates, updates documentation, and provides comprehensive demos comparing the new and legacy methods. The changes are backward compatible and include deprecation notices for legacy features.

Local Data Injection Enhancements

  • Refactored the legacy JinjaEnvVar class into the new LocalDataInjection class, which now provides the env_var() function and introduces from_csv(), from_json(), and from_yaml() functions for loading local data files in Jinja templates. This enables direct access to structured data from CSV, JSON, and YAML files within templates. (CHANGELOG.md [1] README.md [2] [3])
  • Updated the Jinja template processor and documentation to use LocalDataInjection and its new functions, including detailed usage examples and a dedicated documentation file. (README.md [1] [2] docs/LocalDataInjection.md [3])
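
As a rough illustration of the two bullets above, the sketch below registers CSV and JSON loaders as Jinja globals and calls one from a template. It is not the PR's implementation: only the names from_csv and from_json come from the description, while the function bodies, the delimiter and encoding parameters, and the sample files are assumptions made for the example.

```python
import csv
import json
import tempfile
from pathlib import Path

from jinja2 import Environment


def from_csv(file_path, delimiter=",", encoding="utf-8"):
    """Return the CSV rows as a list of dicts keyed by the header row."""
    with open(file_path, newline="", encoding=encoding) as csvfile:
        return list(csv.DictReader(csvfile, delimiter=delimiter))


def from_json(file_path, encoding="utf-8"):
    """Return the parsed JSON document (a dict or a list)."""
    with open(file_path, encoding=encoding) as jsonfile:
        return json.load(jsonfile)


# Hypothetical sample data written to a temporary directory.
workdir = Path(tempfile.mkdtemp())
csv_path = workdir / "people.csv"
csv_path.write_text("name,age\nJohn,30\nJane,25\n", encoding="utf-8")
json_path = workdir / "formats.json"
json_path.write_text(json.dumps({"formats": ["CSV", "JSON"]}), encoding="utf-8")

# Expose the loaders to every template rendered by this environment.
env = Environment()
env.globals.update(from_csv=from_csv, from_json=from_json)

template = env.from_string(
    "{% for row in from_csv(path) %}{{ row.name }} is {{ row.age }} years old\n{% endfor %}"
)
result = template.render(path=str(csv_path))
formats = from_json(json_path)["formats"]
```

Registering plain callables in `env.globals` is only one way to make such functions available; the PR may wire them in differently, for example through a Jinja extension.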

Demo and Documentation

  • Added a comprehensive demo (demo/citibike_demo_jinja_data_injection) showcasing both the new LocalDataInjection approach and the legacy method, including SQL scripts and configuration files that demonstrate loading and processing data from external files. (demo/citibike_demo_jinja_data_injection/1_setup/A__setup.sql [1] 2_test/V1.1.0__initial_database_objects.sql [2] 2_test/V1.1.0__initial_database_objects_legacy.sql [3] and related files)

Deprecation and Backward Compatibility

  • Deprecated the JinjaEnvVar class in favor of LocalDataInjection, while maintaining backward compatibility for existing templates using env_var(). (CHANGELOG.md R6-R22)
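
One common way to deprecate a class while keeping old callers working is to leave the legacy name in place as a thin subclass that emits a DeprecationWarning. The sketch below shows that pattern under stated assumptions: the class bodies, the env_var() signature, and the warning text are hypothetical and are not taken from the PR.

```python
import os
import warnings


class LocalDataInjection:
    """Hypothetical stand-in for the new class; only env_var() is sketched."""

    @staticmethod
    def env_var(name, default=None):
        # Return the environment variable, fall back to the default, or fail.
        value = os.environ.get(name, default)
        if value is None:
            raise ValueError(f"Environment variable {name} is not set and no default was given")
        return value


class JinjaEnvVar(LocalDataInjection):
    """Legacy name kept as a subclass that warns when instantiated."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "JinjaEnvVar is deprecated; use LocalDataInjection instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


# Record warnings so the deprecation notice can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = JinjaEnvVar()

# The legacy object still answers env_var() calls exactly like the new class.
fallback = legacy.env_var("SCHEMACHANGE_DEMO_UNSET_VAR", "fallback")
```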

Configuration Files for Demo

  • Added example configuration files (file_formats.json, stages.yaml) and their legacy Jinja equivalents to support the demo and illustrate the benefits of loading data from external sources. (file_formats.json [1] stages.yaml [2] file_formats_legacy.j2 [3])

These changes make it much easier and more maintainable to inject local structured data into Jinja templates, improving both developer experience and template flexibility.

@MACKAT05
Contributor Author

MACKAT05 commented Sep 1, 2025

I have yet to try deploying the new example. I have run the render command and inspected the results. The main focus was to show the macro pattern that expands structured data into SQL instead of relying on hard-coded SQL, and to confirm that the structured data loads without errors about missing files.
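
A minimal sketch of the macro pattern described in this comment: a Jinja macro turns each record of structured data into one SQL statement. The stage names, URLs, and the create_stage macro are invented for illustration, and the inline list stands in for data that the demo would load from a file such as stages.yaml.

```python
from jinja2 import Environment

# Hypothetical stage definitions, standing in for data loaded from a file.
stages = [
    {"name": "TRIPS", "url": "s3://example-bucket/trips/"},
    {"name": "WEATHER", "url": "s3://example-bucket/weather/"},
]

# The macro renders one statement; the loop applies it to every record.
TEMPLATE = """\
{% macro create_stage(stage) -%}
CREATE OR REPLACE STAGE {{ stage.name }} URL = '{{ stage.url }}';
{%- endmacro %}
{% for stage in stages -%}
{{ create_stage(stage) }}
{% endfor %}"""

sql = Environment().from_string(TEMPLATE).render(stages=stages)

# Drop blank lines left over from template whitespace.
statements = [line for line in sql.splitlines() if line.strip()]
```

Adding a stage then means adding a record to the data file rather than copying and editing another SQL statement.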

README.md Outdated

##### from_csv, from_json, from_yaml

These functions provide access to local data files for use in Jinja templates. For detailed documentation and examples, see [LocalDataInjection.md](docs/LocalDataInjection.md).

The reference to LocalDataInjection.md is invalid. We can remove .md if the file is still valid.

Contributor Author

I missed this one. I used AI tooling to help create examples (without posting sensitive customer or client data); unfortunately, it was very aggressive about littering the repository with .md files instead of using the existing markdown file entries.

I think this should at this point be revised to:
For examples, see citibike_demo_jinja_data_injection.

How are you finding this feature branch otherwise?

I have not completed my review. I generally understand what you are doing.

Curious - Was there a scenario you faced that led to developing this solution? Does it relate to an open PR?

@sahil-walia sahil-walia left a comment

Thank you so much for contributing.

Should we not rename this file from schemachange/JinjaEnvVar.py to JinjaTemplateDataProvider and move the code from your localdatainjection.py into it, since we are extending the capability?

I think we should not name the file with "Injection", as the term generally carries a negative connotation.

Contributor Author

I have plans to add additional modules that invoke Snowflake connections, and connections to other databases, to allow database introspection. I'm not opposed to renaming them. However, with that in mind, I had also thought it best to keep them as separate modules, so that the default option does not load them and gives some level of attack-surface reduction from a security standpoint; hence local data injection and remote data injection.
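
The opt-in idea in this comment could be sketched as a registry in which only the local provider group is enabled by default. Everything here is hypothetical: the group names, query_database, and build_template_globals are illustrative and do not appear in the PR.

```python
import os


def env_var(name, default=None):
    """Local provider: read an environment variable."""
    return os.environ.get(name, default)


def query_database(sql):
    """Remote provider placeholder; a real one would open a connection."""
    raise NotImplementedError("remote introspection is only sketched here")


# Hypothetical registry: each provider group maps template names to callables.
PROVIDER_GROUPS = {
    "local": {"env_var": env_var},
    "remote": {"query_database": query_database},
}


def build_template_globals(enabled_groups=("local",)):
    """Collect only the callables from groups that were explicitly enabled."""
    template_globals = {}
    for group in enabled_groups:
        template_globals.update(PROVIDER_GROUPS[group])
    return template_globals


# By default, templates cannot reach anything that talks to a database.
default_globals = build_template_globals()
full_globals = build_template_globals(("local", "remote"))
```

Keeping the groups separate means a template rendered with the defaults has no name bound to a remote call, which is the surface reduction the comment describes.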

    expected = "John is 30 years old\nJane is 25 years old\n"
    assert result == expected
finally:
    os.unlink(csv_path)

nit: add one blank line at the end

with open(yaml_path, encoding=encoding) as yamlfile:
    data = yaml.safe_load(yamlfile)

return data

nit: add a blank line at the end.

    return data

@staticmethod
def from_json(

nit: formatting inconsistencies.

Suggested change
- def from_json(
+ def from_json(
+     file_path: str,
+     encoding: str = "utf-8"
+ ) -> dict[str, Any] | list[Any]:

    return data

@staticmethod
def from_yaml(

nit: formatting for consistency similar to from_csv

Suggested change
- def from_yaml(
+ def from_yaml(
+     file_path: str,
+     encoding: str = "utf-8"
+ ) -> dict[str, Any] | list[Any]:

csv_path = Path(file_path)
if not csv_path.exists():
    raise FileNotFoundError(f"CSV file not found: {file_path}")

Add validation for the input delimiter:

    if not isinstance(delimiter, str) or len(delimiter) != 1:
        raise ValueError("Delimiter must be a single character")
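
To show what the suggested check accepts and rejects, here is the same condition wrapped in a standalone helper. The helper name validate_delimiter and the sample inputs are assumptions added for the example.

```python
def validate_delimiter(delimiter):
    """Raise ValueError unless delimiter is a one-character string."""
    if not isinstance(delimiter, str) or len(delimiter) != 1:
        raise ValueError("Delimiter must be a single character")
    return delimiter


# Single characters, including a tab, pass through unchanged.
accepted = [validate_delimiter(d) for d in (",", "\t", "|")]

# Empty strings, multi-character strings, and non-strings are rejected.
rejected = []
for bad in ("", ",,", None, 5):
    try:
        validate_delimiter(bad)
    except ValueError:
        rejected.append(bad)
```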

One more, to check that the file is actually a CSV:

    if csv_path.suffix.lower() != '.csv':
        raise ValueError(f"File {file_path} is not a CSV file")

Contributor Author

What about TSV? I added some additional functionality to check whether the file looks like the expected CSV and to throw warnings, instead of adding a requirement that a file be suffixed with .csv.
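
A content-based check of the kind described here might look like the sketch below, which warns instead of raising when the rows do not split consistently on the given delimiter. The function name and the heuristic (every non-empty row has the same column count, and more than one column) are assumptions, not the code added in the PR; a legitimate single-column file would trigger the warning under this heuristic.

```python
import csv
import tempfile
import warnings
from pathlib import Path


def check_delimited_file(file_path, delimiter=",", encoding="utf-8"):
    """Warn (do not fail) when the rows do not look consistently delimited."""
    with open(file_path, newline="", encoding=encoding) as handle:
        rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
    column_counts = {len(row) for row in rows}
    looks_ok = bool(rows) and len(column_counts) == 1 and column_counts != {1}
    if not looks_ok:
        warnings.warn(
            f"{file_path} does not look like data delimited by {delimiter!r}",
            UserWarning,
            stacklevel=2,
        )
    return looks_ok


# A tab-separated file with a .tsv suffix, written to a temporary directory.
tsv_path = Path(tempfile.mkdtemp()) / "people.tsv"
tsv_path.write_text("name\tage\nJohn\t30\nJane\t25\n", encoding="utf-8")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tab_result = check_delimited_file(tsv_path, delimiter="\t")
    comma_result = check_delimited_file(tsv_path, delimiter=",")
```

With the tab delimiter the file passes regardless of its suffix; with the comma delimiter each row collapses to one column and a warning is issued.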

@sfc-gh-tmathew added the Under Review (This is being discussed without planned changes), community-contribution (Submitted by community), and target: 4.3.0 (Planned for 4.3.0 release) labels on Nov 18, 2025
@sfc-gh-tmathew added the target: 4.4.0 (Planned for 4.4.0 release) label and removed the target: 4.3.0 (Planned for 4.3.0 release) label on Feb 9, 2026
@sfc-gh-tmathew
Collaborator

sfc-gh-tmathew commented Feb 9, 2026

Hi @MACKAT05,

Thank you for this comprehensive enhancement to Jinja data injection! The LocalDataInjection class with from_csv(), from_json(), and from_yaml() functions is a great addition that will improve template flexibility.

We've just released 4.3.0, and this PR is targeted for 4.4.0.

Action needed: This PR has merge conflicts with the master branch. Could you please rebase against master to resolve them?

Once rebased, we'll proceed with final review and merge. Looking forward to including this in 4.4.0!

@MACKAT05 force-pushed the enhance-data-injection branch from 8085e72 to 89c741c on February 18, 2026 21:54
@MACKAT05
Contributor Author

Rebased and split commits. I added a couple of updates along the lines of the provided commentary.
