Feature/S3 to Postgres optional column order detection #67

Open: wants to merge 3 commits into base: main

Conversation

sleblanc23
Contributor

Description & motivation:

The s3_to_postgres and s3_dir_to_postgres callables currently require the S3 file(s) and the Postgres table to have matching column orders unless the order is specified in the column_customization config. This PR adds an option to set the column order by reading the header of the file in S3. This will help with the upgrade of the load_heimdall_dag, currently the only user of this callable, by avoiding errors caused by mismatched column order.
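
For reviewers, here is a minimal sketch of the idea, not the code in this PR: given the first line of the S3 file and the user-supplied delimiter, derive an explicit column list of the kind Postgres `COPY` accepts. The function names `columns_from_header`, `quote_identifier`, and `build_column_clause` are hypothetical, and fetching the header line from S3 is left out.

```python
import csv
import io


def columns_from_header(header_line: str, delimiter: str) -> list[str]:
    """Parse the header line of a delimited file into column names.

    Hypothetical helper: assumes the caller has already read the first
    line of the S3 object and knows the delimiter.
    """
    reader = csv.reader(io.StringIO(header_line), delimiter=delimiter)
    return [name.strip() for name in next(reader)]


def quote_identifier(name: str) -> str:
    """Double-quote a Postgres identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_column_clause(columns: list[str]) -> str:
    """Render the names as a parenthesized column list, in file order."""
    return "(" + ", ".join(quote_identifier(c) for c in columns) + ")"


# The file's order drives the load, regardless of the table's own order.
file_columns = columns_from_header("id|name|created_at\n", "|")
column_clause = build_column_clause(file_columns)
```

Note that quoting identifiers makes them case-sensitive in Postgres, so a real implementation might prefer to normalize header names first.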

I'm not convinced this is the ideal implementation: the user enables the feature by specifying the column delimiter. I considered making the argument a boolean and parsing the delimiter from the options argument instead. That would be a simpler argument but would require some ugly regex, so I decided against it.
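
To make the trade-off concrete, here is a rough sketch of what the rejected alternative might look like. It assumes the options argument is a `COPY`-style string such as `(FORMAT csv, DELIMITER '|', HEADER true)`; that format, the function name `delimiter_from_options`, and the pattern itself are assumptions for illustration only.

```python
import re
from typing import Optional

# Matches DELIMITER '<chars>' (optionally DELIMITER AS '<chars>'),
# treating a doubled single quote as an escaped quote.
_DELIMITER_PATTERN = re.compile(
    r"\bDELIMITER\s+(?:AS\s+)?'((?:[^']|'')+)'",
    re.IGNORECASE,
)


def delimiter_from_options(options: str) -> Optional[str]:
    """Return the delimiter named in a COPY-style options string, if any.

    Hypothetical helper. It does not handle escape-string syntax such as
    E'\\t', which is one reason this route gets messy quickly.
    """
    match = _DELIMITER_PATTERN.search(options)
    if match is None:
        return None
    return match.group(1).replace("''", "'")
```

Even this small version needs special cases for quoting and silently returns `None` when the options omit a delimiter, which is the kind of fragility the explicit delimiter argument avoids.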

PR Merge Priority:

  • Low

Tests and QC done:

  • Created a test table in GSN with a different column order than the test file in S3 and confirmed that the table was loaded correctly
  • Confirmed that excluding the new optional configuration does not affect existing functionality

@sleblanc23 sleblanc23 requested a review from jayckaiser March 26, 2025 20:55
1 participant