target-redshift creates extra columns when stream schema has spaces #207

Open
@holly-evans

Describe the bug
When a stream's schema has property names containing spaces, target-redshift creates a second set of columns whose names have no spaces. This happens on the second run of the tap -> target-redshift pipeline, when the target applies any alterations to the existing table.

To Reproduce
Steps to reproduce the behavior:

  1. Send these messages to target-redshift
{"type":"SCHEMA","stream":"application_report","schema":{"properties":{"User ID":{"type":["string","null"]}},"type":"object","$schema":"https://json-schema.org/draft/2020-12/schema"},"key_properties":[]}
{"type":"RECORD","stream":"application_report","record":{"User ID":"5337781"},"time_extracted":"2025-05-05T22:37:40.537218+00:00"}
{"type":"STATE","value":{"bookmarks":{"application_report":{}}}}
  2. View the table's columns
  3. Send these messages to target-redshift
{"type":"SCHEMA","stream":"application_report","schema":{"properties":{"User ID":{"type":["string","null"]}},"type":"object","$schema":"https://json-schema.org/draft/2020-12/schema"},"key_properties":[]}
{"type":"RECORD","stream":"application_report","record":{"User ID":"5337960"},"time_extracted":"2025-05-05T22:37:40.537514+00:00"}
{"type":"STATE","value":{"bookmarks":{"application_report":{}}}}
  4. View the table's columns again. Both "user id" and "user_id" exist; "user id" is populated and "user_id" is empty.
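
For convenience, the messages for each run can be generated with a small script. This is only a sketch: the helper name `build_run` is made up for this example, and the payloads are copied from the steps above.

```python
import json

# SCHEMA message copied from the reproduction steps.
SCHEMA = {
    "type": "SCHEMA",
    "stream": "application_report",
    "schema": {
        "properties": {"User ID": {"type": ["string", "null"]}},
        "type": "object",
        "$schema": "https://json-schema.org/draft/2020-12/schema",
    },
    "key_properties": [],
}


def build_run(user_id, time_extracted):
    """Hypothetical helper: return one run's SCHEMA, RECORD and STATE messages as JSON lines."""
    record = {
        "type": "RECORD",
        "stream": "application_report",
        "record": {"User ID": user_id},
        "time_extracted": time_extracted,
    }
    state = {"type": "STATE", "value": {"bookmarks": {"application_report": {}}}}
    return "\n".join(json.dumps(m) for m in (SCHEMA, record, state)) + "\n"


if __name__ == "__main__":
    # First run; call again with "5337960" for the second run.
    print(build_run("5337781", "2025-05-05T22:37:40.537218+00:00"), end="")
```

The output of each call can be piped into the target in whatever way it is normally invoked.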

Expected behavior
Each property of the stream should land in exactly one column, named in conventional SQL snake_case (here, "user_id").

Additional context
This happens because target-redshift passes the original schema to prepare_table and other functions; it should pass the conformed schema returned by conform_schema(schema) instead.
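
To illustrate why mixing the two schemas could produce a duplicate column, here is a toy model. It is not the target's actual code: `conform_name`, `conform_schema` and `missing_columns` below are simplified stand-ins written for this example, and the conforming rule (lowercase, non-alphanumeric runs become underscores) is an assumption.

```python
import re


def conform_name(name):
    """Stand-in conforming rule (assumed): lowercase, non-alphanumerics -> '_'."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def conform_schema(schema):
    """Return a copy of the schema with every property name conformed."""
    conformed = dict(schema)
    conformed["properties"] = {
        conform_name(key): value for key, value in schema["properties"].items()
    }
    return conformed


def missing_columns(existing_columns, schema):
    """Columns an ALTER step would add: schema properties not yet in the table."""
    return set(schema["properties"]) - set(existing_columns)


schema = {"properties": {"User ID": {"type": ["string", "null"]}}, "type": "object"}

# Mixed usage: the table was built from the original schema (the column shows up
# as "user id"), but a later step compares against conformed names.
print(missing_columns({"user id"}, conform_schema(schema)))

# Consistent usage: the conformed schema is used for both creating and altering.
table_columns = set(conform_schema(schema)["properties"])
print(missing_columns(table_columns, conform_schema(schema)))
```

In the mixed case the diff reports "user_id" as missing, which matches the extra empty column seen on the second run; with the conformed schema used throughout, nothing is added.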
