Skip to content

bug: Source postgres json & jsonb columns loading to snowflake as varchar true when using meltanolabs tap-postgres and target-snowflake #274

Closed
@kyle-foerster

Description

@kyle-foerster

Target Version

3.4.2

Python Version

3.9

PostgreSQL Version

16.4

Operating System

meltano/meltano:v3.4.2-python3.9 docker image

Description

Steps to fully reproduce can be found in this public repo. General configurations in this repo match our production env where we first noticed this behavior.

In swapping from using the transferwise variants of the tap-postgres and target-snowflake plugins to their respective meltanolabs variants, we noticed it will load any columns from the source of the json or jsonb data type to be varchar true in the target.

Source table was created with:

create table mix_data (
	id integer primary key,
	empty_data jsonb,
	data json
);

example source data in postgres:
Screenshot 2024-10-16 at 3 39 40 PM

example target data in Snowflake:
Screenshot 2024-10-16 at 3 40 53 PM

Quick matrix of what was tested. Only using the meltanolabs variant of both the tap-postgres and target-snowflake resulted in the malformed data.

tap-postgres variant target-snowflake variant Replicate correctly?
transferwise transferwise
transferwise meltanolabs
meltanolabs transferwise
meltanolabs meltanolabs

When looking at the schema of the actual messages that come across using the dummy data in the repo linked above, they look largely the same and what I'd expect. Not sure if including the boolean data type in the meltanolabs variant is somehow causing it to get read through incorrectly?

The schema from the meltanolabs tap-postgres:

"properties": {
            "id": {
                "type": ["integer"]
            },
            "empty_data": {
                "type": ["string","number","integer","array","object","boolean","null"]
            },
            "data": {
                "type": ["string","number","integer","array","object","boolean","null"]
            }
        },
        "type": "object",
        "required": ["id"]

Schema from the transferwise tap-postgres:

{
        "type": "object",
        "properties": {
            "id": {
                "type": ["integer"],
                "minimum": -2147483648,
                "maximum": 2147483647
            },
            "empty_data": {
                "type": ["null","object","array"]
            },
            "data": {
                "type": ["null","object","array"]
            }
        },
        "definitions": {
            "sdc_recursive_integer_array": {
                "type": ["null","object","array"],
                "items": {"$ref": "#/definitions/sdc_recursive_integer_array"}
            },
            "sdc_recursive_number_array": {
                "type": ["null","object","array"],
                "items": {"$ref": "#/definitions/sdc_recursive_number_array"}
            },
            "sdc_recursive_string_array": {
                "type": ["null","object","array"],
                "items": {"$ref": "#/definitions/sdc_recursive_string_array"}
            },
            "sdc_recursive_boolean_array": {
                "type": ["null","object","array"],
                "items": {"$ref": "#/definitions/sdc_recursive_boolean_array"}
            },
            "sdc_recursive_timestamp_array": {
                "type": ["null","object","array"],
                "format": "date-time",
                "items": {"$ref": "#/definitions/sdc_recursive_timestamp_array"}
            },
            "sdc_recursive_object_array": {
                "type": ["null","object","array"],
                "items": {"$ref": "#/definitions/sdc_recursive_object_array"}
            }
        }
    }

Link to Slack/Linen

https://meltano.slack.com/archives/C069CQNHDNF/p1728658002710249

Metadata

Metadata

Labels

bugSomething isn't working

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions