Description
Describe the bug
Meltano/Singer uses JSON Schema to describe the shape and data types of the fields in a record. A field definition usually looks like
"CreatedById": {"type": ["null", "string"]}
but another valid form is
"CreatedDate": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": ["string", "null"]}]}
target-redshift handles the first case but fails with a KeyError in the second, because the "anyOf" form has no top-level "type" key.
joblib.externals.loky.process_executor._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/joblib/_utils.py", line 72, in __call__
    return self.func(**kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/joblib/parallel.py", line 598, in __call__
    return [func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/joblib/parallel.py", line 598, in <listcomp>
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 541, in _drain_sink
    self.drain_one(sink)
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 531, in drain_one
    sink.process_batch(draining_status)
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/target_redshift/sinks.py", line 127, in process_batch
    self.bulk_insert_records(
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/target_redshift/sinks.py", line 184, in bulk_insert_records
    self.write_to_s3(records)
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/target_redshift/sinks.py", line 270, in write_to_s3
    records = self.format_records_as_csv(records)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/target_redshift/sinks.py", line 251, in format_records_as_csv
    object_keys = [
                  ^
  File "/project/.meltano/loaders/target-redshift/venv/lib/python3.11/site-packages/target_redshift/sinks.py", line 254, in <listcomp>
    if "object" in value["type"] or "array" in value["type"]
                   ~~~~~^^^^^^^^
KeyError: 'type'
To Reproduce
Steps to reproduce the behavior:
Send these messages to target-redshift:
{"type": "STATE", "value": {}}
{"type": "SCHEMA", "stream": "Profile", "schema": {"type": "object", "additionalProperties": false, "properties": {"Id": {"type": "string"}, "SystemModstamp": {"anyOf": [{"type": "string", "format": "date-time"}, {"type": ["string", "null"]}]}}}, "key_properties": ["Id"], "bookmark_properties": ["SystemModstamp"]}
{"type": "ACTIVATE_VERSION", "stream": "Profile", "version": 1746227674772}
{"type": "RECORD", "stream": "Profile", "record": {"Id": "00a00000001aa0aAAA", "SystemModstamp": "2025-04-23T16:38:02.000000Z"}, "version": 1746227674772, "time_extracted": "2025-05-02T23:14:34.789503Z"}
Expected behavior
target-redshift successfully processes SCHEMA messages that use "anyOf".
Additional context
The issue is in target-redshift/target_redshift/sinks.py, line 254 (commit 16e532e).
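One possible direction for a fix, offered as a rough sketch rather than a tested patch: gather the types from both the top-level "type" key and any "anyOf" branches before checking for "object" or "array". The helper names `schema_types` and `is_object_or_array` are hypothetical and do not exist in target-redshift; how they would be wired into format_records_as_csv is left open.

```python
def schema_types(prop):
    """Collect every JSON Schema type named by a property, including anyOf branches."""
    types = set()
    declared = prop.get("type", [])
    if isinstance(declared, str):
        # "type" may be a single string or a list of strings.
        declared = [declared]
    types.update(declared)
    for branch in prop.get("anyOf", []):
        # Each anyOf branch is itself a schema, so recurse into it.
        types |= schema_types(branch)
    return types


def is_object_or_array(prop):
    """Return True if the property may hold an object or an array."""
    return bool(schema_types(prop) & {"object", "array"})


created_by_id = {"type": ["null", "string"]}
created_date = {
    "anyOf": [
        {"type": "string", "format": "date-time"},
        {"type": ["string", "null"]},
    ]
}

print(is_object_or_array(created_by_id))
print(is_object_or_array(created_date))
print(is_object_or_array({"anyOf": [{"type": "array"}, {"type": "null"}]}))
```

Using `.get` means a property without a top-level "type" no longer raises, and normalizing a string "type" to a list avoids relying on substring matching.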