
@kgpayne kgpayne commented Oct 26, 2021

Draft proposal for BATCH message type support as raised in #2

Rendered SIP Draft: https://github.com/kgpayne/Singer-Working-Group/blob/kp/batch_messages/proposals/draft/SIPXX%20-%20BATCH%20message%20type.md

Note: In order to support discoverability of capabilities like "BATCH" record types, we may require something like #8


This offers several new avenues for improving performance:

- For taps and targets that _both_ support a common BATCH format (e.g. RDBMS to Warehouse) we have the opportunity to implement 'pass through', whereby records are _never deserialised_ between source and destination, greatly reducing overhead and improving throughput. This would drastically accelerate this common use case in the Singer ecosystem, inspired by the `fast_sync` feature of [Pipelinwise](https://transferwise.github.io/pipelinewise/concept/fastsync.html).
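
A minimal sketch of what 'pass through' could look like, to make the idea concrete. Everything here is an assumption for illustration: the message keys (`type`, `stream`, `format`, `path`), the function names `tap_emit_batch` and `target_handle_message`, and the `bulk_load` callback are hypothetical and are not taken from the SIP draft. The point it tries to show is that the target only parses the one-line message and never the records inside the file.

```python
import csv
import json
import os
import tempfile


def tap_emit_batch(stream, rows, fieldnames, directory):
    """Write rows to a CSV file once and return a BATCH message line pointing at it."""
    fd, path = tempfile.mkstemp(suffix=".csv", dir=directory)
    with os.fdopen(fd, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    # Hypothetical message layout; the SIP draft may define different keys.
    return json.dumps(
        {"type": "BATCH", "stream": stream, "format": "csv", "path": path}
    )


def target_handle_message(line, supported_formats, bulk_load):
    """Hand a BATCH file to a bulk loader without parsing the records in it.

    Returns True if the line was a BATCH message, False otherwise so the
    caller can fall back to ordinary per-record handling.
    """
    message = json.loads(line)  # only the small envelope is deserialised
    if message.get("type") != "BATCH":
        return False
    if message["format"] not in supported_formats:
        raise ValueError(f"unsupported BATCH format: {message['format']}")
    # The file is passed along by path; its rows are never read here.
    bulk_load(message["stream"], message["path"])
    return True
```

In this sketch a target would call `target_handle_message(line, {"csv"}, loader)` for each input line, where `loader` is whatever bulk-import routine the destination offers.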

@louis-pie louis-pie Dec 1, 2021

Perhaps worth mentioning that, especially when dealing with RDBMS, the COPY command is usually the best way to implement this "pass-through". And the COPY command usually requires CSV.
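
As a rough illustration of the COPY route, assuming a PostgreSQL destination and a psycopg2-style cursor (which provides `copy_expert(sql, file)`), a loader might look like the sketch below. The function name `copy_csv_batch` and the assumption that the CSV file has a header row are hypothetical; other databases have their own COPY or bulk-load syntax.

```python
def copy_csv_batch(cursor, table, path):
    """Stream a CSV file into a table with PostgreSQL COPY.

    `cursor` is assumed to be a psycopg2-style cursor exposing
    copy_expert(sql, file). The file is streamed as-is, so no row
    objects are built in Python.
    """
    # Quote the table name as an identifier, doubling any embedded quotes.
    quoted = '"' + table.replace('"', '""') + '"'
    statement = f"COPY {quoted} FROM STDIN WITH (FORMAT csv, HEADER true)"
    with open(path, "r", newline="") as handle:
        cursor.copy_expert(statement, handle)
    return statement
```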

Also, and unrelated: Pipelinwise -> PipelineWise

@judahrand

Really excited for this - it is one of the things which has forced us to go with pure PipelineWise at the moment!
