
Support higher-volume cases (and potentially an ordering guarantee) by using pgq #139

@ePaul

Background

Inside Zalando, we are currently discussing how to implement reliable (transactional) event sending, which is basically what this library is trying to do.

When I mentioned this library (and that we use a similar approach in another team, where we run a nightly full vacuum), @CyberDem0n pointed out:

That's actually the major problem of such homegrown solutions.

  1. Write amplification (you are not only inserting into the queue table, but also updating/deleting).
  2. Permanent table and index bloat due to 1.
  3. Regular heavy maintenance required due to 2.
  4. Maintenance always affects normal processes interacting with the events table.
  5. If the event flow is relatively high, doing vacuum full/reindex only once a night quickly stops being enough.

In this regard pgq is maintenance-free. For every queue you create, it creates a few tables under the hood.
These tables are INSERT ONLY, therefore they are explicitly excluded from autovacuum.
The tables are used in a round-robin manner. Since events are always processed strictly in one order, it is enough to keep a pointer to the latest row (event) that was processed, and no UPDATEs/DELETEs are required on the event table. Once all events from a specific table are processed, PgQ simply does a TRUNCATE on this table.
These tricks make PgQ very scalable. Ten years ago, when PostgreSQL didn't yet have built-in streaming replication, PgQ was used as the basis for the logical replication solution Londiste. Both were developed by Marko Kreen while working for Skype. IIRC, 3 or 4 years ago Skype was still relying on PgQ and Londiste, because they just work.
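
To make the rotation idea from the quote concrete, here is a toy in-memory model in Python. It is only an illustration of the mechanism described above, not PgQ's actual implementation: the class name `RotatingQueue`, the size-based rotation trigger (`rotate_after`), and all other names are made up for this sketch.

```python
class RotatingQueue:
    """Toy model: insert-only tables used round-robin plus one consumer pointer."""

    def __init__(self, n_tables=3, rotate_after=4):
        self.tables = [[] for _ in range(n_tables)]  # each list stands in for a table
        self.current = 0                # index of the table receiving inserts
        self.rotate_after = rotate_after
        self.next_id = 1                # monotonically increasing event id
        self.last_processed = 0         # consumer pointer: id of the last handled event
        self.truncations = 0            # how often a non-empty table was truncated

    def _try_rotate(self):
        """Switch to the next table if everything in it has been processed."""
        nxt = (self.current + 1) % len(self.tables)
        target = self.tables[nxt]
        if target and target[-1][0] > self.last_processed:
            return False                # unprocessed events left: keep the current table
        if target:
            target.clear()              # the "TRUNCATE": drop everything at once
            self.truncations += 1
        self.current = nxt
        return True

    def insert(self, payload):
        """Append an event; rows are never updated or deleted individually."""
        if len(self.tables[self.current]) >= self.rotate_after:
            self._try_rotate()
        self.tables[self.current].append((self.next_id, payload))
        self.next_id += 1

    def consume(self, batch_size):
        """Return the next events in id order and advance the pointer."""
        pending = sorted(
            ev for table in self.tables for ev in table
            if ev[0] > self.last_processed
        )
        batch = pending[:batch_size]
        if batch:
            self.last_processed = batch[-1][0]
        return [payload for _, payload in batch]


queue = RotatingQueue(n_tables=2, rotate_after=2)
for name in ["a", "b", "c", "d"]:
    queue.insert(name)
first_batch = queue.consume(10)   # events come back in insertion order
queue.insert("e")                 # rotates onto the first table, truncating it
```

The point of the model is that consuming only moves `last_processed`; space is reclaimed by clearing a whole table, so nothing is left behind for a vacuum to clean up.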

@a1exsh pointed me to the pgq SQL API and promised to help with code review if we want to integrate this into this library.
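
As a rough sketch of what talking to that SQL API could look like from application code: the `pgq.insert_event`, `pgq.next_batch`, `pgq.get_batch_events` and `pgq.finish_batch` functions come from pgq itself, while everything else here is an assumption for illustration (the helper names `enqueue_event` and `process_next_batch`, the `submit` callback standing in for Nakadi submission, JSON as the payload encoding, and a Python DB-API cursor with `%s` placeholders as in psycopg2). It also assumes the queue was created with `pgq.create_queue`, the consumer was registered with `pgq.register_consumer`, and a ticker is running so that batches get formed.

```python
import json


def enqueue_event(cur, queue, event_type, payload):
    """Insert one event into the pgq queue inside the caller's open transaction."""
    cur.execute(
        "SELECT pgq.insert_event(%s, %s, %s)",
        (queue, event_type, json.dumps(payload)),
    )
    return cur.fetchone()[0]  # id of the new event


def process_next_batch(cur, queue, consumer, submit):
    """Fetch the next batch, hand each event to `submit`, then mark the batch done.

    Returns the number of events handled (0 if no batch is available yet).
    """
    cur.execute("SELECT pgq.next_batch(%s, %s)", (queue, consumer))
    batch_id = cur.fetchone()[0]
    if batch_id is None:
        return 0  # nothing to do right now

    cur.execute(
        "SELECT ev_id, ev_type, ev_data FROM pgq.get_batch_events(%s)",
        (batch_id,),
    )
    events = cur.fetchall()
    for _ev_id, ev_type, ev_data in events:
        submit(ev_type, json.loads(ev_data))  # hypothetical Nakadi submission hook

    # Only after every event was submitted do we advance the consumer's position.
    cur.execute("SELECT pgq.finish_batch(%s)", (batch_id,))
    return len(events)
```

If `submit` raises, `finish_batch` is never called, so the same batch should be handed out again on the next attempt. That gives at-least-once delivery, which would need to be checked against what the current event_log based implementation guarantees.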

Goal

Find a way to use a pgq queue instead of the current event_log table for storing events for later Nakadi submission.
This should be optional, as not every user of this library has pgq available or is able to install PostgreSQL extensions.
