[hma] Queue-based hashing & matching #1829

@ThisIsMissEm

Whilst using HTTP requests for hashing does work, it may be better to use a durable queue for requests, so that failures can be retried more easily. One option would be RabbitMQ, though plenty of other queue-based messaging systems exist (Kafka, AWS SQS, Google Pub/Sub, etc.).

This would be an entirely new way to interface with HMA, so wouldn't be a small amount of work, but could provide some reliability guarantees for production workloads.

Ideally there'd be one queue for incoming hashing requests, an internal retry queue (for cases where an error was encountered and the hashing can be retried), as well as an error output queue and a results output queue (these two could be combined into a single queue). It is important that if an error can be retried automatically, HMA does so itself, instead of deferring the requeuing to the integrator.
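
To make the flow above concrete, here is a minimal in-memory sketch of the four queues. Everything in it is an assumption for illustration only: `HashRequest`, `RetryableError`, `MAX_ATTEMPTS`, `process_one`, and `drain` are hypothetical names rather than HMA APIs, the standard library `queue.Queue` stands in for a durable broker, and MD5 stands in for whatever hashing HMA would actually perform.

```python
import hashlib
import queue
from dataclasses import dataclass

MAX_ATTEMPTS = 3  # assumed retry budget, not an existing HMA setting


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a timeout)."""


@dataclass
class HashRequest:
    content_id: str
    payload: bytes
    attempts: int = 0


def process_one(request, hash_fn, retries, results, errors):
    """Hash one request and route the outcome to the matching queue."""
    request.attempts += 1
    try:
        digest = hash_fn(request.payload)
    except RetryableError as exc:
        if request.attempts < MAX_ATTEMPTS:
            # Requeued by the worker itself, not by the integrator.
            retries.put(request)
        else:
            errors.put((request.content_id,
                        f"gave up after {request.attempts} attempts: {exc}"))
    except Exception as exc:
        # Non-retryable failure: report it straight away.
        errors.put((request.content_id, str(exc)))
    else:
        results.put((request.content_id, digest))


def drain(incoming, hash_fn):
    """Process `incoming`, then the retry queue, until both are empty."""
    retries, results, errors = queue.Queue(), queue.Queue(), queue.Queue()
    for source in (incoming, retries):
        while not source.empty():
            process_one(source.get(), hash_fn, retries, results, errors)
    return results, errors


if __name__ == "__main__":
    incoming = queue.Queue()
    incoming.put(HashRequest("photo-1", b"example bytes"))
    results, errors = drain(
        incoming, lambda data: hashlib.md5(data).hexdigest())
    print(results.get_nowait(), errors.empty())
```

A real integration would also need acknowledgements and a delay between retries, both of which depend on the broker chosen.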

Perhaps this could be built generically enough to allow various queue drivers under the hood. It would also require HMA to expose more configuration for this type of integration.
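
As a rough sketch of what "various queue drivers" might look like, the snippet below defines a small driver interface and selects an implementation from configuration. The `QueueDriver` interface, the `InMemoryDriver` class, the `make_driver` helper, and the `QUEUE_DRIVER` config key are all hypothetical names made up for this example; none of them exist in HMA today, and a RabbitMQ, Kafka, or SQS driver would need its own implementation behind the same two methods.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class QueueDriver(ABC):
    """Hypothetical interface that each broker-specific driver implements."""

    @abstractmethod
    def publish(self, queue_name: str, message: bytes) -> None:
        """Append a message to the named queue."""

    @abstractmethod
    def receive(self, queue_name: str) -> Optional[bytes]:
        """Return the next message, or None if the queue is empty."""


class InMemoryDriver(QueueDriver):
    """Non-durable driver, useful only for tests and local development."""

    def __init__(self) -> None:
        self._queues = {}  # queue name -> deque of pending messages

    def publish(self, queue_name: str, message: bytes) -> None:
        self._queues.setdefault(queue_name, deque()).append(message)

    def receive(self, queue_name: str) -> Optional[bytes]:
        pending = self._queues.get(queue_name)
        return pending.popleft() if pending else None


# Registry of available drivers; other brokers would be added here.
DRIVERS = {"memory": InMemoryDriver}


def make_driver(config: dict) -> QueueDriver:
    """Build the driver named by the (assumed) QUEUE_DRIVER config key."""
    name = config.get("QUEUE_DRIVER", "memory")
    try:
        return DRIVERS[name]()
    except KeyError:
        raise ValueError(f"unknown queue driver: {name!r}") from None
```

Keeping the interface to `publish` and `receive` means the hashing and retry logic would not need to know which broker is in use.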

(This is a low-to-medium priority for me personally, as my integrations will probably need some queuing / worker processes anyway at the moment.)

Metadata

Labels

hma (Items related to the hasher-matcher-actioner system), wontfix
