[hma] Queue-based hashing & matching

Whilst HTTP Requests for doing hashing does work, it may be better to use a durable queue for requests, such that failures can be more easily retried. One such option would be using RabbitMQ, though plenty of other queue based messaging systems exist (kafka, AWS SQS, Google PubSub, etc).

This would be an entirely new way to interface with HMA, so wouldn't be a small amount of work, but could provide some reliability guarantees for production workloads.

Ideally there'd be one queue for incoming requests for hashing, an internal retry-queue (in the case that an error was encountered and the hashing can be retried), as well as an error output queue and a results output queue (these could be combined into a single queue). It is important that if the error is retryable automatically, that HMA does do that, instead of deferring the requeuing to the integrator).

Perhaps this could be built in a generic way as to allow various queue drivers under the hood. It would also require HMA to have more configuration for this type of integration.

(this is a ~low-medium priority for me personally, as in my integrations I'll probably need some queuing / worker processes anyway at the moment).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[hma] Queue-based hashing & matching #1829

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[hma] Queue-based hashing & matching #1829

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions