Add X-Ray Trace ID to Redis Cluster #476

@jimleroyer

Description

Add a metadata payload to messages in the Redis buffer queues to store each notification's metadata.

As a sysops in the GCNotify team,
I want to keep track of notification retries when stored in the Redis buffer queues,
So that I can detect poisoned notifications
And move these into a dead letter queue.

As a sysops in the GCNotify team,
I want to keep track of the AWS Message/Trace ID when stored in the Redis buffer queues,
So that I can keep track of the overall message flow using X-Ray.

As a sysops in the GCNotify team,
I want to keep track of the notification creation time when stored in the Redis buffer queues,
So that I can get monitoring insights when looking at stuck notifications
and graph widgets on Redis buffer queues behavior
and expire notifications if necessary (to move into the dead letter queue).

As a developer of Notify,
I would like to be able to trace notifications through the entirety of our system
so that I can better monitor and troubleshoot issues.

WHY are we building?

We need to add safety and fallback mechanisms for notifications that get stuck in our Redis buffer queues, and to pass extra information down the stack, such as the AWS Message/Trace ID.

Assist with performance tuning
A good step to add a holistic view of Notify
Support email tracing

WHAT are we building?

Add a payload around each notification stored in one of the Redis buffer queues.

Hence if a notification stored in the Redis buffer queue currently looks like this:

{
  "id": "f79ef62b-8ce5-442f-92e3-471b83511755",
  "service_id": "...",
   ...
}

Then with the metadata payload, this might be:

{
  "metadata": {
    "created_at": "2024-11-29T00:12:43+00:00",
    "updated_at": "2024-11-29T00:12:43+00:00",
    "retry": 0,
    "extra": {
      "aws_message_id": "2f575c4f-4fd1-45e2-9595-e3f82ef05057"
    }
  },
  "body": {
    "id": "f79ef62b-8ce5-442f-92e3-471b83511755",
    "service_id": "...",
     ...
  }
}
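
The wrapping step could be sketched in Python roughly as follows. This is only an illustration of the shape shown above, not the actual implementation: the function name `wrap_notification` and its parameters are hypothetical, and the sketch assumes the queue items are stored as JSON strings.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def wrap_notification(
    body: dict[str, Any],
    aws_message_id: Optional[str] = None,
    now: Optional[datetime] = None,
) -> str:
    """Wrap a notification in a metadata envelope (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Same timestamp format as the example payload, e.g. 2024-11-29T00:12:43+00:00
    timestamp = now.isoformat(timespec="seconds")

    # "extra" holds flexible information; only set the ID when we have one.
    extra: dict[str, Any] = {}
    if aws_message_id is not None:
        extra["aws_message_id"] = aws_message_id

    envelope = {
        "metadata": {
            "created_at": timestamp,
            "updated_at": timestamp,
            "retry": 0,
            "extra": extra,
        },
        "body": body,
    }
    return json.dumps(envelope)
```

The `now` parameter exists only to make the sketch testable with a fixed time; a real implementation may source its timestamps differently.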

VALUE created by our solution

This will allow us to add extra safety and observability features to our stack.
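
As one illustration of such a safety feature, the `retry` and `created_at` metadata fields could drive the decision to move a notification into the dead letter queue. The sketch below is an assumption about how that check might look: the function name `should_dead_letter` and the thresholds `MAX_RETRIES` and `MAX_AGE` are hypothetical and not part of this issue.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

# Hypothetical thresholds; real values would need to be agreed on by the team.
MAX_RETRIES = 5
MAX_AGE = timedelta(hours=1)


def should_dead_letter(
    metadata: dict[str, Any],
    now: Optional[datetime] = None,
    max_retries: int = MAX_RETRIES,
    max_age: timedelta = MAX_AGE,
) -> bool:
    """Return True if the notification looks poisoned or expired."""
    now = now or datetime.now(timezone.utc)
    # created_at is an ISO 8601 string with a UTC offset, as in the example payload.
    created_at = datetime.fromisoformat(metadata["created_at"])

    too_many_retries = metadata["retry"] >= max_retries
    too_old = now - created_at > max_age
    return too_many_retries or too_old
```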

Acceptance Criteria

  • API repo code changes work with both the old format of the Redis items and this new format.
  • A payload message is wrapped around each notification stored in our Redis buffer queues.
  • The payload has a version attribute in case its format changes to support new fields or modify existing ones.
  • The payload has required fields such as notification creation time, last message update time, number of retries and an extra field storing flexible information such as the AWS Message/Trace ID. The payload contains a body field that is the actual notification message.
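
The backward-compatibility criterion could be met by a reader that accepts both formats. The following is a minimal sketch under the assumption that queue items are JSON strings; the function name `unwrap_notification` is hypothetical, and detecting the new format by the presence of `metadata` and `body` keys is only a heuristic.

```python
import json
from typing import Any, Optional, Tuple


def unwrap_notification(raw: str) -> Tuple[Optional[dict[str, Any]], dict[str, Any]]:
    """Return (metadata, body) for either item format (hypothetical helper).

    Old-format items carry no metadata, so metadata is None for them.
    """
    item = json.loads(raw)
    is_wrapped = (
        isinstance(item, dict)
        and isinstance(item.get("metadata"), dict)
        and isinstance(item.get("body"), dict)
    )
    if is_wrapped:
        return item["metadata"], item["body"]
    # Old format: the item is the notification itself.
    return None, item
```

Checking the version attribute from the acceptance criteria, once its name and location are settled, would likely be a more robust way to tell the two formats apart.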

QA Steps

  • Verify in Redis that new notifications are wrapped with a payload containing metadata fields listed in the acceptance criteria.
  • Make sure the body of the notification is also contained in the messages.
  • Smoke tests are performed through the API to verify basic functionality.
  • Smoke tests are performed through the Admin to verify basic functionality.
  • Rollercoaster tests are performed to make sure there is no performance degradation. This should be compared to a baseline prior to the changes.
  • Cypress tests run fine against these changes.
