Queue and Mail Contract

Antonios Voulvoulis edited this page Apr 14, 2026 · 1 revision

Interface contract for the task queue and mail delivery systems. All modules that interact with these systems MUST follow this contract.

For all tunable variables, see Configuration Reference.


Task Queue System

Reliable async processing with retry and dead-letter queue (DLQ) support.

Flow: pending/ → work/ → success (delete) | failure (retry → DLQ)

Task File Schema

Location: ${NFTBAN_QUEUE_PENDING_DIR}/<task_id>.task

# REQUIRED fields
TASK_ID="<timestamp>-<type>-<rand>"      # Unique identifier (e.g., "20260107T120102Z-feeds_sync-a1b2")
TASK_TYPE="<type>"                        # One of: feeds_sync, geoban_apply, mail_send
TASK_DESCRIPTION="<text>"                 # Human-readable description
TASK_CREATED_EPOCH="<unix_timestamp>"     # When task was created

# RETRY fields (managed by queue processor)
TASK_RETRIES="0"                          # Current retry count (0 = first attempt)
TASK_NEXT_ATTEMPT_EPOCH="0"               # Unix timestamp when task becomes eligible (0 = immediate)
TASK_LAST_ERROR=""                        # Sanitized error from last failure (max 200 chars)

# OPTIONAL fields
TASK_PAYLOAD_FILE=""                      # Path to additional data file (for mail_send tasks)

# DLQ fields (only present after moving to DLQ)
TASK_DLQ_EPOCH="<unix_timestamp>"         # When task was moved to DLQ
TASK_DLQ_REASON="max_retries_exceeded"    # Reason for DLQ placement
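
As an illustration of the schema, the following is a minimal sketch of how a task file with the required and retry fields could be written. The helper name demo_write_task, the write-to-temp-then-rename step, and the four-character random suffix are assumptions made for this example, not the project's implementation. Modules should still enqueue through nftban_queue_add().

```shell
# Hypothetical helper, not part of nftban: writes a task file that follows
# the schema above. Usage: demo_write_task <pending_dir> <task_type> <description>
demo_write_task() {
    local dir="$1" type="$2" desc="$3"
    local now stamp rand id tmp
    now=$(date -u +%s)
    stamp=$(date -u +%Y%m%dT%H%M%SZ)
    # Four hex characters of randomness (assumed format, based on "a1b2").
    rand=$(od -An -N2 -tx1 /dev/urandom | tr -d ' \n')
    # Task files are sourced later, so drop characters that could break quoting.
    desc=$(printf '%s' "$desc" | tr -d '"$`\\\n')
    id="${stamp}-${type}-${rand}"
    tmp="${dir}/.${id}.tmp"
    {
        printf 'TASK_ID="%s"\n' "$id"
        printf 'TASK_TYPE="%s"\n' "$type"
        printf 'TASK_DESCRIPTION="%s"\n' "$desc"
        printf 'TASK_CREATED_EPOCH="%s"\n' "$now"
        printf 'TASK_RETRIES="0"\n'
        printf 'TASK_NEXT_ATTEMPT_EPOCH="0"\n'
        printf 'TASK_LAST_ERROR=""\n'
    } > "$tmp" || return 1
    # A rename within one filesystem is atomic, so readers never see a partial file.
    mv "$tmp" "${dir}/${id}.task" || return 1
    printf '%s\n' "$id"
}
```

Writing to a hidden temporary name first means a processor scanning for *.task files cannot pick up a half-written task.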

Task Types

| Type | Description | Handler Function |
|------|-------------|------------------|
| feeds_sync | Sync threat feeds to nftables | nftban_feeds_sync_to_nftables() |
| geoban_apply | Apply geographic IP blocks | nftban_geoban_apply_to_nftables() |
| mail_send | Send spooled email | Loads payload, calls nftban_mail_send() |
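
The mapping from task type to handler can be sketched as a case statement. The dispatcher name demo_dispatch_task is hypothetical, the three handler definitions are stand-ins so the sketch runs without nftban loaded, and the assumption that TASK_PAYLOAD_FILE points at a sourceable meta.sh is not confirmed by this page.

```shell
# Stand-in handlers so the sketch runs on its own; the real ones live in nftban.
nftban_feeds_sync_to_nftables()   { echo "feeds synced"; }
nftban_geoban_apply_to_nftables() { echo "geoban applied"; }
nftban_mail_send()                { echo "mail sent to $2"; }

# Hypothetical dispatcher: picks a handler based on TASK_TYPE.
demo_dispatch_task() {
    case "$TASK_TYPE" in
        feeds_sync)   nftban_feeds_sync_to_nftables ;;
        geoban_apply) nftban_geoban_apply_to_nftables ;;
        mail_send)
            # Assumption: the payload is the spool's meta.sh (MAIL_TO, MAIL_BODY_FILE).
            [ -r "$TASK_PAYLOAD_FILE" ] || return 1
            . "$TASK_PAYLOAD_FILE" || return 1
            nftban_mail_send "$MAIL_BODY_FILE" "$MAIL_TO"
            ;;
        *)
            echo "unknown task type: $TASK_TYPE" >&2
            return 1
            ;;
    esac
}
```

Rejecting unknown types with a non-zero status lets the caller treat a malformed task like any other failed attempt.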

Public API Functions

nftban_queue_add()

Add a task to the pending queue.

# Signature
nftban_queue_add <task_type> [description] [payload_file]

# Parameters
#   task_type     - Required. One of: feeds_sync, geoban_apply, mail_send
#   description   - Optional. Human-readable description (defaults to task_type)
#   payload_file  - Optional. Path to additional data (required for mail_send)

# Returns
#   stdout: Task ID (e.g., "20260107T120102Z-feeds_sync-a1b2")
#   exit 0: Success

# Example
task_id=$(nftban_queue_add "feeds_sync" "Feed enabled: spamhaus")
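
Callers that store or pass on the returned task ID may want to check its shape first. The sketch below validates the `<timestamp>-<type>-<rand>` layout. The helper name demo_valid_task_id is hypothetical, and the assumption that the random suffix is lowercase alphanumeric is inferred only from the "a1b2" example.

```shell
# Hypothetical helper: succeeds if $1 looks like "<timestamp>-<type>-<rand>".
demo_valid_task_id() {
    local nl='
'
    # Reject multi-line input, since grep would otherwise match any single line.
    case "$1" in
        *"$nl"*) return 1 ;;
    esac
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{8}T[0-9]{6}Z-(feeds_sync|geoban_apply|mail_send)-[0-9a-z]+$'
}
```

A check like this also keeps path fragments such as "../" out of any file name later built from a task ID.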

nftban_queue_process_next()

Process the next eligible task. Called by systemd timer.

# Signature
nftban_queue_process_next

# Returns
#   exit 0: Task processed successfully
#   exit 1: No eligible tasks (all waiting for backoff or queue empty)
#   exit 2: Lock held by another process
#   exit 3: Processing error (task is rescheduled with backoff, or moved to the DLQ once retries are exhausted)

# Behavior
# - Acquires exclusive lock (prevents concurrent processing)
# - Finds oldest task where TASK_NEXT_ATTEMPT_EPOCH <= now
# - Moves task to work/ directory (atomic claim)
# - Executes handler based on TASK_TYPE
# - On success: deletes task, increments nftban_queue_tasks_processed_total
# - On failure: increments TASK_RETRIES, calculates backoff, moves task back to pending/
# - If TASK_RETRIES >= MAX_RETRIES: moves task to DLQ instead
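
A caller can branch on the documented exit codes. The sketch below is one possible way to drain the queue from a script. Both function names (demo_queue_status_label, demo_drain_queue) are hypothetical, and the sketch assumes the nftban queue library has already been sourced so that nftban_queue_process_next is defined.

```shell
# Hypothetical helper: human-readable label for an exit status of
# nftban_queue_process_next, following the Returns list above.
demo_queue_status_label() {
    case "$1" in
        0) echo "processed" ;;
        1) echo "idle" ;;
        2) echo "locked" ;;
        3) echo "error" ;;
        *) echo "unknown" ;;
    esac
}

# Hypothetical drain loop: process tasks until the queue reports anything
# other than success, then print a one-line summary.
demo_drain_queue() {
    local rc=0 n=0
    while :; do
        rc=0
        nftban_queue_process_next || rc=$?
        [ "$rc" -eq 0 ] || break
        n=$((n + 1))
    done
    echo "processed=$n stopped=$(demo_queue_status_label "$rc")"
}
```

Capturing the status with `|| rc=$?` keeps the loop safe under `set -e`, where a bare failing command would abort the script.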

nftban_queue_dlq_retry()

Retry a specific DLQ task.

# Signature
nftban_queue_dlq_retry <task_id>

# Behavior
# - Resets TASK_RETRIES to 0
# - Clears TASK_LAST_ERROR
# - Moves task from DLQ back to pending/

nftban_queue_dlq_list()

List all tasks in dead-letter queue.

# Signature
nftban_queue_dlq_list

# Output
# Lists task ID, type, description, and failure reason for each DLQ task

nftban_queue_dlq_purge()

Purge old DLQ tasks.

# Signature
nftban_queue_dlq_purge [days]

# Parameters
#   days - Optional. Delete tasks older than N days (default: 7)

Retry Policy

| Retry # | Backoff Delay | Total Wait |
|---------|---------------|------------|
| 1 | 60s | 1m |
| 2 | 120s | 3m |
| 3 | 240s | 7m |
| 4+ | DLQ | - |

Formula: backoff = min(BACKOFF_MAX, BACKOFF_BASE * 2^retries)

For default values (MAX_RETRIES, BACKOFF_BASE, BACKOFF_MAX), see Configuration Reference.
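
The formula can be checked with a short worked example. The values BACKOFF_BASE=60 and BACKOFF_MAX=3600 below are assumptions chosen for illustration (the real defaults are in Configuration Reference), and the sketch assumes that retries counts the failures before the current one, so the first failure uses an exponent of 0. Under those assumptions the delays are 60s, 120s and 240s, which sum to 420s (7m), matching the table. The function name demo_backoff is hypothetical.

```shell
# Assumed values for illustration only; see Configuration Reference for defaults.
BACKOFF_BASE=60
BACKOFF_MAX=3600

# Hypothetical helper: prints min(BACKOFF_MAX, BACKOFF_BASE * 2^retries).
demo_backoff() {
    local retries="$1" delay="$BACKOFF_BASE" i=0
    # Double step by step and stop at the cap, which avoids integer overflow
    # for large retry counts.
    while [ "$i" -lt "$retries" ] && [ "$delay" -lt "$BACKOFF_MAX" ]; do
        delay=$((delay * 2))
        i=$((i + 1))
    done
    if [ "$delay" -gt "$BACKOFF_MAX" ]; then
        delay="$BACKOFF_MAX"
    fi
    echo "$delay"
}
```

A task's next eligible time would then be the current epoch plus this delay, stored in TASK_NEXT_ATTEMPT_EPOCH.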


Lock Behavior

  • Lock file: ${NFTBAN_RUN_DIR}/queue.lock
  • Contains: PID on line 1, timestamp on line 2
  • Stale lock detection: the PID recorded in the lock file no longer exists
  • Stuck lock recovery: After LOCK_STUCK_THRESHOLD seconds (see Configuration Reference):
    • SIGTERM to process, wait 5s
    • SIGKILL if still alive
    • Move orphaned work/ tasks back to pending/
    • Remove lock file
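
The stale and stuck checks above can be sketched as a read-only classifier. The function name demo_lock_state is hypothetical, and the sketch assumes the timestamp on line 2 is Unix epoch seconds, which this page does not state. It only reports a state and leaves signalling and cleanup to the real queue processor.

```shell
# Hypothetical helper, not part of nftban.
# Usage: demo_lock_state <lock_file> <stuck_threshold_seconds>
# Prints one of: free, stale, stuck, held
demo_lock_state() {
    local lock="$1" threshold="$2" pid ts now
    if [ ! -f "$lock" ]; then
        echo "free"
        return 0
    fi
    pid=$(sed -n '1p' "$lock")
    ts=$(sed -n '2p' "$lock")
    now=$(date +%s)
    # A missing or non-numeric PID cannot belong to a live process.
    case "$pid" in
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac
    # Assumption: line 2 holds Unix epoch seconds; treat anything else as "just taken".
    case "$ts" in
        ''|*[!0-9]*) ts="$now" ;;
    esac
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    # It also fails for processes owned by another user, so run as the queue owner or root.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "stale"
    elif [ $((now - ts)) -gt "$threshold" ]; then
        echo "stuck"
    else
        echo "held"
    fi
}
```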

Queue Metrics

File: ${NFTBAN_DATA_DIR}/metrics/queue.prom

# Gauges (current state)
nftban_queue_tasks_pending <count>
nftban_queue_tasks_working <count>
nftban_queue_tasks_dlq <count>
nftban_queue_last_run_timestamp <unix_seconds>

# Counters (lifetime totals)
nftban_queue_tasks_processed_total <count>
nftban_queue_tasks_failed_total <count>
nftban_queue_task_retries_total <count>
nftban_queue_dlq_total <count>
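
Because the metrics file is plain "name value" text, a monitoring script can read a single value with awk. This is a minimal sketch; the function name demo_metric_value is hypothetical and it handles only unlabelled metrics such as the ones listed above.

```shell
# Hypothetical helper: prints the value of an unlabelled metric.
# Usage: demo_metric_value <prom_file> <metric_name>
# Exits non-zero if the metric is not present.
demo_metric_value() {
    awk -v name="$2" '
        $1 == name { print $2; found = 1; exit }
        END { exit !found }
    ' "$1"
}
```

For example, `demo_metric_value "${NFTBAN_DATA_DIR}/metrics/queue.prom" nftban_queue_tasks_dlq` would print the current DLQ size, assuming the file exists at that path.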

Mail Delivery System

Overview

Two-tier mail system:

  1. nftban_mail_send() - Single attempt, immediate
  2. nftban_mail_send_with_retry() - Retry wrapper with queue fallback

Quick Setup

# One command to enable all email notifications:
nftban mail setup admin@example.com --all --test

See Configuration Reference for details.

Recipient Resolution

All modules resolve email recipients using this priority:

1. Module-specific override (if set and non-empty)
2. Global fallback: NFTBAN_MAIL_RECIPIENT

Implementation Pattern:

# Correct pattern for module email sending:
local recipient="${MODULE_SPECIFIC_VAR:-${NFTBAN_MAIL_RECIPIENT:-}}"
if [[ -z "$recipient" ]]; then
    echo "Warning: No recipient configured" >&2
    return 0
fi
nftban_mail_send "$content" "$recipient"

Transport Detection Priority

Detected automatically in this order: postfix, sendmail, exim, msmtp, curl (direct SMTP), mailx. For transport configuration, see Configuration Reference.
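
The priority order can be pictured as a first-match search. In the sketch below the function name demo_detect_transport is hypothetical, and it assumes each transport is detected by an executable of the same name on PATH; the real detection may use different binary names or additional checks.

```shell
# Hypothetical helper: prints the first transport from the documented
# priority list whose command is available, or fails if none is found.
demo_detect_transport() {
    local t
    for t in postfix sendmail exim msmtp curl mailx; do
        if command -v "$t" >/dev/null 2>&1; then
            printf '%s\n' "$t"
            return 0
        fi
    done
    return 1
}
```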


Mail API Functions

nftban_mail_send()

Single send attempt. Used internally and for simple cases.

# Signature
nftban_mail_send <content> [recipient]

# Parameters
#   content    - Text content OR path to file
#   recipient  - Email address (defaults to NFTBAN_MAIL_RECIPIENT)

# Returns
#   exit 0: Send succeeded
#   exit 1: Send failed (no retry)

# Behavior
# - Detects best available MTA
# - Wraps content in HTML template (if NFTBAN_MAIL_USE_HTML=YES)
# - Sends via detected transport
# - Does NOT retry on failure

nftban_mail_send_with_retry()

Production wrapper with retry and queue fallback.

# Signature
nftban_mail_send_with_retry <content> [recipient] [subject]

# Parameters
#   content    - Text content OR path to file
#   recipient  - Email address (defaults to NFTBAN_MAIL_RECIPIENT)
#   subject    - Subject line for logging/spooling

# Returns
#   exit 0: Send succeeded (possibly after retries)
#   exit 1: All retries failed, mail spooled to queue

# On Final Failure
#   - Creates mail spool directory: ${NFTBAN_MAIL_SPOOL_DIR}/<mail_id>/
#   - Saves: body.html (or body.txt), meta.sh
#   - Enqueues task: TASK_TYPE=mail_send with payload pointing to spool

Mail Spool Format

Location: ${NFTBAN_MAIL_SPOOL_DIR}/<mail_id>/

mail-20260107T120102Z-a1b2/
├── body.html          # Or body.txt
└── meta.sh            # Sourceable metadata

meta.sh contents:

MAIL_TO="admin@example.com"
MAIL_SUBJECT="[ALERT] Service failed"
MAIL_BODY_FILE="/var/lib/nftban/mailspool/mail-xxx/body.html"
MAIL_CREATED_EPOCH="1767787262"
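
Since meta.sh is sourced, a consumer may want to confirm that it contains nothing but the four assignments shown above before loading it. The following is a minimal sketch of such a check. The function name demo_meta_is_safe and the rule that values may not contain double quotes, dollar signs, backticks or backslashes are assumptions for this example, not documented nftban behavior.

```shell
# Hypothetical helper: succeeds only if every non-empty line of the file is a
# plain MAIL_* assignment with no quoting or expansion characters in the value.
demo_meta_is_safe() {
    [ -r "$1" ] || return 1
    # grep -v selects lines that do NOT match the allowed shape;
    # any such line makes the file unsafe to source.
    ! grep -Evq '^(MAIL_(TO|SUBJECT|BODY_FILE|CREATED_EPOCH)="[^"$`\]*")?$' "$1"
}
```

A caller would then run `demo_meta_is_safe "$meta" && . "$meta"` and treat a rejected file as a failed task.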

Success/Failure Criteria

| Scenario | Result | Action |
|----------|--------|--------|
| MTA accepts message | Success | Return 0, increment success counter |
| MTA rejects (auth, dns, connect) | Failure | Retry with backoff |
| No MTA available | Failure | Spool immediately |
| All retries exhausted | Failure | Spool to queue |

Mail Metrics

File: ${NFTBAN_DATA_DIR}/metrics/mail.prom

# Counters by transport
nftban_mail_send_attempts_total{transport="sendmail|postfix|curl|..."} <count>
nftban_mail_send_success_total{transport="..."} <count>
nftban_mail_send_failures_total{transport="..."} <count>

# Gauge
nftban_mail_last_success_timestamp <unix_seconds>
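
The per-transport counters can be totalled across labels with awk. This is a sketch only; the function name demo_metric_sum is hypothetical, and it assumes each sample sits on one line with the value as the last field, as in the layout above.

```shell
# Hypothetical helper: sums a counter over all of its label sets.
# Usage: demo_metric_sum <prom_file> <metric_name>
# Exits non-zero if no sample with that name is found.
demo_metric_sum() {
    awk -v name="$2" '
        # Match either the bare name or the name followed by a label block.
        $1 == name || index($1, name "{") == 1 { sum += $NF; seen = 1 }
        END { if (!seen) exit 1; print sum }
    ' "$1"
}
```

Comparing the totals of nftban_mail_send_failures_total and nftban_mail_send_attempts_total gives a rough failure ratio for alerting.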

Logging Format

File: ${NFTBAN_LOG_DIR}/mail.log

# Attempt
[2026-01-07 12:01:02] [INFO] MAIL_SEND_ATTEMPT task_id=inline attempt=1 transport=postfix to=admin@...

# Success
[2026-01-07 12:01:03] [INFO] MAIL_SEND_RESULT task_id=inline status=success transport=postfix attempt=1

# Failure
[2026-01-07 12:01:15] [ERROR] MAIL_SEND_RESULT task_id=inline status=failed transport=postfix retries=3 last_error=connect_timeout
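
Each log line carries its details as space-separated key=value pairs, so individual fields are easy to pull out in a script. The sketch below is one way to do that; the function name demo_log_field is hypothetical, and it assumes values never contain spaces, as in the examples above.

```shell
# Hypothetical helper: reads one log line on stdin and prints the value of
# the given key. Usage: ... | demo_log_field <key>
demo_log_field() {
    # Put each space-separated token on its own line, then keep the first
    # token that starts with "<key>=" and strip that prefix.
    tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}
```

For instance, piping the failure line above through `demo_log_field last_error` yields the error code that caused the final failure.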

Module Integration Guide

Adding a Task to Queue (Correct Pattern)

# Always guard queue function availability, and report "queued" only if the add succeeded
if command -v nftban_queue_add &>/dev/null &&
   nftban_queue_add "feeds_sync" "Feed updated: $feed_name" >/dev/null 2>&1; then
    echo "Update queued (processed every 2 minutes)"
else
    echo "Update will run on next timer cycle"
fi

Sending Email with Retry (Correct Pattern)

# For important notifications, use retry wrapper
if type nftban_mail_send_with_retry &>/dev/null; then
    nftban_mail_send_with_retry "$content" "$recipient" "Alert subject"
else
    # Fallback to single attempt
    nftban_mail_send "$content" "$recipient" || true
fi

Checking Queue Health

# In health checks, verify queue is not stuck
pending=$(nftban_queue_count 2>/dev/null || echo "0")
dlq=$(nftban_queue_count_dlq 2>/dev/null || echo "0")

if [[ $dlq -gt 10 ]]; then
    echo "WARNING: $dlq tasks in dead-letter queue"
fi

Security Notes

  1. Task files are sourceable - Never put untrusted data in task files
  2. Error messages are sanitized - Max 200 chars, special chars stripped
  3. No credentials in logs - Recipient emails are truncated (admin@...)
  4. Lock files use PID - Stale detection prevents orphan locks
  5. Permissions - Queue directories are 750, owned by nftban:nftban
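
Notes 1 and 2 can be illustrated with a sanitizer for values that end up in TASK_LAST_ERROR. This is a sketch under assumptions: the function name demo_sanitize_error is hypothetical, and the allow-list of characters is chosen for this example because the page does not say which special characters are stripped.

```shell
# Hypothetical helper: reduces an error message to a single line of at most
# 200 characters drawn from a conservative allow-list.
demo_sanitize_error() {
    printf '%s' "$1" |
        tr '\n' ' ' |
        tr -cd 'A-Za-z0-9 _.,:/=@-' |
        cut -c1-200
}
```

Dropping quotes, dollar signs, backticks and semicolons means the result can be written inside double quotes in a sourceable task file without being interpreted as code.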
