@syhan syhan commented Jan 29, 2026

Description

This PR adds HMAC (Hash-based Message Authentication Code) hash functions (hmac-sha256 and hmac-sha512) to the redaction processor, enabling GDPR-compliant pseudonymization of sensitive data such as IP addresses in telemetry data.

Problem:
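The current redaction processor supports simple, unkeyed hash functions (MD5, SHA1, SHA3). These are vulnerable to precomputation (rainbow table) attacks on low-entropy data such as IP addresses: IPv4 has only 2^32 (~4.3 billion) possible values, so it is feasible to pre-compute the hash of every address and reverse any redacted value by lookup. Output that can be reversed this easily arguably falls short of pseudonymization as defined in GDPR Article 4(5), which requires that the data can no longer be attributed to a person without additional information that is kept separately.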
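The enumeration attack described above can be illustrated with a minimal Go sketch. It is not taken from the processor: `hashPlain` and `recoverIP` are hypothetical names, SHA-256 stands in for any unkeyed hash, and the search is limited to 192.168.0.0/16 so that it finishes instantly. Covering all 2^32 addresses works the same way, only with more iterations.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashPlain mimics an unkeyed redaction hash: hex(SHA-256(value)).
// (Hypothetical helper, used here only for illustration.)
func hashPlain(value string) string {
	sum := sha256.Sum256([]byte(value))
	return hex.EncodeToString(sum[:])
}

// recoverIP tries every address in 192.168.0.0/16 and returns the one
// whose hash matches target. No secret is needed to run this search.
func recoverIP(target string) (string, bool) {
	for c := 0; c < 256; c++ {
		for d := 0; d < 256; d++ {
			candidate := fmt.Sprintf("192.168.%d.%d", c, d)
			if hashPlain(candidate) == target {
				return candidate, true
			}
		}
	}
	return "", false
}

func main() {
	// The value an observer would see in the redacted telemetry.
	redacted := hashPlain("192.168.17.42")

	ip, ok := recoverIP(redacted)
	fmt.Println(ip, ok) // prints "192.168.17.42 true"
}
```

With a keyed hash, the same loop would need the secret key to compute each candidate, which is the gap this PR aims to close.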

Solution:
HMAC addresses this security gap by mixing a secret key into the hash. Without the key, an attacker cannot pre-compute candidate hashes, so recovering original values becomes computationally infeasible. The output remains deterministic (same input + same key = same output), which pattern analysis requires.

Changes:

  • config.go: Added HMACSHA256 and HMACSHA512 hash function types, HMACKey configuration field
  • processor.go: Implemented HMAC hashing logic with hashStringHMAC helper function
  • processor_test.go: Added 4 test cases for HMAC functionality
  • README.md: Added HMAC documentation, including security considerations and key management

Usage Example:

processors:
  redaction:
    allow_all_keys: true
    blocked_values:
      - "(?:[0-9]{1,3}\\.){3}[0-9]{1,3}"  # IPv4 addresses
    hash_function: hmac-sha256  # or hmac-sha512
    hmac_key: "${env:REDACTION_SECRET_KEY}"
    summary: silent

# Key generation
export REDACTION_SECRET_KEY=$(openssl rand -hex 32)

Link to tracking issue

N/A - This is a new feature addition for GDPR compliance. No existing issue tracks this enhancement.

Testing

Added 4 unit tests covering:

  • HMAC-SHA256 basic functionality
  • HMAC-SHA512 basic functionality
  • Hash consistency (same input + same key = same output)
  • Different keys produce different hashes

All new and existing tests pass:

=== RUN   TestRedactSummaryDebugHashHMACSHA256
--- PASS: TestRedactSummaryDebugHashHMACSHA256 (0.00s)
=== RUN   TestRedactSummaryDebugHashHMACSHA512
--- PASS: TestRedactSummaryDebugHashHMACSHA512 (0.00s)
=== RUN   TestHMACConsistency
--- PASS: TestHMACConsistency (0.00s)
=== RUN   TestHMACDifferentKeys
--- PASS: TestHMACDifferentKeys (0.00s)
PASS
ok  	github.com/open-telemetry/opentelemetry-collector-contrib/processor/redactionprocessor	0.255s

Manual testing:

  • Built and verified binary includes HMAC functions
  • Tested HMAC-SHA256 and HMAC-SHA512 with sample configurations
  • Verified consistent output with same key
  • Verified different output with different keys

Documentation

Updated processor/redactionprocessor/README.md with:

  • Complete HMAC section explaining security advantages over simple hash functions
  • Configuration examples for both hmac-sha256 and hmac-sha512
  • Key generation and management best practices
  • GDPR Article 4(5) compliance explanation
  • Security recommendations for production use
  • Performance impact notes

Key Benefits:

  • ✅ Rainbow table resistant (pre-computed hash tables are useless without the key)
  • ✅ Supports GDPR pseudonymization (Article 4(5)), provided the key is stored separately from the data
  • ✅ Backward compatible (existing configurations continue to work)
  • ✅ Consistent output (same input with same key = same hash, enabling analytics)
  • ✅ Minimal performance overhead (~2x slower than MD5, negligible in most deployments)

Security Considerations:

  • HMAC-SHA256 recommended for most use cases
  • Keys should be at least 256 bits (32 bytes)
  • Keys must be stored separately from log data (e.g., Kubernetes Secrets, Vault)
  • Regular key rotation recommended per security policy
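The key-length recommendation above could be enforced at startup. The sketch below is hypothetical and not part of this PR: `parseHMACKey` and `minKeyBytes` are invented names, and it assumes the key is supplied as a hex string (as produced by `openssl rand -hex 32`) and decoded to raw bytes, which may not match how the processor actually interprets `hmac_key`.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"os"
)

// minKeyBytes is the recommended minimum key size: 256 bits.
const minKeyBytes = 32

// parseHMACKey decodes a hex-encoded key and rejects keys shorter than
// minKeyBytes. (Hypothetical helper; hex decoding is an assumption.)
func parseHMACKey(hexKey string) ([]byte, error) {
	key, err := hex.DecodeString(hexKey)
	if err != nil {
		return nil, fmt.Errorf("hmac key is not valid hex: %w", err)
	}
	if len(key) < minKeyBytes {
		return nil, fmt.Errorf("hmac key is %d bytes, want at least %d", len(key), minKeyBytes)
	}
	return key, nil
}

func main() {
	// The key is read from the environment, never from the config file itself.
	key, err := parseHMACKey(os.Getenv("REDACTION_SECRET_KEY"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid key:", err)
		return
	}
	fmt.Printf("loaded %d-byte key\n", len(key))
}
```

Rejecting short or malformed keys early avoids silently hashing with a weak key.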

@syhan syhan requested review from a team, TylerHelmuth, dmitryax and mx-psi as code owners January 29, 2026 04:18
linux-foundation-easycla bot commented Jan 29, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@github-actions github-actions bot added the "first-time contributor" label (PRs made by new contributors) Jan 29, 2026
Welcome, contributor! Thank you for your contribution to opentelemetry-collector-contrib.

A maintainer will review your pull request soon. Thank you for helping make OpenTelemetry better!

@github-actions github-actions bot added the "processor/redaction" label (Redaction processor) Jan 29, 2026
@github-actions github-actions bot requested a review from iblancasa January 29, 2026 04:18
@syhan syhan changed the title from "Add hmac hash support" to "[processor/redaction] Add HMAC support for GDPR compliance" Jan 29, 2026
@iblancasa iblancasa left a comment

LGTM but would like to see what other code owners think.
