30 changes: 30 additions & 0 deletions .chloggen/redaction-hmac-support.yaml
@@ -0,0 +1,30 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: "enhancement"

# The name of the component, or a single word describing the area of concern, (e.g. receiver/filelog)
component: "processor/redaction"

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Add HMAC hash functions (`hmac-sha256` and `hmac-sha512`) for GDPR-compliant pseudonymization of sensitive data like IP addresses"

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [45715]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
  HMAC functions hash with a secret key, which makes the output resistant to rainbow-table and brute-force lookups: without the key, recovering the original values is computationally infeasible.
  This supports pseudonymization in the sense of GDPR Article 4(5) while keeping hashes consistent for pattern analysis.
  Configure with `hash_function: hmac-sha256` (or `hmac-sha512`) and `hmac_key: "${env:REDACTION_SECRET_KEY}"`.

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
41 changes: 38 additions & 3 deletions processor/redactionprocessor/README.md
@@ -164,7 +164,42 @@ The value is then masked according to the configuration.
`hash_function` defines the function for hashing values of matched keys or matches in values
instead of masking them with a fixed string. By default, no hash function is used
and masking with a fixed string is performed. The supported hash functions
-are `md5`, `sha1` and `sha3` (SHA-256).
+are `md5`, `sha1`, `sha3` (SHA3-256), `hmac-sha256`, and `hmac-sha512`.

### HMAC Hash Functions

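For low-entropy data such as IP addresses, prefer the HMAC (Hash-based Message Authentication Code) functions over the unkeyed hash functions (`md5`, `sha1`, `sha3`). The set of possible inputs is small enough that an unkeyed hash can be reversed by hashing every candidate value; HMAC mixes a secret key into the hash, so that lookup is not possible without the key. `hmac_key` is required for these functions and must be at least 32 bytes long for `hmac-sha256` and 64 bytes long for `hmac-sha512`.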

**Configuration Example:**

```yaml
processors:
  redaction:
    allow_all_keys: true
    blocked_values:
      - "(?:[0-9]{1,3}\\.){3}[0-9]{1,3}" # IPv4 addresses
      - "(?:[0-9a-fA-F]{1,4}:){7}[0-9a-fA-F]{1,4}" # IPv6 addresses (full, uncompressed form only)
    hash_function: hmac-sha256 # or hmac-sha512
    hmac_key: "${env:REDACTION_SECRET_KEY}" # Load from environment variable
    summary: silent
```
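
As a rough illustration of what the HMAC options do to each matched value, the sketch below computes a hex-encoded HMAC-SHA256 digest with Go's standard library. The helper name `pseudonymize` and both keys are hypothetical and exist only for this example; they are not part of the processor's API.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// pseudonymize is a hypothetical helper: it returns the hex-encoded
// HMAC-SHA256 of value under the given secret key.
func pseudonymize(value, key string) string {
	mac := hmac.New(sha256.New, []byte(key))
	mac.Write([]byte(value)) // hash.Hash.Write never returns an error
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	// Example keys only (32 bytes each); never hard-code a real secret.
	keyA := "0123456789abcdef0123456789abcdef"
	keyB := "fedcba9876543210fedcba9876543210"

	first := pseudonymize("192.0.2.10", keyA)
	again := pseudonymize("192.0.2.10", keyA)
	other := pseudonymize("192.0.2.10", keyB)

	fmt.Println(first == again) // true: same key and input give the same digest
	fmt.Println(first == other) // false: a different key gives a different digest
	fmt.Println(len(first))     // 64: a SHA-256 digest is 32 bytes, or 64 hex characters
}
```

Because the digest is deterministic for a given key, the same IP address maps to the same pseudonym across spans, logs, and metrics, which is what keeps pattern analysis possible. Rotating the key changes every pseudonym.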

**Key Management:**

```bash
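# Generate a strong random key once and store it securely.
# 32 random bytes print as 64 hex characters, which meets the minimum
# key length for both hmac-sha256 (32) and hmac-sha512 (64).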
export REDACTION_SECRET_KEY=$(openssl rand -hex 32)

# Use the key when running the collector
./otelcol-contrib --config=config.yaml

# For production, store keys in:
# - Kubernetes Secrets
# - HashiCorp Vault
# - AWS Secrets Manager
# - Azure Key Vault
# Never commit keys to version control!
```

The `url_sanitizer` configuration enables sanitization of URLs in specified attributes by removing potentially sensitive information like UUIDs, timestamps, and other non-essential path segments. This is particularly useful for reducing cardinality in telemetry data while preserving the essential parts of URLs for troubleshooting.

@@ -200,7 +235,7 @@ Example configuration with database sanitization:
processors:
redaction:
# ... other redaction settings ...

# Database sanitization configuration
db_sanitizer:
# sanitize_span_name controls whether span names should be sanitized for database queries (default: true)
@@ -215,7 +250,7 @@ processors:
attributes: ["db.statement", "redis.command"]
memcached:
enabled: true
attributes: ["db.statement", "memcached.command"]
mongo:
enabled: true
attributes: ["db.statement", "mongodb.query"]
48 changes: 43 additions & 5 deletions processor/redactionprocessor/config.go
@@ -9,6 +9,8 @@ import (
"fmt"
"strings"

"go.opentelemetry.io/collector/config/configopaque"

"github.com/open-telemetry/opentelemetry-collector-contrib/processor/redactionprocessor/internal/db"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/redactionprocessor/internal/url"
)
@@ -18,10 +20,12 @@ var _ encoding.TextUnmarshaler = (*HashFunction)(nil)
type HashFunction string

const (
-	None HashFunction = ""
-	SHA1 HashFunction = "sha1"
-	SHA3 HashFunction = "sha3"
-	MD5  HashFunction = "md5"
+	None       HashFunction = ""
+	SHA1       HashFunction = "sha1"
+	SHA3       HashFunction = "sha3"
+	MD5        HashFunction = "md5"
+	HMACSHA256 HashFunction = "hmac-sha256"
+	HMACSHA512 HashFunction = "hmac-sha512"
)

type Config struct {
@@ -44,6 +48,11 @@ type Config struct {
// and masking with a fixed string is performed.
HashFunction HashFunction `mapstructure:"hash_function"`

// HMACKey is the secret key used for HMAC hashing when HashFunction is set to hmac-sha256 or hmac-sha512.
// This should be loaded from a secure source like environment variables.
// Minimum length: 32 bytes for HMAC-SHA256, 64 bytes for HMAC-SHA512.
HMACKey configopaque.String `mapstructure:"hmac_key"`

// IgnoredKeys is a list of span attribute keys that are not redacted.
// Span attributes in this list are allowed to pass through the filter
// without being changed or removed.
@@ -101,9 +110,38 @@ func (u *HashFunction) UnmarshalText(text []byte) error {
case strings.ToLower(SHA3.String()):
*u = SHA3
return nil
case strings.ToLower(HMACSHA256.String()):
*u = HMACSHA256
return nil
case strings.ToLower(HMACSHA512.String()):
*u = HMACSHA512
return nil
case strings.ToLower(None.String()):
*u = None
return nil
}
-	return fmt.Errorf("unknown HashFunction %s, allowed functions are %s, %s and %s", str, SHA1, SHA3, MD5)
+	return fmt.Errorf("unknown HashFunction %s, allowed functions are %s, %s, %s, %s and %s", str, SHA1, SHA3, MD5, HMACSHA256, HMACSHA512)
}

// Validate checks the configuration. When an HMAC hash function is selected,
// it requires a non-empty hmac_key of sufficient length.
func (cfg *Config) Validate() error {
	// Validate HMAC key requirements
	if cfg.HashFunction == HMACSHA256 || cfg.HashFunction == HMACSHA512 {
		key := string(cfg.HMACKey)
		if key == "" {
			return fmt.Errorf("hmac_key must not be empty when hash_function is %s", cfg.HashFunction)
		}

		// Enforce minimum key lengths for security.
		// The length is measured in bytes of the configured string.
		minLength := 32
		if cfg.HashFunction == HMACSHA512 {
			minLength = 64
		}

		if len(key) < minLength {
			return fmt.Errorf("hmac_key must be at least %d bytes long for %s, got %d bytes", minLength, cfg.HashFunction, len(key))
		}
	}

	return nil
}
101 changes: 101 additions & 0 deletions processor/redactionprocessor/config_test.go
@@ -107,3 +107,104 @@ func TestValidateConfig(t *testing.T) {
})
}
}

func TestValidateHMACKey(t *testing.T) {
tests := []struct {
name string
config *Config
expectError bool
errorContains string
}{
{
name: "valid HMAC-SHA256 with sufficient key length",
config: &Config{
HashFunction: HMACSHA256,
HMACKey: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", // 32 bytes
},
expectError: false,
},
{
name: "valid HMAC-SHA512 with sufficient key length",
config: &Config{
HashFunction: HMACSHA512,
HMACKey: "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb", // 64 bytes
},
expectError: false,
},
{
name: "empty key with HMAC-SHA256",
config: &Config{
HashFunction: HMACSHA256,
HMACKey: "",
},
expectError: true,
errorContains: "hmac_key must not be empty",
},
{
name: "empty key with HMAC-SHA512",
config: &Config{
HashFunction: HMACSHA512,
HMACKey: "",
},
expectError: true,
errorContains: "hmac_key must not be empty",
},
{
name: "key too short for HMAC-SHA256",
config: &Config{
HashFunction: HMACSHA256,
HMACKey: "short-key",
},
expectError: true,
errorContains: "hmac_key must be at least 32 bytes long",
},
{
name: "key too short for HMAC-SHA512",
config: &Config{
HashFunction: HMACSHA512,
HMACKey: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", // 32 bytes, too short for SHA512
},
expectError: true,
errorContains: "hmac_key must be at least 64 bytes long",
},
{
name: "no validation for non-HMAC hash functions",
config: &Config{
HashFunction: MD5,
HMACKey: "",
},
expectError: false,
},
{
name: "no validation when hash function is None",
config: &Config{
HashFunction: None,
HMACKey: "",
},
expectError: false,
},
{
name: "short key with special characters is rejected",
config: &Config{
HashFunction: HMACSHA256,
HMACKey: "!@#$%^&*()_+-=[]{}|;:,.<>?",
},
expectError: true,
errorContains: "hmac_key must be at least 32 bytes long",
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
err := tt.config.Validate()
if tt.expectError {
assert.Error(t, err)
if tt.errorContains != "" {
assert.Contains(t, err.Error(), tt.errorContains)
}
} else {
assert.NoError(t, err)
}
})
}
}
1 change: 1 addition & 0 deletions processor/redactionprocessor/go.mod
@@ -8,6 +8,7 @@ require (
github.com/stretchr/testify v1.11.1
go.opentelemetry.io/collector/component v1.50.1-0.20260121161034-55399d4743af
go.opentelemetry.io/collector/component/componenttest v0.144.1-0.20260121161034-55399d4743af
go.opentelemetry.io/collector/config/configopaque v1.50.1-0.20260121161034-55399d4743af
go.opentelemetry.io/collector/confmap v1.50.1-0.20260121161034-55399d4743af
go.opentelemetry.io/collector/confmap/xconfmap v0.144.1-0.20260121161034-55399d4743af
go.opentelemetry.io/collector/consumer v1.50.1-0.20260121161034-55399d4743af
2 changes: 2 additions & 0 deletions processor/redactionprocessor/go.sum


14 changes: 14 additions & 0 deletions processor/redactionprocessor/processor.go
@@ -6,15 +6,19 @@ package redactionprocessor // import "github.com/open-telemetry/opentelemetry-co
//nolint:gosec
import (
"context"
"crypto/hmac"
"crypto/md5"
"crypto/sha1"
"crypto/sha256"
"crypto/sha512"
"encoding/hex"
"fmt"
"hash"
"regexp"
"sort"
"strings"

"go.opentelemetry.io/collector/config/configopaque"
"go.opentelemetry.io/collector/pdata/pcommon"
"go.opentelemetry.io/collector/pdata/plog"
"go.opentelemetry.io/collector/pdata/pmetric"
@@ -398,6 +402,10 @@ func (s *redaction) maskValue(val string, regex *regexp.Regexp) string {
return hashString(match, sha3.New256())
case MD5:
return hashString(match, md5.New())
case HMACSHA256:
return hashStringHMAC(match, s.config.HMACKey, sha256.New)
case HMACSHA512:
return hashStringHMAC(match, s.config.HMACKey, sha512.New)
default:
return "****"
}
@@ -410,6 +418,12 @@ func hashString(input string, hasher hash.Hash) string {
return hex.EncodeToString(hasher.Sum(nil))
}

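// hashStringHMAC returns the hex-encoded HMAC of input, keyed with key and
// built on the hash constructor newHash (for example sha256.New).
func hashStringHMAC(input string, key configopaque.String, newHash func() hash.Hash) string {
	h := hmac.New(newHash, []byte(string(key)))
	h.Write([]byte(input)) // hash.Hash.Write never returns an error
	return hex.EncodeToString(h.Sum(nil))
}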

// addMetaAttrs adds diagnostic information about redacted or masked attribute keys
func (s *redaction) addMetaAttrs(redactedAttrs []string, attributes pcommon.Map, valuesAttr, countAttr string) {
redactedCount := int64(len(redactedAttrs))