feat(loki): Add regex-based log masking for sensitive data protection#491

Open
Marie673 wants to merge 21 commits into grafana:main from Marie673:main

@Marie673 Marie673 commented Jan 16, 2026

feat(loki): Add server-side log masking for sensitive data protection

Summary

Add regex-based log masking capability to the Loki log query tool. This feature automatically masks sensitive information (PII, authentication tokens, etc.) in log data before returning results to AI assistants, enhancing security and compliance.

Key Design Decision: Masking is configured at server startup by system administrators. AI assistants and LLMs cannot modify or override masking settings through tool calls. This keeps organizational security policies consistently enforced and prevents unintended data exposure.

Changes

New Files

| File | Description |
|------|-------------|
| tools/loki_masking.go | Core masking logic and builtin pattern definitions |
| tools/loki_masking_unit_test.go | Comprehensive unit tests, benchmarks, SLO compliance tests |

Modified Files

| File | Description |
|------|-------------|
| tools/loki.go | Integration of masking processing via Context-based masker injection |
| tools/loki_test.go | Integration tests for masking functionality |
| mcpgrafana.go | LokiMaskingConfig struct, WithMasker/MaskerFromContext context functions |
| cmd/mcp-grafana/main.go | CLI flags, environment variable support, masker initialization |
| mcpgrafana_test.go | Context management tests |
| README.md | Documentation for masking feature and configuration |
| examples/tls_example.go | Context function signature updates |

Features

Server-Level Configuration

Masking is configured at server startup via environment variables:

# Enable masking
export LOKI_MASKING_ENABLED=true

# Specify patterns (comma-separated)
export LOKI_MASKING_PATTERNS=email,credit_card,ip_address,jwt_token

Or via CLI flags:

./mcp-grafana \
  --loki-masking-enabled \
  --loki-masking-patterns=email,credit_card,ip_address,jwt_token

Or via Docker Compose:

environment:
  - LOKI_MASKING_ENABLED=true
  - LOKI_MASKING_PATTERNS=email,credit_card,ip_address,jwt_token
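
As a rough illustration of how the two environment variables above could be turned into a configuration value, here is a minimal sketch. The `maskingConfig` type and `parseMaskingConfig` function are hypothetical names invented for this example; the PR's actual `LokiMaskingConfig` fields and `ExtractLokiMaskingFromEnv` signature are not shown in this description, and the parsing rules (boolean syntax, whitespace trimming) are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// maskingConfig is a hypothetical stand-in for the PR's LokiMaskingConfig;
// the real struct's fields are not shown in the description.
type maskingConfig struct {
	Enabled  bool
	Patterns []string
}

// parseMaskingConfig builds a config from the raw values of
// LOKI_MASKING_ENABLED and LOKI_MASKING_PATTERNS.
func parseMaskingConfig(enabled, patterns string) maskingConfig {
	cfg := maskingConfig{}
	// Any value strconv.ParseBool rejects (including "") leaves masking off,
	// which matches the disabled-by-default behaviour.
	if on, err := strconv.ParseBool(strings.TrimSpace(enabled)); err == nil {
		cfg.Enabled = on
	}
	// Split the comma-separated list and drop empty items.
	for _, p := range strings.Split(patterns, ",") {
		if p = strings.TrimSpace(p); p != "" {
			cfg.Patterns = append(cfg.Patterns, p)
		}
	}
	return cfg
}

func main() {
	cfg := parseMaskingConfig(
		os.Getenv("LOKI_MASKING_ENABLED"),
		os.Getenv("LOKI_MASKING_PATTERNS"),
	)
	fmt.Printf("enabled=%v patterns=%v\n", cfg.Enabled, cfg.Patterns)
}
```

Taking the raw strings as parameters, rather than calling `os.Getenv` inside the parser, keeps the function easy to test without touching the process environment.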

Builtin Patterns

The following sensitive data patterns are provided as predefined options:

| Pattern ID | Description | Example |
|------------|-------------|---------|
| email | Email addresses | user@example.com → [MASKED:email] |
| phone | E.164 international phone numbers | +819012345678 → [MASKED:phone] |
| credit_card | Credit card numbers | 4111-1111-1111-1111 → [MASKED:credit_card] |
| ip_address | IPv4/IPv6 addresses | 192.168.1.1 → [MASKED:ip_address] |
| mac_address | MAC addresses | 00:1A:2B:3C:4D:5E → [MASKED:mac_address] |
| api_key | API keys/tokens | api_key=abc123... → [MASKED:api_key] |
| jwt_token | JWT tokens | eyJ... → [MASKED:jwt_token] |
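
To make the `[MASKED:<pattern id>]` replacement format concrete, the following sketch applies two patterns to a log line. The regular expressions here are simplified illustrations written for this example (IPv4 only, loose email matching); they are not the expressions defined in tools/loki_masking.go, and `maskLine` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern pairs a builtin-style identifier with a compiled expression.
type pattern struct {
	id string
	re *regexp.Regexp
}

// Illustrative expressions only; the PR's real builtin patterns may differ.
var patterns = []pattern{
	{"email", regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)},
	{"ip_address", regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)},
}

// maskLine replaces every match with a literal "[MASKED:<id>]" marker.
// ReplaceAllLiteralString is used so "$" in the replacement is never
// interpreted as a capture-group reference.
func maskLine(line string) string {
	for _, p := range patterns {
		line = p.re.ReplaceAllLiteralString(line, "[MASKED:"+p.id+"]")
	}
	return line
}

func main() {
	fmt.Println(maskLine("login user@example.com from 192.168.1.1"))
	// prints: login [MASKED:email] from [MASKED:ip_address]
}
```

The patterns are kept in a slice rather than a map so they are applied in a fixed order, which makes the output deterministic when matches could overlap.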

Security Model

| Aspect | Detail |
|--------|--------|
| Configuration by | System administrators only |
| Configuration timing | Server startup only |
| Configuration method | Environment variables / CLI flags |
| AI/LLM access to settings | Not possible |
| Modification via tool calls | Not possible (no masking parameter in API) |

Architecture

Context-Based Masker Injection

Server Startup → Load Config → Create LogMasker → Inject into Context
                                                         ↓
MCP Client → query_loki_logs → Fetch Logs → Apply Masker from Context → Return Masked Logs

When enabled, masking is applied automatically: AI assistants receive masked logs and have no way to bypass or modify the masking configuration.
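
The injection flow described above can be sketched with Go's standard `context` package. The function names mirror the `WithMasker` and `MaskerFromContext` helpers listed for mcpgrafana.go, but the signatures, the `Masker` interface, `wordMasker`, `newServerContext`, and `applyMasking` are all assumptions made for this example, not the PR's code.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Masker is a hypothetical interface standing in for the PR's LogMasker.
type Masker interface {
	Mask(line string) string
}

// maskerKey is an unexported key type so other packages cannot collide with it.
type maskerKey struct{}

// WithMasker stores the masker in the context (signature is an assumption).
func WithMasker(ctx context.Context, m Masker) context.Context {
	return context.WithValue(ctx, maskerKey{}, m)
}

// MaskerFromContext returns the stored masker, or nil when none was set.
func MaskerFromContext(ctx context.Context) Masker {
	m, _ := ctx.Value(maskerKey{}).(Masker)
	return m
}

// wordMasker is a toy masker that hides one fixed word.
type wordMasker struct{ word string }

func (w wordMasker) Mask(line string) string {
	return strings.ReplaceAll(line, w.word, "[MASKED]")
}

// newServerContext mimics server startup: the masker is injected once.
func newServerContext(m Masker) context.Context {
	return WithMasker(context.Background(), m)
}

// applyMasking mimics the tool handler: it only reads the masker from the
// context, so a tool call has no parameter that could change it.
func applyMasking(ctx context.Context, lines []string) []string {
	m := MaskerFromContext(ctx)
	if m == nil {
		return lines // masking disabled: entries are returned unchanged
	}
	out := make([]string, len(lines))
	for i, l := range lines {
		out[i] = m.Mask(l)
	}
	return out
}

func main() {
	ctx := newServerContext(wordMasker{word: "hunter2"})
	fmt.Println(applyMasking(ctx, []string{"password=hunter2"}))
}
```

Because the handler receives only the context and the fetched entries, the security property in the table above follows from the function signature rather than from a runtime check.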

Technical Details

Performance

  • Maximum pattern count: 20
  • SLO: 100 entries × 20 patterns completed within 100ms
  • Regex patterns compiled once at server startup (RE2 engine with linear time guarantee)
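
A minimal sketch of the compile-once approach and a timing check against the 100ms budget follows. The `logMasker` type, its methods, and the `token<N>=` expressions are placeholders invented for this example; the PR's own benchmarks (such as BenchmarkLogMasker_MaskEntries_SLO) are not reproduced here.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// logMasker holds expressions that are compiled exactly once, at creation.
type logMasker struct {
	res []*regexp.Regexp
}

// newLogMasker compiles every expression up front and fails on the first
// invalid one, so no compilation happens on the query path.
func newLogMasker(exprs []string) (*logMasker, error) {
	m := &logMasker{}
	for _, e := range exprs {
		re, err := regexp.Compile(e)
		if err != nil {
			return nil, fmt.Errorf("compile %q: %w", e, err)
		}
		m.res = append(m.res, re)
	}
	return m, nil
}

// maskEntries applies every precompiled expression to every line.
func (m *logMasker) maskEntries(lines []string) []string {
	out := make([]string, len(lines))
	for i, l := range lines {
		for _, re := range m.res {
			l = re.ReplaceAllLiteralString(l, "[MASKED]")
		}
		out[i] = l
	}
	return out
}

func main() {
	// 20 placeholder patterns, the stated maximum.
	exprs := make([]string, 20)
	for i := range exprs {
		exprs[i] = fmt.Sprintf(`token%d=\S+`, i)
	}
	m, err := newLogMasker(exprs)
	if err != nil {
		panic(err)
	}
	// 100 synthetic entries, matching the SLO's entry count.
	lines := make([]string, 100)
	for i := range lines {
		lines[i] = fmt.Sprintf("request %d token%d=abc123 ok", i, i%20)
	}
	start := time.Now()
	masked := m.maskEntries(lines)
	elapsed := time.Since(start)
	fmt.Println(masked[0])
	fmt.Println("within 100ms:", elapsed < 100*time.Millisecond)
}
```

Go's `regexp` package implements RE2 semantics, so matching time grows linearly with input size, which is what makes a fixed latency budget like this one realistic to hold.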

Error Handling

  • Invalid builtin pattern identifier at startup → Warning log, skip invalid patterns
  • Pattern count exceeds limit → Warning log, apply first 20 patterns only
  • Masking failure at runtime → Return error, never return unmasked data
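
The three rules above could look roughly like this in code. Everything here is a sketch under assumptions: `filterPatterns`, `maskOrFail`, and `failingMask` are hypothetical names, and the order of operations (drop invalid identifiers first, then truncate to 20) is a guess, since the description does not say which happens first.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

const maxPatterns = 20

// knownPatterns lists the builtin identifiers from the PR description.
var knownPatterns = map[string]bool{
	"email": true, "phone": true, "credit_card": true, "ip_address": true,
	"mac_address": true, "api_key": true, "jwt_token": true,
}

// filterPatterns skips unknown identifiers with a warning, then keeps at
// most maxPatterns entries (assumed order of the two checks).
func filterPatterns(ids []string) []string {
	var valid []string
	for _, id := range ids {
		if !knownPatterns[id] {
			log.Printf("warning: skipping unknown masking pattern %q", id)
			continue
		}
		valid = append(valid, id)
	}
	if len(valid) > maxPatterns {
		log.Printf("warning: %d patterns given, using the first %d", len(valid), maxPatterns)
		valid = valid[:maxPatterns]
	}
	return valid
}

// maskOrFail fails closed: if any line cannot be masked, it returns an
// error and no lines at all, so unmasked data is never handed back.
func maskOrFail(lines []string, mask func(string) (string, error)) ([]string, error) {
	out := make([]string, 0, len(lines))
	for _, l := range lines {
		m, err := mask(l)
		if err != nil {
			return nil, fmt.Errorf("masking failed: %w", err)
		}
		out = append(out, m)
	}
	return out, nil
}

// failingMask simulates a runtime masking failure for demonstration.
func failingMask(string) (string, error) {
	return "", errors.New("simulated failure")
}

func main() {
	fmt.Println(filterPatterns([]string{"email", "bogus", "jwt_token"}))
	_, err := maskOrFail([]string{"secret line"}, failingMask)
	fmt.Println(err)
}
```

Returning `nil` for the whole batch on the first failure is deliberately strict: a partially masked result would be indistinguishable from a fully masked one to the caller.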

Backward Compatibility

  • Masking is disabled by default (LOKI_MASKING_ENABLED=false)
  • When disabled, behavior is identical to previous versions
  • No changes to query_loki_logs tool API (no new parameters added)

Breaking Changes

  • ComposedStdioContextFunc, ComposedSSEContextFunc, ComposedHTTPContextFunc signatures changed (added masker any parameter)
  • If using these functions directly from external code, pass nil as the second argument to maintain existing behavior

Test Plan

Unit Tests

  • Builtin pattern matching tests (positive/negative cases for each pattern)
  • LogMasker masking behavior tests
  • MaskingConfig validation tests
  • Pattern compilation and caching verification
  • Performance benchmarks (SLO compliance)

Integration Tests

  • Query execution with masking enabled
  • Backward compatibility with masking disabled
  • Context-based masker injection verification

Non-Goals (Intentionally Not Supported)

  • Custom regex patterns (for security and simplicity)
  • Tool-level masking parameter (AI cannot modify settings)
  • Local phone number formats (E.164 international format only)
  • Label value masking (log body only)

- Add MaskingConfig struct for log masking configuration
- Add MaskingPattern struct for custom regex patterns
- Support builtin pattern identifiers (email, phone, credit_card, etc.)
- Support custom regex patterns with optional replacements
- Add GlobalReplacement and HidePatternType options
- Add unit tests for JSON serialization and field validation
- Add builtin pattern ID constants (email, phone, credit_card, ip_address, mac_address, api_key, jwt_token)
- Register precompiled regex patterns in builtinPatterns map
- Add GetBuiltinPattern/IsValidBuiltinPattern helper functions
- Add comprehensive unit tests for each pattern matching
- Add ValidateMaskingConfig for pattern validation
  - Check pattern count limit (max 20)
  - Validate builtin pattern identifiers
  - Validate custom regex syntax
- Add LogMasker struct with compiled patterns
- Add NewLogMasker factory function
- Add MaskEntries method for applying masks to log entries
  - Support GlobalReplacement override
  - Support HidePatternType option
  - Disable back-references (literal replacement)
- Add comprehensive unit tests for:
  - Validation (valid/invalid configs, limits)
  - LogMasker creation and configuration
  - MaskEntries with builtin/custom patterns
  - Edge cases (empty, unicode, overlapping)
- Add Masking field to QueryLokiLogsParams struct
- Add applyMaskingToEntries helper function
- Validate masking config before log fetch (early failure)
- Apply masking after log retrieval, before returning
- Add comprehensive unit tests for integration
- Add TestQueryLokiLogs_WithMasking for masking functionality
- Add TestQueryLokiLogs_WithoutMasking for backward compatibility
- Add TestQueryLokiLogs_MaskingValidationError for error handling
- Verify builtin and custom pattern combinations work correctly
- Ensure validation errors prevent unmasked data leaks
- Add BenchmarkLogMasker_MaskEntries_100Entries for 100 entries with builtin patterns
- Add BenchmarkLogMasker_MaskEntries_20Patterns for max pattern limit (20)
- Add BenchmarkLogMasker_MaskEntries_SLO for SLO verification
- Add BenchmarkLogMasker_PatternCompilation for compilation time
- Add TestSLOCompliance to verify 100ms SLO (actual: ~1.5ms)
- Add TestPatternCompilationOnce to verify pattern reuse
- Add helper function generateTestLogEntries for realistic test data
- Remove redundant doc comments from types, functions, and variables
- Remove task reference comments from test files
- Remove Japanese comments from pattern definitions
- Keep section separator comments for readability
- Align comment style with existing codebase (prometheus.go)

CLAassistant commented Jan 16, 2026

CLA assistant check
All committers have signed the CLA.

- Eliminate outdated integration tests from loki_test.go and loki_masking_unit_test.go
- Streamline test cases to focus on essential functionality and validation
- Ensure backward compatibility and error handling are still covered in remaining tests
- Remove support for custom patterns in MaskingConfig for security and simplicity.
- Update validation logic to only consider builtin patterns.
- Refactor tests to align with the new configuration, focusing on builtin patterns only.
- Streamline JSON serialization and pattern handling in the masking implementation.
- Introduce LokiMaskingConfig struct to manage log masking settings.
- Add ExtractLokiMaskingFromEnv function to read configuration from environment variables.
- Implement validation and filtering methods for masking patterns.
- Create comprehensive unit tests for environment variable extraction and pattern validation.
- Enhance CLI integration for Loki masking configuration with appropriate flags.
…ted functions

- Add WithMasker and MaskerFromContext functions for managing LogMasker in context.
- Update Composed*ContextFunc functions to support optional LogMasker parameter.
- Enhance tests for LogMasker context functions to ensure proper retrieval and handling.
- Modify main application logic to integrate LogMasker into server context setup.
- Add tests for LogMasker context management, ensuring proper storage and retrieval.
- Implement tests for backward compatibility when no masker is set in context.
- Validate masking behavior when a masker is applied via context.
- Ensure log entries remain unchanged when no masker or a nil masker is present.
- Introduce new test cases for backward compatibility with nil masker in context.
- Validate log entry structure and masking behavior across various scenarios.
- Ensure proper handling of empty results and format validation for masked entries.
- Enhance tests for single and multiple masking patterns, confirming expected behavior.
- Introduce a new section on log masking in the README, outlining its purpose and configuration.
- Detail the command-line flags for enabling log masking and specifying masking patterns.
- Provide a table of supported masking patterns and an example of environment variable setup for users.
@Marie673 Marie673 marked this pull request as ready for review January 16, 2026 16:17
@Marie673 Marie673 requested a review from a team as a code owner January 16, 2026 16:17
Marie673 and others added 7 commits January 17, 2026 01:59
- Modify test cases in loki_test.go to replace 'container' label with 'job' label in LogQL queries.
- Ensure assertions reflect the change from container to job label values for consistency in testing.
…d masking function

- Simplify the lokiMaskingConfig struct by reordering fields for clarity.
- Remove the applyMaskingToEntries function as it is no longer needed.
- Ensure the codebase remains clean and maintainable by eliminating redundant code.
- Eliminate the TestApplyMaskingToEntries function from loki_masking_unit_test.go as it is no longer needed.
- Ensure the remaining tests continue to validate the masking functionality effectively.
- Modify test cases in loki_test.go to replace 'job' label with 'container' label in LogQL queries.
- Update assertions to reflect the change from job to container label values for consistency in testing.
- Ensure that tests validate the expected behavior with the new label context.
- Modify test cases in prometheus_test.go to incorporate label matching using the 'job' label.
- Update assertions to ensure the expected number of label names and values are returned.
- Enhance test reliability by directly specifying label filters in the test parameters.
- Restore the doc comment for the queryLokiLogs function. Reason: it was accidentally deleted in commit 971c482.
@gitdoluquita gitdoluquita self-requested a review January 23, 2026 13:14