@kyoungbinkim
Contributor

Change Description

Adds support for detecting MAC (Media Access Control) addresses as a new global PII entity type. This recognizer supports three industry-standard MAC address formats commonly used across network devices and documentation.

  • Colon-separated: 00:1A:2B:3C:4D:5E (common on Linux/Unix systems)
  • Hyphen-separated: 00-1A-2B-3C-4D-5E (IEEE 802 canonical notation, also used by Windows)
  • Cisco dot-separated: 0012.3456.789A (Cisco IOS format)
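
For reviewers, the three formats above can be matched roughly as follows. This is a minimal standalone sketch using only Python's `re` module, not the regex shipped in this PR; the names `MAC_PATTERN` and `find_mac_addresses` are hypothetical.

```python
import re

# Hypothetical pattern covering the three formats listed above.
# Each alternative fixes one separator, so mixed separators do not match.
MAC_PATTERN = re.compile(
    r"\b(?:"
    r"[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){5}"      # 00:1A:2B:3C:4D:5E
    r"|[0-9A-Fa-f]{2}(?:-[0-9A-Fa-f]{2}){5}"     # 00-1A-2B-3C-4D-5E
    r"|[0-9A-Fa-f]{4}(?:\.[0-9A-Fa-f]{4}){2}"    # 0012.3456.789A
    r")\b"
)


def find_mac_addresses(text):
    """Return every substring of `text` that looks like a MAC address."""
    return [m.group(0) for m in MAC_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = "eth0 is 00:1A:2B:3C:4D:5E, the switch port shows 0012.3456.789A"
    print(find_mac_addresses(sample))
```

Keeping one alternative per separator style is a deliberate choice here: a single character class such as `[:-]` would also accept strings that mix colons and hyphens.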

Changes Made

  • Added MacAddressRecognizer in presidio_analyzer/predefined_recognizers/generic/
  • Pattern-based detection with 0.6 base confidence score
  • Context-aware scoring using keywords: mac, mac address, hardware address, physical address, ethernet
  • Validation logic to reject invalid formats and broadcast addresses (FF:FF:FF:FF:FF:FF)
  • Registered recognizer in predefined_recognizers/__init__.py
  • Updated docs/supported_entities.md with new MAC_ADDRESS entity type
  • Comprehensive test suite with 15 test cases covering all formats and edge cases
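
The validation and context-aware scoring described above could look roughly like the sketch below. It is standalone Python and does not use Presidio's classes. The base score of 0.6, the keyword list, and the broadcast rejection come from this description; the boost value `CONTEXT_BOOST` and the function names are assumptions made for illustration, and the actual recognizer may behave differently.

```python
import re

BASE_SCORE = 0.6          # base confidence stated in this PR
CONTEXT_BOOST = 0.35      # assumed value, for illustration only
CONTEXT_WORDS = (
    "mac", "mac address", "hardware address", "physical address", "ethernet",
)
BROADCAST = "ffffffffffff"


def normalize_mac(candidate):
    """Strip separators and lowercase, leaving only the hex digits."""
    return re.sub(r"[:.\-]", "", candidate).lower()


def is_valid_mac(candidate):
    """Reject anything that is not 12 hex digits, and the broadcast address."""
    digits = normalize_mac(candidate)
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        return False
    return digits != BROADCAST


def score_candidate(candidate, surrounding_text):
    """Return 0.0 for invalid candidates, else the base score plus an
    optional boost when a context keyword appears as a whole word."""
    if not is_valid_mac(candidate):
        return 0.0
    text = surrounding_text.lower()
    has_context = any(
        re.search(r"\b" + re.escape(word) + r"\b", text)
        for word in CONTEXT_WORDS
    )
    boost = CONTEXT_BOOST if has_context else 0.0
    return round(min(1.0, BASE_SCORE + boost), 2)


if __name__ == "__main__":
    print(score_candidate("00:1A:2B:3C:4D:5E", "The MAC address of the router"))
    print(score_candidate("FF:FF:FF:FF:FF:FF", "broadcast frame"))
```

Matching keywords on word boundaries keeps words such as "machine" from counting as context for "mac".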

Checklist

  • I have reviewed the contribution guidelines
  • I have signed the CLA (if required)
  • My code includes unit tests
  • All unit tests and lint checks pass locally
  • My PR contains documentation updates / additions if required
