Why This Matters
The regex stack in util.py:86-125 (ENTITY_ID_PATTERN, ENTITY_ID_TEMPLATE_PATTERNS), together with extract_entities_from_template_regex (line 499), split_comma_separated_entity_ids (line 628), and extract_template_strings_from_config (line 651), is the engine behind every unknown_*_references repair in ectoplasms/automation/repairs/, ectoplasms/script/repairs/, ectoplasms/lovelace/repairs/, etc. These are pure(-ish) functions with rich edge cases (comma-separated lists, nested templates, states.domain.entity direct access, expand(...) filters, IGNORED_ENTITY_DOMAINS prefix matching). Pure functions give the highest test ROI: no hass fixture overhead, parametrized tables that cover dozens of cases in milliseconds, and immediate feedback on any future regex tweak (a recurring source of false-positive repairs; see commit 29fd59d).
Approach
Add tests/test_util.py with three parametrized test classes:
- (a) test_extract_entities_from_template_regex over a table of ~20 templates including {{ states('light.kitchen') }}, {{ state_attr('sensor.x', 'foo') }}, {{ expand('group.lights') }}, states.binary_sensor.door.state, multi-entity strings, malformed templates, and false-positive bait like service names.
- (b) test_split_comma_separated_entity_ids covering empty/None/single/multi/whitespace edge cases.
- (c) test_extract_template_strings_from_config over dicts, lists, nested structures, and non-string values.
For functions that take hass (e.g. async_filter_known_entity_ids), use a minimal hass fixture and seed er.async_get(hass) with known entities.
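A minimal sketch of the table-driven idea, using only the standard library so it runs on its own. Everything here is an assumption for illustration: `extract_entities`, `ENTITY_ID`, `QUOTED_ENTITY`, and `STATES_ACCESS` are hypothetical stand-ins, not the real `extract_entities_from_template_regex` or the patterns in util.py, whose signatures and matching rules may differ.

```python
import re

# Hypothetical stand-ins: the real patterns and extractor live in util.py
# and may behave differently. These exist only to make the table runnable.
ENTITY_ID = r"[a-z_]+\.[a-z0-9_]+"
QUOTED_ENTITY = re.compile(rf"""['"]({ENTITY_ID})['"]""")
STATES_ACCESS = re.compile(rf"\bstates\.({ENTITY_ID})")


def extract_entities(template: str) -> set[str]:
    """Collect quoted entity IDs and states.<domain>.<object_id> accesses."""
    found = set(QUOTED_ENTITY.findall(template))
    found.update(STATES_ACCESS.findall(template))
    return found


# One row per case: (template, expected entity IDs).
CASES = [
    ("{{ states('light.kitchen') }}", {"light.kitchen"}),
    ("{{ state_attr('sensor.x', 'foo') }}", {"sensor.x"}),
    ("{{ expand('group.lights') }}", {"group.lights"}),
    ("{{ states.binary_sensor.door.state }}", {"binary_sensor.door"}),
    (
        "{{ is_state('light.a', 'on') and is_state('light.b', 'on') }}",
        {"light.a", "light.b"},
    ),
    ("{{ states( }}", set()),            # malformed template
    ("service: light.turn_on", set()),   # false-positive bait: a service name
    ("", set()),
]


def failing_cases() -> list[str]:
    """Return the templates whose extracted set differs from the expectation."""
    return [tpl for tpl, expected in CASES if extract_entities(tpl) != expected]
```

In tests/test_util.py the same `CASES` table could be passed to `pytest.mark.parametrize("template, expected", CASES)` so that each row is reported as its own test. The expected values should come from the real function's agreed behavior, not from this stand-in.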
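Acceptance Criteria
- Parametrized cases cover extract_entities_from_template_regex, including known-false-positive bait
- The test for split_comma_separated_entity_ids covers all branches in the function
- IGNORED_ENTITY_DOMAINS filtering is asserted (no scene.* or group.* leaks)
- A test pins the behavior from commit 29fd59d ("Find unknown entities in heading badge") to prevent regression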
Risks & Caveats
These regexes are deliberately permissive so that they catch unknown entities, so be careful not to write tests that lock down behavior the maintainer actually wants to keep flexible. Pair with the maintainer on the table of cases. The KNOWN_DOMAINS list (util.py:84) is a hard-coded enum; tests should treat it as a fixture so that domain additions don't silently break them.
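One way to read "treat it as a fixture" is to generate test cases from the domain list instead of hard-coding domain names in the tests. The sketch below assumes that shape. `KNOWN_DOMAINS` here is a shortened hypothetical copy, and `build_entity_pattern`, `domain_cases`, and `unmatched_domains` are illustrative helpers that do not exist in util.py.

```python
import re

# Hypothetical stand-in for the hard-coded list at util.py:84; the real
# KNOWN_DOMAINS is longer and may be a different container type.
KNOWN_DOMAINS = ("light", "sensor", "binary_sensor", "switch")


def build_entity_pattern(domains) -> re.Pattern:
    """Assumed shape: an alternation over the known domains (longest first)."""
    ordered = sorted(domains, key=len, reverse=True)
    alternation = "|".join(re.escape(d) for d in ordered)
    return re.compile(rf"\b(?:{alternation})\.[a-z0-9_]+\b")


def domain_cases(domains):
    """Generate one (domain, template, expected entity ID) row per domain."""
    return [
        (d, f"{{{{ states('{d}.example') }}}}", f"{d}.example")
        for d in domains
    ]


def unmatched_domains(domains=KNOWN_DOMAINS) -> list[str]:
    """Return the domains whose generated template is not matched."""
    pattern = build_entity_pattern(domains)
    return [
        d
        for d, template, expected in domain_cases(domains)
        if expected not in pattern.findall(template)
    ]
```

Because the rows are derived from the list, adding a domain extends the coverage automatically and needs no test edits. In the real test file, the list would be imported from util.py, not copied.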
Scores
- Impact: ██████░░░░ 6/10
- Difficulty: ███░░░░░░░ 3/10
- Short-Term ROI: █████████░ 9/10
- Long-Term Value: ███████░░░ 7/10
Priority
Prototype First
Dependencies
#1257