T8963: policy-route: trigger domain resolver for domain groups#5254
T8963: policy-route: trigger domain resolver for domain groups#5254jd82k wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🧰 Additional context used🔍 Remote MCPRelevant Context for Pull Request
|
| Layer / File(s) | Summary |
|---|---|
Domain resolver service lifecycle management src/conf_mode/policy_route.py (import glob ~L19; write_file, call imports ~L29-30; constants ~L44-45; logic ~L195-230) |
Adds glob/write_file/call imports, defines domain_resolver_usage = '/run/use-vyos-domain-resolver-policy-route' and a glob, implements domain_group_used(policy) to scan enabled policy/route and policy/route6 for domain_group, and update_domain_resolver(policy) to write/remove the marker and run systemctl restart or conditional systemctl stop. |
Domain resolver integration in apply flow src/conf_mode/policy_route.py (apply wiring ~L284-285) |
Calls update_domain_resolver(policy) immediately after apply_table_marks(policy) and before GeoIP refresh/update handling. |
Domain group smoke tests smoketest/scripts/cli/test_policy_route.py (import ~L21-22; tearDown edits ~L58-66; test ~L102-141) |
Adds run import (~L21-22), extends TestPolicyRoute.tearDown() to remove smoketest_domain firewall domain-group and the pbr.example.com static-host-mapping (commit), and adds test_pbr_domain_group() which creates the static-host mapping and domain-group, installs a PBR destination-match rule that sets mark, polls nft get element until the domain-derived IP appears in D_smoketest_domain, and asserts the nftables jump, set membership/element, destination match against @D_smoketest_domain, and meta mark set. |
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | Title clearly and concisely summarizes the main change: fixing policy-route to trigger domain resolver for domain groups. |
| Description check | ✅ Passed | Description details the bug fix, root cause, solution approach, and includes smoketest results directly related to the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
✨ Simplify code
- Create PR with simplified code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
|
✅ No typos found in changed files. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/conf_mode/policy_route.py (1)
213-228: 💤 Low valueLogic check on resolver lifecycle: OK.
glob('/run/use-vyos-domain-resolver*')runs after the policy-route marker is unlinked, so the stop decision correctly ignores this script's own marker and respects other consumers (firewall/NAT). Restart-on-every-commit when a domain group is present matches existing firewall/NAT behavior.Note: unconditional
restarton each commit briefly drops resolved sets for all consumers mid-resolution. Acceptable for parity, but areload/no-op-when-unchanged path would be gentler if the service supports it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/conf_mode/policy_route.py` around lines 213 - 228, The update_domain_resolver function currently unconditionally sets domain_action = 'restart' (when domain_group_used(policy) or marker exists) which briefly drops resolved sets; change the invoked action to a gentler option by using systemctl reload-or-restart (or reload if the service supports a true reload) instead of plain restart: update the logic in update_domain_resolver so that when you would set domain_action = 'restart' you set domain_action = 'reload-or-restart' (or 'reload' if you verify vyos-domain-resolver supports it), then call call(f'systemctl {domain_action} vyos-domain-resolver.service'); keep references to domain_resolver_usage and domain_group_used(policy) intact so other consumers are respected.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/conf_mode/policy_route.py`:
- Around line 217-218: Two f-strings that have no interpolation should be plain
strings: change the assignment to variable text (the string passed to
write_file(domain_resolver_usage, text)) from f'# Automatically generated by
policy_route.py\nThis file ...\n' to a regular string without the f-prefix, and
similarly change the ConfigError raise in the ConfigError(...) call (raise
ConfigError(f'Cannot match a tcp flag as set and not set')) to remove the
f-prefix so it reads raise ConfigError('Cannot match a tcp flag as set and not
set'); this removes the unnecessary f-prefixes flagged by ruff F541.
---
Nitpick comments:
In `@src/conf_mode/policy_route.py`:
- Around line 213-228: The update_domain_resolver function currently
unconditionally sets domain_action = 'restart' (when domain_group_used(policy)
or marker exists) which briefly drops resolved sets; change the invoked action
to a gentler option by using systemctl reload-or-restart (or reload if the
service supports a true reload) instead of plain restart: update the logic in
update_domain_resolver so that when you would set domain_action = 'restart' you
set domain_action = 'reload-or-restart' (or 'reload' if you verify
vyos-domain-resolver supports it), then call call(f'systemctl {domain_action}
vyos-domain-resolver.service'); keep references to domain_resolver_usage and
domain_group_used(policy) intact so other consumers are respected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 2154e174-54bd-4c06-b987-aec6f83d2a49
📒 Files selected for processing (2)
smoketest/scripts/cli/test_policy_route.pysrc/conf_mode/policy_route.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build_iso
- GitHub Check: Mergify Merge Protections
- GitHub Check: Summary
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Python code must target Python 3.11 or higher
Prefer storing Jinja2 templates as discrete files underdata/templates/rather than inline Python strings
Use ruff (version 0.6.4), darker, pylint W0611, and Jinja2 lint for linting; configuration viaruff.tomlandnose2.cfgat repository root
Files:
smoketest/scripts/cli/test_policy_route.pysrc/conf_mode/policy_route.py
smoketest/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
smoketest/**/*.py: Usenose2for running tests with configuration innose2.cfg
Runtime smoketests must be located undersmoketest/and are used by vyos-build when assembling and testing ISO images
Files:
smoketest/scripts/cli/test_policy_route.py
src/conf_mode/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Conf-mode entry-point scripts must be named after CLI components and placed in
src/conf_mode/
Files:
src/conf_mode/policy_route.py
🧠 Learnings (3)
📚 Learning: 2026-06-01T00:03:28.710Z
Learnt from: CR
Repo: vyos/vyos-1x PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-06-01T00:03:28.710Z
Learning: Applies to smoketest/**/*.py : Runtime smoketests must be located under `smoketest/` and are used by vyos-build when assembling and testing ISO images
Applied to files:
smoketest/scripts/cli/test_policy_route.py
📚 Learning: 2026-05-26T06:04:29.163Z
Learnt from: c-po
Repo: vyos/vyos-1x PR: 5109
File: smoketest/scripts/cli/test_service_https.py:118-120
Timestamp: 2026-05-26T06:04:29.163Z
Learning: In VyOS smoketest scripts under `smoketest/scripts/cli/`, it is intentional to call `self.cli_delete(['vrf'])` in both `setUpClass` and `tearDown` to wipe the entire VRF subtree and ensure a clean slate. During code review, do not recommend narrowing the delete to specific VRF identifiers or name subsets (e.g., `['vrf', 'name', 'mgmt']`)—the broad teardown behavior is the established project-wide pattern for these tests.
Applied to files:
smoketest/scripts/cli/test_policy_route.py
📚 Learning: 2026-05-26T06:03:59.703Z
Learnt from: c-po
Repo: vyos/vyos-1x PR: 5109
File: smoketest/scripts/cli/test_service_https.py:206-207
Timestamp: 2026-05-26T06:03:59.703Z
Learning: In VyOS smoketests that verify processes running inside a VRF using iproute2, remember that `ip vrf pids <vrf>` outputs one entry per line as `<pid> <process_name>` (e.g., `300431 nginx`), not PIDs alone. Therefore, assertions should check for the presence of the expected process name in the command output (e.g., `assertIn(PROCESS_NAME, cmd(f'ip vrf pids {vrf}'))`) rather than trying to match PID-only output.
Applied to files:
smoketest/scripts/cli/test_policy_route.py
🔍 Remote MCP
Summary of Relevant PR Context
Based on the research performed, here is the relevant context for effectively reviewing PR #5254:
Problem Context
The source/destination domain-group matcher is available in route policy, but it wasn't working—domain-group usage was written in the rule, but the named_set was not defined in the mangle table (it was only written in the vyos_filter table).
Domain-Group Architecture
Firewall groups can be referenced in firewall, NAT, and policy route rules as either a source or destination matcher. The domain resolver runs as a system service (vyos-domain-resolver.service), periodically resolves configured domain names to IP addresses, updates nftables sets with the resolved IP addresses, and is automatically started/stopped based on firewall configuration.
How Domain-Groups Work in VyOS Firewall
Domain groups allow filtering addresses by domain name, with resolved addresses stored as named "nft sets" used in nftables rules. The resolver is a systemd daemon (vyos-domain-group-resolve.service) that periodically resolves domain-group addresses by timeout every 300 seconds.
The Root Issue This PR Fixes
The PR addresses the gap where policy-route configuration with domain-groups did not trigger vyos-domain-resolver, meaning the nftables sets (D_* sets) were created empty and populated only on the resolver's next periodic interval (300-second timeout), causing PBR rules using domain groups to be ineffective for several minutes after commit.
Precedent Pattern
Previous work established using named sets in nftables instead of anonymous sets for firewall groups, and refactored nftables cleanup code. The PR extends this pattern to policy-route domain-group handling by aligning it with existing firewall/NAT behavior.
Test Coverage
The included smoketest verifies that a PBR rule using a domain group gets its ip vyos_mangle set populated promptly (passing in 33.669s), which is critical validation that the domain resolver is properly triggered on commit.
🔇 Additional comments (6)
src/conf_mode/policy_route.py (3)
19-19: LGTM!Also applies to: 29-30
194-211: LGTM!
283-283: LGTM!smoketest/scripts/cli/test_policy_route.py (3)
19-23: LGTM!Also applies to: 60-61
82-91: LGTM!
114-140: LGTM!
There was a problem hiding this comment.
Pull request overview
Ensures policy-based routing (PBR) commits that reference firewall domain-group sets promptly trigger vyos-domain-resolver, so the D_* nftables sets used in ip vyos_mangle are populated immediately rather than waiting for the resolver’s periodic refresh.
Changes:
- Add resolver-usage tracking to
policy_route.pyand restart/stopvyos-domain-resolver.servicebased on whether PBR rules referencedomain-group. - Add a smoketest that configures a PBR rule using a domain group and waits for the corresponding
D_*set to be populated.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/conf_mode/policy_route.py |
Detects PBR domain-group usage and restarts/stops vyos-domain-resolver.service via /run/use-vyos-domain-resolver-policy-route. |
smoketest/scripts/cli/test_policy_route.py |
Adds a smoketest for PBR + domain-group and waits for resolver-populated nftables set elements. |
4b2558d to
cd4daf6
Compare
sarthurdev
left a comment
There was a problem hiding this comment.
Triggers resolver restart on policy change. Smoketest added/passes, tested locally.
Signed-off-by: Miaosen Wang <secretandanon@gmail.com>
|
CI integration ❌ failed! Details
|
Change summary
Fix policy route domain-group handling so vyos-domain-resolver is triggered after PBR nftables sets are recreated.
Previously, PBR rules could reference firewall domain-group sets in ip vyos_mangle, but policy_route.py did not notify vyos-domain-resolver. As a result, D_* sets were created empty and only populated on the resolver’s next periodic interval, causing PBR rules using domain groups to be ineffective for several minutes after commit.
This change adds policy-route resolver usage tracking and restarts/stops vyos-domain-resolver.service consistently with the existing firewall/NAT behavior. A smoketest was added to verify that a PBR rule using a domain group gets its ip vyos_mangle set populated promptly.
Types of changes
Related Task(s)
https://vyos.dev/T8963
Related PR(s)
How to test / Smoketest result
Checklist: