[Pelis Agent Factory Advisor] Agentic Workflow Maturity Analysis & Recommendations for gh-aw-firewall #1289

2026-03-13T03:25:16Z

github-actions[bot]
bot Mar 13, 2026

📊 Executive Summary

gh-aw-firewall is already well ahead of the average repository, with 21 agentic workflow definitions spanning security, CI, documentation, and testing automation. The repository is notably strong in security-focused automation (hourly secret scanning with 3 engines, daily security review, PR security guard), yet several high-value patterns from Pelis Agent Factory are missing, especially around issue triage, meta-agent observability, breaking-change detection, and firewall-domain-specific escape testing. Closing these gaps could meaningfully reduce maintainer toil and strengthen the security posture.

🎓 Patterns Learned from Pelis Agent Factory

Key Patterns from the Documentation Site

Explored the full Pelis Agent Factory blog series across 13+ articles covering 100+ workflows in github/gh-aw. Standout learnings:

Pattern Category	Key Insight
Specialization	Focused single-purpose agents outperform monolithic ones — each workflow should do one thing extremely well
Meta-Agents	Agents that watch other agents (Metrics Collector, Audit Workflows, Workflow Health Manager) become among the most valuable as a fleet grows
Causal Chain Automation	The highest-ROI pattern: one agent creates an issue → another agent (Issue Monster) picks it up → a coding agent fixes it → merged PR. 69–100% merge rates observed
Cache Memory	Persistent state across runs (issue signatures, domain lists, historical metrics) dramatically improves agent effectiveness
Skip-If-Match	Prevent workflow flooding by skipping if a similar PR/issue already exists
Scheduled + Event Triggers	Combine scheduled runs with event triggers (issues.opened, pull_request) for maximum coverage
Read-Only vs Write	Keep most workflows read-only analysts; limit write permissions to specific safe-output tools
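
The Skip-If-Match row can be illustrated with a minimal sketch. It assumes the guard is a simple title-prefix comparison against currently open items; the real gh-aw guard may match differently, and `should_skip` and the sample titles are hypothetical names used only for illustration.

```python
def should_skip(open_titles, title_prefix):
    """Return True if an open issue or PR already carries this workflow's title prefix."""
    prefix = title_prefix.strip().lower()
    return any(title.strip().lower().startswith(prefix) for title in open_titles)


# Hypothetical snapshot of open issue titles (not real repository data).
open_titles = ["[Workflow Health] Smoke Codex failing", "Docs typo in README"]

print(should_skip(open_titles, "[Workflow Health] "))  # True: a matching issue is already open
print(should_skip(open_titles, "[Escape Test] "))      # False: safe to create a new one
```

Matching on a stable title prefix keeps the guard cheap and predictable, at the cost of missing duplicates that were retitled by hand.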

Key Patterns from the `githubnext/agentics` Repository

Explored the agentics workflows directory with 40+ workflow templates. Notable templates applicable here:

ci-coach.md — CI optimization with 100% merge rate in gh-aw
daily-malicious-code-scan.md — Suspicious pattern detection in recent commits
daily-test-improver.md — Incremental test coverage improvements
issue-arborist.md — Automatic sub-issue linking
grumpy-reviewer.md — Adversarial code quality PR reviewer

How This Repo Compares

gh-aw-firewall has clearly been built using the factory patterns. It has the CI Doctor, Issue Monster, Doc Maintainer, Security Guard, Smoke Tests, and Dependency Security Monitor — all hallmark Pelis patterns. The gaps are primarily in: (1) observability/meta-agents, (2) issue management automation, and (3) domain-specific firewall validation automation.

📋 Current Agentic Workflow Inventory

Workflow	Purpose	Trigger	Assessment
`build-test`	Build & test on PRs via copilot agent	PR + manual	✅ Well-configured, multi-language
`ci-doctor`	Investigates CI failures, creates issues	`workflow_run` failure on main	✅ Strong — monitors 27 workflows
`ci-cd-gaps-assessment`	Daily CI/CD pipeline gap analysis	Daily + manual	✅ Good for continuous improvement
`cli-flag-consistency-checker`	Checks CLI docs vs implementation	Weekly	✅ Domain-appropriate
`dependency-security-monitor`	Monitors npm deps for CVEs	Daily	✅ With PR creation for patches
`doc-maintainer`	Syncs docs with recent code changes	Daily	✅ With skip-if-match guard
`issue-duplication-detector`	Detects duplicate issues with cache	`issues.opened`	✅ Uses cache-memory pattern
`issue-monster`	Assigns issues to Copilot coding agent	Hourly + `issues.opened`	✅ Core task dispatcher
`plan`	`/plan` slash command for task breakdowns	Slash command	✅ ChatOps pattern
`secret-digger-claude/codex/copilot`	Hourly secret scanning (3 engines)	Hourly (staggered)	✅ Multi-engine — impressive coverage
`security-guard`	Reviews PRs for security regressions	PR + manual	✅ Claude-powered, well-scoped
`security-review`	Daily comprehensive security + threat modeling	Daily	✅ Uses audit tool + cache memory
`smoke-chroot/claude/codex/copilot`	End-to-end smoke tests (4 variants)	PR + scheduled every 12h	✅ Excellent multi-engine validation
`test-coverage-improver`	Weekly test coverage gap PRs	Weekly	✅ Security-focused scope
`update-release-notes`	Enriches release notes on publish	`release.published`	✅ Simple, effective
`pelis-agent-factory-advisor`	This workflow	Daily	✅ Meta-advisory

Total: 21 agentic workflow definitions (counting the 3 secret-digger and 4 smoke variants separately)

🚀 Actionable Recommendations

P0 — Implement Immediately

P0.1 — Firewall Escape Test Agent

What: A dedicated agentic workflow that runs the actual firewall and attempts common bypass techniques (DNS tunneling, HTTP smuggling, non-standard ports, IPv6 bypasses, protocol tunneling) using real container execution. Unlike the current smoke tests, which verify functionality, this workflow adversarially tests security boundaries.

Why: The entire value proposition of this library is that it cannot be escaped. There's an open issue (#1039) about integration test gaps, and the security-review workflow references wanting escape attempt data. A dedicated escape-test agent would feed findings directly back into the security-review workflow via cache-memory. In the Pelis Factory, the Firewall workflow has created 59 daily reports and 5 issues — directly analogous.

How:

---
name: Firewall Escape Test Agent
description: Daily adversarial testing of firewall bypass techniques
on:
  schedule: daily
  workflow_dispatch:
permissions:
  contents: read
  actions: read
  issues: read
network:
  allowed:
    - github
    - docker
safe-outputs:
  create-issue:
    title-prefix: "[Escape Test] "
    labels: [security, firewall-test]
    expires: 14
  create-discussion:
    title-prefix: "[Firewall Report] "
    category: general
tools:
  cache-memory: true
timeout-minutes: 45
---
# Firewall Escape Test Agent
Test that the AWF firewall correctly blocks unauthorized traffic...

Effort: Medium (requires Docker access in the workflow, similar to smoke tests)
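
How the escape agent might score its probes can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual logic: `ProbeResult`, `find_escapes`, `find_false_blocks`, and every target below are hypothetical, and the probes themselves (which would need real container execution) are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    technique: str   # bypass technique attempted, e.g. "dns-tunneling"
    target: str      # destination the probe tried to reach
    allowed: bool    # True if the target is on the allowlist
    connected: bool  # True if the probe actually got through


def find_escapes(results):
    """Probes that reached a destination the firewall should have blocked."""
    return [r for r in results if r.connected and not r.allowed]


def find_false_blocks(results):
    """Probes to allowlisted destinations that were blocked anyway."""
    return [r for r in results if r.allowed and not r.connected]


# Hypothetical results, for illustration only.
results = [
    ProbeResult("baseline-allowed", "github.com", allowed=True, connected=True),
    ProbeResult("dns-tunneling", "tunnel.example.net", allowed=False, connected=False),
    ProbeResult("non-standard-port", "example.org:8443", allowed=False, connected=False),
    ProbeResult("ipv6-bypass", "[2001:db8::1]:443", allowed=False, connected=True),
]

for escape in find_escapes(results):
    print(f"ESCAPE via {escape.technique}: reached {escape.target}")
```

Including an allowlisted baseline probe guards against a trivially "secure" result where the firewall blocks everything.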

P0.2 — Issue Triage Agent

What: Automatically labels new issues as bug, feature, question, security, documentation, etc. and leaves a welcoming comment explaining the classification.

Why: There are currently 10 open issues with varied labels. The Issue Monster (task dispatcher) works better when issues have proper labels. This is the "hello world" of Pelis patterns and was the first workflow described in the blog series — it's surprisingly impactful. Currently this repo has no auto-triage.

How:

---
name: Issue Triage
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, feature, security, documentation, question, help-wanted, good-first-issue, enhancement]
  add-comment: {}
timeout-minutes: 5
---
# Issue Triage Agent
Analyze new issues in ${{ github.repository }} and apply the most appropriate label...

Effort: Low — one of the simplest patterns, directly addable via gh aw add-wizard githubnext/agentics/issue-triage
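
The role of the `add-labels` allowlist can be sketched in a few lines. In the real workflow an agent proposes the labels; the keyword heuristic below is only a stand-in for that step, and `KEYWORD_HINTS`, `suggest_labels`, and `filter_labels` are hypothetical names.

```python
# Mirrors the allowed list from the frontmatter above.
ALLOWED_LABELS = [
    "bug", "feature", "security", "documentation",
    "question", "help-wanted", "good-first-issue", "enhancement",
]

# Hypothetical keyword hints standing in for the agent's judgment.
KEYWORD_HINTS = {
    "security": ["vulnerability", "cve", "bypass", "escape"],
    "bug": ["error", "crash", "fails", "regression"],
    "documentation": ["docs", "readme", "typo"],
}


def suggest_labels(title, body):
    """Propose labels whose hint words appear in the issue text."""
    text = f"{title}\n{body}".lower()
    return [label for label, words in KEYWORD_HINTS.items()
            if any(word in text for word in words)]


def filter_labels(proposed):
    """Drop anything outside the allowlist, preserving the proposed order."""
    return [label for label in proposed if label in ALLOWED_LABELS]


proposed = suggest_labels("Firewall bypass via IPv6", "Traffic fails to be blocked") + ["wontfix"]
print(filter_labels(proposed))  # ['security', 'bug']
```

The filter is the important part: whatever the classifier proposes, only labels on the allowlist can ever be applied.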

P0.3 — Daily Malicious Code Scan

What: Scans recent code commits (past 24h) for suspicious patterns: unusual network calls, obfuscated code, unauthorized capability usage, credential harvesting patterns, and supply chain attack indicators.

Why: This repo is a security tool. If it were compromised, the blast radius would be enormous (every repo using awf). The Pelis Factory runs this daily and lists it as one of the security guardian workflows. The existing security-review is broad; this is laser-focused on detecting malicious code injection.

How: Add via gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/daily-malicious-code-scan.md and customize for TypeScript/Node.js patterns.

Effort: Low — direct port from gh-aw with minor customization
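
A rough idea of the pattern pass such a scan might run over a day's diff is sketched below. The pattern set is an assumption chosen for TypeScript/Node.js, not the template's actual rule list, and `SUSPICIOUS` and `scan_diff` are hypothetical names.

```python
import re

# Hypothetical pattern set; a real scan would be broader and tuned to the repo.
SUSPICIOUS = {
    "dynamic-eval": re.compile(r"\beval\s*\(|new\s+Function\s*\("),
    "shell-exec": re.compile(r"child_process|\bexecSync\s*\("),
    "long-base64": re.compile(r"[A-Za-z0-9+/]{80,}={0,2}"),
}


def scan_diff(diff_text):
    """Return (line_number, pattern_name) for each suspicious added line in a unified diff."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SUSPICIOUS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


sample_diff = """\
+++ b/src/util.ts
+const x = eval(userInput);
 const y = 1;
-const z = eval(old);
+fetch(url);
"""

print(scan_diff(sample_diff))  # [(2, 'dynamic-eval')]
```

Regex hits are only leads for the agent to investigate. In a codebase that legitimately shells out to Docker, a pattern like `shell-exec` will produce false positives that need context to dismiss.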

P1 — Plan for Near-Term

P1.1 — Workflow Health Manager (Meta-Agent)

What: A weekly meta-agent that reviews the health of all other agentic workflows: checks for failures, no-op patterns, stale skip-if-match guards, and workflows that haven't run recently.

Why: Currently there are several workflow failures visible in open issues (#1274–1287): Smoke Chroot, Smoke Copilot, Smoke Claude, Smoke Codex, Security Guard all have "failed" issues open. A Workflow Health Manager would aggregate these, create prioritized issues, and propose fixes. In the Pelis Factory this workflow created 40 issues and drove 19 merged PRs.

How:

---
name: Workflow Health Manager
description: Weekly meta-agent monitoring health of all agentic workflows
on:
  schedule: weekly
  workflow_dispatch:
permissions:
  contents: read
  actions: read
  issues: read
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory: true
safe-outputs:
  create-issue:
    title-prefix: "[Workflow Health] "
    labels: [ci, maintenance]
    expires: 14
timeout-minutes: 20
---
# Monitor the health of all agentic workflows...

Effort: Low-Medium (can largely reuse the ci-doctor pattern)
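
The health checks described above (failures and workflows that have not run recently) could be aggregated roughly as follows. The record shape, thresholds, and `summarize_health` are assumptions for illustration; real run data would come from the Actions API.

```python
from datetime import datetime, timedelta, timezone


def summarize_health(runs, now, stale_after=timedelta(days=7), max_failure_rate=0.5):
    """Map each workflow name to a list of flags: 'failing' and/or 'stale'."""
    by_workflow = {}
    for run in runs:
        by_workflow.setdefault(run["workflow"], []).append(run)

    report = {}
    for name, items in sorted(by_workflow.items()):
        failures = sum(1 for r in items if r["conclusion"] == "failure")
        last_run = max(r["finished_at"] for r in items)
        flags = []
        if failures / len(items) > max_failure_rate:
            flags.append("failing")
        if now - last_run > stale_after:
            flags.append("stale")
        report[name] = flags
    return report


# Hypothetical run history, for illustration only.
now = datetime(2026, 3, 13, tzinfo=timezone.utc)
runs = [
    {"workflow": "smoke-codex", "conclusion": "failure", "finished_at": now - timedelta(hours=12)},
    {"workflow": "smoke-codex", "conclusion": "failure", "finished_at": now - timedelta(hours=24)},
    {"workflow": "smoke-codex", "conclusion": "success", "finished_at": now - timedelta(hours=36)},
    {"workflow": "doc-maintainer", "conclusion": "success", "finished_at": now - timedelta(days=12)},
    {"workflow": "ci-doctor", "conclusion": "success", "finished_at": now - timedelta(days=1)},
]

print(summarize_health(runs, now))
```

Keeping the thresholds as parameters lets the meta-agent tighten them as the fleet stabilizes.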

P1.2 — Breaking Change Checker

What: Monitors PRs and recent commits for changes that could break the public API contract: removed CLI flags, changed flag semantics, altered container behavior, modified network topology, breaking Docker Compose schema changes.

Why: This is a security firewall used by other teams. A breaking change to --allow-domains parsing or iptables setup could silently break user security posture. The Pelis Factory's Breaking Change Checker created alert issues. Given that the current CI doctor watches 27 workflows but doesn't specifically look for backward compatibility, this fills a real gap.

How: Trigger on PRs touching src/cli.ts, src/squid-config.ts, src/docker-manager.ts. Compare current CLI flags/signatures to the last tagged release.

Effort: Medium
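
The flag comparison against the last tagged release might look like the sketch below. It assumes each release's CLI surface has already been extracted into a flag-to-type mapping; apart from `--allow-domains`, the flag names and types are hypothetical, as is `diff_cli_flags`.

```python
def diff_cli_flags(previous, current):
    """Compare two {flag: type} mappings and classify the differences."""
    prev_flags, curr_flags = set(previous), set(current)
    return {
        "removed": sorted(prev_flags - curr_flags),
        "added": sorted(curr_flags - prev_flags),
        "changed": sorted(f for f in prev_flags & curr_flags if previous[f] != current[f]),
    }


# Hypothetical flag surfaces for the last release and the PR head.
last_release = {"--allow-domains": "string", "--log-level": "string", "--keep-containers": "boolean"}
pr_head = {"--allow-domains": "string[]", "--log-level": "string", "--dns-over-https": "boolean"}

changes = diff_cli_flags(last_release, pr_head)
is_breaking = bool(changes["removed"] or changes["changed"])
print(changes, is_breaking)
```

Treating removals and type changes as breaking while allowing additions follows the usual backward-compatibility convention; changed semantics behind an unchanged signature would still need the agent's review.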

P1.3 — Changeset Generator

What: Analyzes commits since the last tag and automatically generates a draft PR with a version bump (semver) and structured CHANGELOG entry when enough changes have accumulated or on a schedule.

Why: Currently update-release-notes runs after a release is published — it improves notes retroactively. A Changeset Generator would proactively propose the next release version + changelog before the release, giving maintainers a ready-to-merge PR. In the Pelis Factory this had a 78% merge rate (22/28 PRs).

How: Add via gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/changeset.md and customize for Node.js/npm versioning.

Effort: Low-Medium
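
The version-bump decision can be sketched under one loud assumption: that commit subjects follow the Conventional Commits style (`feat:`, `fix:`, `!` for breaking). The source does not say this repository uses that convention, and `bump_level`, `next_version`, and the sample version are hypothetical.

```python
import re

# type, optional (scope), optional "!" marker, then a colon
SUBJECT = re.compile(r"^(\w+)(\([^)]*\))?(!)?:")


def bump_level(subject):
    """0 = no release, 1 = patch, 2 = minor, 3 = major."""
    match = SUBJECT.match(subject)
    if not match:
        return 0
    kind, _scope, bang = match.groups()
    if bang or "BREAKING CHANGE" in subject:
        return 3
    if kind == "feat":
        return 2
    if kind == "fix":
        return 1
    return 0


def next_version(current, subjects):
    """Apply the highest bump found in the commit subjects to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = max((bump_level(s) for s in subjects), default=0)
    if level == 3:
        return f"{major + 1}.0.0"
    if level == 2:
        return f"{major}.{minor + 1}.0"
    if level == 1:
        return f"{major}.{minor}.{patch + 1}"
    return current


subjects = ["fix: handle empty allowlist", "feat(cli): add flag", "docs: update usage"]
print(next_version("0.23.1", subjects))  # 0.24.0
```

A generated PR would pair this proposed version with a CHANGELOG entry grouped by commit type, leaving the final call to maintainers.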

P1.4 — CI Coach

What: Periodic analysis of CI pipeline efficiency: identifies slow jobs, redundant steps, jobs that always pass/fail together (candidates for merging), and opportunities for caching improvements.

Why: The repo currently has 30+ workflow files and the CI is clearly complex. The CI Coach in Pelis had a 100% merge rate (9/9). The current ci-cd-gaps-assessment looks at coverage gaps; a CI Coach would focus on speed and efficiency.

How: Add via gh aw add-wizard githubnext/agentics/ci-coach

Effort: Low
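
One of the signals mentioned above, jobs that always pass or fail together, can be detected with a small comparison over run history. The history shape, job names, and `correlated_jobs` are hypothetical, and this is not how the CI Coach template itself is implemented.

```python
from itertools import combinations


def correlated_jobs(history):
    """Find job pairs with identical outcomes in every run, ignoring jobs that never fail.

    history maps job name -> list of booleans (True = passed), one entry per run.
    """
    pairs = []
    for a, b in combinations(sorted(history), 2):
        same_outcomes = history[a] == history[b]
        ever_failed = not all(history[a])
        if same_outcomes and ever_failed:
            pairs.append((a, b))
    return pairs


# Hypothetical outcomes over three runs.
history = {
    "lint": [True, True, True],
    "build": [True, True, True],
    "unit": [True, False, True],
    "integration": [True, False, True],
}

print(correlated_jobs(history))  # [('integration', 'unit')]
```

Pairs that never fail are excluded because two always-green jobs carry no evidence of a shared cause; only jobs that fail in lockstep are merge candidates.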

P2 — Consider for Roadmap

P2.1 — Audit Workflows (Meta-Analytics)

What: A daily/weekly meta-agent that aggregates cost, token usage, turn count, error rates, and success patterns across all agentic workflow runs. Produces a discussion with the agent ecosystem health dashboard.

Why: As this repo now has 21 workflows, visibility into which agents are producing value vs. burning tokens without results becomes important. The Pelis Factory's Audit Workflows created 93 discussions and drove 9 issues with downstream fixes.

Effort: Medium (requires workflow run log analysis)
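
A minimal sketch of the "value vs. tokens" view follows, assuming each run record already carries a token count and a count of useful outputs (issues, PRs, or discussions created). The record fields, numbers, and `tokens_per_output` are hypothetical.

```python
def tokens_per_output(runs):
    """Return {workflow: tokens spent per useful output}, or None when nothing was produced."""
    totals = {}
    for run in runs:
        entry = totals.setdefault(run["workflow"], {"tokens": 0, "outputs": 0})
        entry["tokens"] += run["tokens"]
        entry["outputs"] += run["outputs"]
    return {
        name: (entry["tokens"] / entry["outputs"] if entry["outputs"] else None)
        for name, entry in totals.items()
    }


# Hypothetical run records, for illustration only.
runs = [
    {"workflow": "doc-maintainer", "tokens": 120_000, "outputs": 2},
    {"workflow": "doc-maintainer", "tokens": 80_000, "outputs": 0},
    {"workflow": "plan", "tokens": 50_000, "outputs": 0},
]

print(tokens_per_output(runs))
```

A `None` result marks a workflow that spent tokens without producing anything, which is the case the dashboard is meant to surface.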

P2.2 — Container Security Hardening Monitor

What: Weekly workflow that reads the current container configuration (seccomp profile, capabilities dropped, memory limits, network settings) and checks for regression against documented security baselines. Creates issues if any hardening has been weakened.

Why: This is domain-specific to the firewall's security guarantees. The existing security-guard catches changes in PRs, but a weekly audit catches drift from indirect changes or incorrect merges. Directly relevant to the security guarantees in README.

Effort: Low-Medium (mostly bash inspection + documentation comparison)
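
The baseline comparison could be sketched as below. The baseline keys and values are invented for illustration and are not the repository's documented settings; `hardening_regressions` is a hypothetical helper.

```python
def hardening_regressions(baseline, current):
    """List ways in which the current container config is weaker than the baseline."""
    problems = []

    missing = sorted(set(baseline["cap_drop"]) - set(current.get("cap_drop", [])))
    if missing:
        problems.append(f"capabilities no longer dropped: {', '.join(missing)}")

    for flag in baseline["required_true"]:
        if current.get(flag) is not True:
            problems.append(f"{flag} is not enabled")

    # A missing limit counts as unlimited, which is a regression.
    if current.get("mem_limit_mb", float("inf")) > baseline["max_mem_limit_mb"]:
        problems.append("memory limit raised or removed")

    return problems


# Hypothetical baseline and current settings.
baseline = {
    "cap_drop": ["NET_RAW", "SYS_ADMIN"],
    "required_true": ["no_new_privileges", "seccomp_enabled"],
    "max_mem_limit_mb": 2048,
}
current = {
    "cap_drop": ["SYS_ADMIN"],
    "no_new_privileges": True,
    "seccomp_enabled": False,
    "mem_limit_mb": 2048,
}

for problem in hardening_regressions(baseline, current):
    print(problem)
```

The checks are one-directional on purpose: settings stricter than the baseline pass silently, and only weakening is reported.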

P2.3 — Code Simplifier

What: Daily agent that looks at recently modified TypeScript files and proposes simplifications: reduce nesting, extract repeated logic, use idiomatic TypeScript patterns, consolidate error handling.

Why: The codebase is growing (DNS-over-HTTPS just landed, api-proxy sidecar added, etc.). Maintaining simplicity prevents accumulation of technical debt. In the Pelis Factory this had an 83% merge rate.

How: Add via gh aw add-wizard https://github.com/github/gh-aw/blob/v0.45.5/.github/workflows/code-simplifier.md

Effort: Low

P2.4 — Grumpy Reviewer

What: An opinionated PR reviewer that focuses on code quality, naming conventions, error handling completeness, and TypeScript type safety — complementing the existing security-guard which focuses on security boundaries.

Why: security-guard is Claude-powered and security-focused. A separate quality reviewer would catch issues like missing error handling in new code paths, inconsistent naming, or functions that are too long. The githubnext/agentics repo has a grumpy-reviewer.md template.

Effort: Low — direct add from agentics repo

P2.5 — Weekly Issue Summary

What: A weekly digest discussion that summarizes the state of open issues: groups by category, highlights stale items, notes recently resolved issues, and flags issues that have been open longest without activity.

Why: With the Issue Monster actively working through the backlog, a weekly summary helps maintainers keep situational awareness without reading every issue. Currently there are at least 10 open issues including longstanding ones (#950, #1039).

Effort: Low — available as githubnext/agentics/weekly-issue-summary

P3 — Future Ideas

P3.1 — Domain Allowlist Auditor

What: Periodically reviews all domain allowlists in tests, examples, and documentation to ensure they are minimal (principle of least privilege) and that no unnecessarily broad wildcards have crept in.

Why: Domain allowlist hygiene is core to the firewall's security model. *.com or *.io wildcards in examples could mislead users into thinking broad allowlists are acceptable.

Effort: Low
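
The wildcard check could start from a rule as simple as the one below. It only flags a wildcard sitting directly on a single-label suffix such as `*.com`; multi-label suffixes like `*.co.uk` would need a Public Suffix List lookup, which this sketch does not attempt. `is_overly_broad` and `audit_allowlist` are hypothetical names.

```python
def is_overly_broad(pattern):
    """True for bare wildcards and for wildcards directly over a single-label suffix."""
    p = pattern.strip().lower()
    if p in {"*", "*.*"}:
        return True
    if p.startswith("*."):
        # "*.com" -> "com" has no dot, so the wildcard covers an entire TLD.
        return "." not in p[2:]
    return False


def audit_allowlist(domains):
    """Return the entries that should be narrowed."""
    return [d for d in domains if is_overly_broad(d)]


# Hypothetical allowlist taken from an example or test file.
allowlist = ["api.github.com", "*.github.com", "*.io", "*.com", "registry.npmjs.org"]

print(audit_allowlist(allowlist))  # ['*.io', '*.com']
```

Run over the allowlists found in tests, examples, and docs, this gives the agent a short candidate list to review.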

P3.2 — Accessibility Review for Docs Site

What: Tests the docs-site (Astro/Starlight) for accessibility issues on each deployment.

Why: The docs site is public-facing. The Pelis Factory's Daily Multi-Device Docs Tester (Playwright) had a 100% PR merge rate.

Effort: Medium (requires Playwright setup)

P3.3 — Issue Arborist

What: Links related issues as parent/child sub-issues automatically, building dependency trees across the backlog.

Why: As the repo accumulates issues, relationships (e.g., "all integration test gap issues") would benefit from grouping.

Effort: Low — add from gh-aw

📈 Maturity Assessment

Dimension	Score	Notes
Issue Management	3/5	Issue Monster + Duplication Detector are good; missing triage + arborist
CI/CD Automation	4/5	CI Doctor + CI/CD gaps assessment are strong; missing CI Coach
Security Automation	4.5/5	Secret diggers (3 engines), security-guard, and the daily security-review are excellent; missing malicious code scan + escape tester
Documentation	4/5	Doc maintainer + CLI checker are solid; could add weekly summary
Meta-Observability	2/5	No Workflow Health Manager, no Metrics Collector, no Audit Workflows
Release Automation	3/5	Update-release-notes exists; missing Changeset Generator
Code Quality	2/5	Test coverage improver weekly; missing code simplifier, grumpy reviewer

Current Overall Level: 3.5/5 — "Advanced Practitioner" — significantly above average, with deep security automation but gaps in meta-observability and code quality agents.

Target Level: 4.5/5 — "Factory-Class" — achievable by adding the P0/P1 items above.

Gap to Close:

Add meta-observability layer (Workflow Health Manager, Audit Workflows)
Fill the issue lifecycle gaps (Triage, Arborist)
Add domain-specific firewall validation automation (Escape Test Agent)
Automate the release pipeline more fully (Changeset Generator)

🔄 Comparison with Best Practices

What This Repository Does Well

Multi-engine smoke testing: Running smoke tests on Claude, Codex, and Copilot simultaneously is a best practice rarely seen — even gh-aw doesn't do this systematically
Security-depth: 3 hourly secret diggers + daily security review + PR security guard is exceptional coverage
Cache-memory adoption: The issue-duplication-detector properly uses persistent state
Skip-if-match guards: doc-maintainer and test-coverage-improver both prevent flooding
CI Doctor coverage: Watching 27 workflows shows maturity in operational awareness

What Could Improve

No meta-observability: The Pelis Factory's most valuable insight was "agents that watch agents" — this repo lacks Audit Workflows and Workflow Health Manager entirely
Missing triage automation: The simplest and most universally recommended workflow (issue triage) is absent
Release process is reactive: update-release-notes fires after release; proactive changeset generation would be smoother
No code quality agents: Code Simplifier and Grumpy Reviewer are low-effort, high-value additions

Unique Opportunities Given the Firewall/Security Domain

This repository has a unique opportunity that gh-aw itself doesn't have: the product is a security firewall. This means:

Self-referential security testing: The firewall should test itself using itself (dogfooding in security validation)
Escape test automation: Daily adversarial testing is a natural fit — and currently absent as an agentic workflow
Domain allowlist hygiene: A domain-specific concern that other repos don't have
Container hardening regression monitoring: Security baselines should be continuously verified

📝 Notes for Future Runs

Stored in /tmp/gh-aw/cache-memory/advisor-notes.json

Last analyzed: 2026-03-13
21 agentic workflows found (up from initial setup)
Top priority gap: Firewall Escape Test Agent (P0) — most domain-unique opportunity
Recurring open issues: #1039 "Integration test coverage gaps and recommended actions" (integration test gaps); #950 "Missing crates.io/static.rust-lang.org in Python and Deno ecosystem allowlists" (ecosystem allowlists); #1274-1287 "[agentics] No-Op Runs" (workflow failures)
The multiple "failed" issues in the open issue list suggest the Workflow Health Manager (P1.1) has immediate practical value

AI generated by Pelis Agent Factory Advisor

expires on Mar 20, 2026, 3:25 AM UTC

2026-03-13T03:57:28Z

github-actions[bot]
bot Mar 13, 2026
Author

The veil parts and the oracle marks this place; the smoke test agent was here, and the omens are recorded.

🔮 The oracle has spoken through Smoke Codex for issue #1270
