# PyGate

Python CI failures are noisy, tool-specific, and expensive to triage when Ruff, Pyright, and pytest all disagree about what matters first. PyGate turns those failures into one deterministic Python quality gate with bounded auto-repair and structured escalation evidence for humans or agents.
- "The PR failed, but I have to dig through Ruff, Pyright, and pytest separately to understand why."
- "We want fail-fast Python CI, not another lint dashboard."
- "Auto-fix should stop when it stops helping instead of thrashing the repo."
- "If repair cannot finish the job, I want a clean escalation artifact instead of a pile of logs."
```bash
pip install pygate-ci
pygate summarize --input demo/artifacts/failures.json
```

```json
{
  "brief_json_path": ".pygate/agent-brief.json",
  "brief_md_path": ".pygate/agent-brief.md",
  "status": "fail"
}
```

## When To Use It
Use PyGate when you want one deterministic Python CI gate that can normalize Ruff, Pyright, and pytest output, attempt bounded deterministic repair, and escalate with machine-readable evidence when it cannot finish safely.
## When Not To Use It
Do not use PyGate as a generic lint aggregator, a semantic code fixer, or a replacement for the underlying tools. It is a fail-fast gate-and-escalate wrapper around them.
- PyPI package: `pygate-ci`
- CLI command: `pygate`
- Quick Start
- What It Does
- Commands
- Artifacts
- Repair Loop
- Configuration
- GitHub Action
- Limitations
- Roadmap
- Contributing
- License
## Quick Start

```bash
# Prerequisites: ruff, pyright, and pytest must be available in your environment
pip install pygate-ci

# Run quality gates on changed files
echo "src/app.py" > changed.txt
pygate run --mode canary --changed-files changed.txt

# Generate agent brief from failures
pygate summarize --input .pygate/failures.json

# Attempt bounded repair
pygate repair --input .pygate/failures.json --max-attempts 3
```

Note: The PyPI package is `pygate-ci` but the CLI command is `pygate`.
## What It Does

PyGate runs deterministic quality gates on your Python project and produces structured, machine-readable artifacts designed for both humans and AI agents.
| Gate | Tool | Canary | Full |
|---|---|---|---|
| lint | ruff | yes | yes |
| typecheck | pyright | yes | yes |
| test | pytest | configurable | yes |
```
Changed Files ──> Run Gates ──> Findings? ──No──> Pass
                                    |
                                   Yes
                                    |
                                    v
                               Repair Loop ──> Improved? ──Fixed──> Pass
                                                   |
                                                   No
                                                   |
                                                   v
                                       Escalate with Evidence
```
- You tell PyGate which files changed (from your CI diff, PR, etc.)
- It runs lint, typecheck, and optionally tests
- Findings are normalized into a unified schema with severity, rule codes, and evidence (see the sketch below)
- The repair loop applies safe deterministic fixes (`ruff --fix` + `ruff format`)
- If it can't fix everything, it escalates with structured evidence explaining why
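For a sense of what "normalized" means, a single lint finding might look like the record below. The field names here are assumptions for illustration (the authoritative shapes live in `schemas/`); the point is that ruff, pyright, and pytest output all reduce to one shape:

```python
# Illustrative normalized finding. Field names are assumptions,
# not the authoritative schema; see schemas/ for the real shapes.
finding = {
    "gate": "lint",                 # lint | typecheck | test
    "tool": "ruff",
    "rule": "F401",                 # tool-native rule code
    "severity": "error",
    "file": "src/app.py",
    "line": 3,
    "evidence": "'os' imported but unused",
}
```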
## Commands

```bash
pygate run --mode canary|full --changed-files <path>
pygate summarize --input .pygate/failures.json
pygate repair --input .pygate/failures.json [--max-attempts N]
```

Exit codes: `0` = pass, `1` = fail (`run`), `2` = escalated (`repair`).
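Because the exit codes are stable, a thin wrapper can drive the whole run, repair, and summarize flow. A minimal sketch in Python, using only the commands and exit codes documented above (the orchestration itself is ours, not part of PyGate):

```python
import subprocess
import sys

def pygate(*args: str) -> int:
    """Invoke the pygate CLI and return its exit code."""
    return subprocess.run(["pygate", *args]).returncode

# 1. Run the canary gates on the changed files.
status = pygate("run", "--mode", "canary", "--changed-files", "changed.txt")
if status == 0:
    sys.exit(0)                 # all gates passed

# 2. Gates failed (exit code 1): attempt bounded repair.
status = pygate("repair", "--input", ".pygate/failures.json", "--max-attempts", "3")
if status == 2:
    # 3. Repair escalated: emit the agent brief for a human or agent.
    pygate("summarize", "--input", ".pygate/failures.json")
sys.exit(status)
```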
## Artifacts

All artifacts are written to `.pygate/`:

| File | Description |
|---|---|
| `failures.json` | Structured findings with severity, rule codes, and evidence |
| `run-metadata.json` | Gate execution traces (commands, stdout, stderr, durations) |
| `agent-brief.json` | Priority actions and retry policy for AI agents |
| `agent-brief.md` | Human-readable summary |
| `repair-report.json` | Repair attempt history (on success) |
| `escalation.json` | Escalation reason and evidence (on failure) |

JSON Schema files for all artifact types are available in `schemas/` for downstream validation and code generation. See `demo/artifacts/` for sample output.
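For example, an agent harness might read the brief and act on its priority actions. A sketch under assumed field names (`status` and `priority_actions` are guesses; validate against the schemas in `schemas/` before relying on them):

```python
import json
from pathlib import Path

# NOTE: field names below are illustrative assumptions, not the
# authoritative schema. Validate against schemas/ first.
brief = json.loads(Path(".pygate/agent-brief.json").read_text())

if brief.get("status") == "fail":
    for action in brief.get("priority_actions", []):
        print("next:", action)
```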
## Repair Loop

The `repair` command runs a bounded deterministic repair loop:

- Backup workspace
- Fix via `ruff check --fix` + `ruff format` on scoped files
- Re-run gates to measure improvement
- Decide: pass (done), worsened (rollback), no improvement (escalate)
Repair is bounded by the following defaults:

| Parameter | Default |
|---|---|
| Max attempts | 3 |
| Max patch lines | 150 |
| No-improvement abort | 2 consecutive |
| Time cap | 20 minutes |
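In code terms, the policy above amounts to a budgeted fix-measure-decide loop. The sketch below is a simplified illustration, not PyGate's implementation: it measures only ruff findings, and it omits the workspace backup, rollback on worsening, and the patch-line budget:

```python
import json
import subprocess
import time

MAX_ATTEMPTS = 3            # documented default
NO_IMPROVEMENT_ABORT = 2    # consecutive attempts without reduction
TIME_CAP_SECONDS = 1200     # 20-minute cap

def count_findings(paths):
    """Count remaining ruff findings; stands in for re-running all gates."""
    result = subprocess.run(
        ["ruff", "check", "--output-format", "json", *paths],
        capture_output=True, text=True,
    )
    return len(json.loads(result.stdout or "[]"))

def apply_fixes(paths):
    """The deterministic fix step: ruff check --fix + ruff format."""
    subprocess.run(["ruff", "check", "--fix", *paths])
    subprocess.run(["ruff", "format", *paths])

def bounded_repair(paths):
    baseline = count_findings(paths)
    stalled = 0
    deadline = time.monotonic() + TIME_CAP_SECONDS

    for _ in range(MAX_ATTEMPTS):
        if time.monotonic() > deadline:
            return "escalate"           # time cap exhausted
        apply_fixes(paths)
        remaining = count_findings(paths)
        if remaining == 0:
            return "pass"
        if remaining >= baseline:       # no finding reduction this attempt
            stalled += 1
            if stalled >= NO_IMPROVEMENT_ABORT:
                return "escalate"       # NO_IMPROVEMENT
        else:
            baseline, stalled = remaining, 0
    return "escalate"                   # UNKNOWN_BLOCKER: attempts exhausted
```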
When the loop cannot finish, `escalation.json` records one of these codes:

| Code | Meaning |
|---|---|
| `NO_IMPROVEMENT` | 2+ consecutive attempts with no finding reduction |
| `PATCH_BUDGET_EXCEEDED` | Edit exceeded line budget |
| `UNKNOWN_BLOCKER` | Max attempts exhausted |
| `UNRESOLVED_DETERMINISTIC_FAILURES` | Deterministic failures remain after repair |
| `ARCHITECTURAL_CHANGE_REQUIRED` | Structural issues beyond repair scope (reserved) |
| `FLAKY_EVALUATOR` | Gate produces inconsistent results (reserved) |
| `ENVIRONMENT_DRIFT` | Python version or dependency mismatch (reserved) |
| `TEST_FIXTURE_OR_EXTERNAL_DEP` | Tests depend on network, DB, or time (reserved) |
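Because the codes are stable strings, a CI wrapper can branch on them instead of re-parsing logs. A hedged sketch that assumes `escalation.json` exposes the code at a top-level `"code"` key (check `schemas/` for the real field name):

```python
import json
from pathlib import Path

# Assumption: the escalation code lives at a top-level "code" key.
escalation = json.loads(Path(".pygate/escalation.json").read_text())
code = escalation.get("code")

if code in {"FLAKY_EVALUATOR", "ENVIRONMENT_DRIFT"}:
    print("infra problem: notify the CI owners, do not retry blindly")
elif code == "TEST_FIXTURE_OR_EXTERNAL_DEP":
    print("tests need isolation work: route to the owning team")
else:
    print("hand the agent brief to a human or agent:", code)
```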
## Configuration

Configure via `pygate.toml` (standalone) or `[tool.pygate]` in `pyproject.toml`.

`pygate.toml`:

```toml
[policy]
max_attempts = 3
max_patch_lines = 150
abort_on_no_improvement = 2
time_cap_seconds = 1200

[commands]
lint = "ruff check --output-format json ."
typecheck = "pyright --outputjson ."
test = "pytest --json-report --json-report-file=.pygate/pytest-report.json -q"

[gates]
test_in_canary = false
```

Or in `pyproject.toml`:
```toml
[tool.pygate.policy]
max_attempts = 3

[tool.pygate.commands]
lint = "ruff check --output-format json ."

[tool.pygate.gates]
test_in_canary = false
```

## GitHub Action

PyGate ships with a composite GitHub Action for CI integration:
```yaml
- uses: actions/checkout@v4
- uses: hermes-labs-ai/quick-gate-python/.github/actions/pygate@main
  with:
    mode: canary            # or "full"
    repair: "true"          # attempt auto-repair on failures
    max-attempts: 3
    python-version: "3.12"
```

The action detects changed files from the PR, runs gates, optionally repairs, and uploads `.pygate/` artifacts. The post-comment feature requires `pull-requests: write` permission in your workflow.
## Limitations

- **Deterministic repair only (v1)**: The repair loop uses `ruff --fix` and `ruff format`. It cannot fix type errors, failing tests, or issues requiring semantic understanding.
- **No incremental analysis**: All specified gates run on every invocation. There is no caching or incremental mode.
- **Tool availability**: PyGate requires ruff, pyright, and pytest to be installed in the target environment. It does not install them.
- **Single-repo scope**: Designed for single Python projects, not monorepos with multiple packages.
## Roadmap

- Model-assisted repair (LLM-powered fixes for type errors and test failures)
- Coverage gate (fail on coverage drops)
- Security gate (bandit / safety integration)
- Incremental mode (only re-run gates on changed files)
- PyPI trusted publishing via GitHub Actions
- Plugin system for custom gates
## Contributing

See CONTRIBUTING.md for development setup and guidelines.
## About Hermes Labs

Hermes Labs builds AI audit infrastructure for enterprise AI systems — EU AI Act readiness, ISO 42001 evidence bundles, continuous compliance monitoring, agent-level risk testing. We work with teams shipping AI into regulated environments.
Our OSS philosophy — read this if you're deciding whether to depend on us:
- Everything we release is free, forever. MIT or Apache-2.0. No "open core," no SaaS tier upsell, no paid version with the features you actually need. You can run this repo commercially without talking to us.
- We open-source our own infrastructure. The tools we release are what Hermes Labs uses internally — we don't publish demo code, we publish production code.
- We sell audit work, not licenses. If you want an ANNEX-IV pack, an ISO 42001 evidence bundle, gap analysis against the EU AI Act, or agent-level red-teaming delivered as a report, that's at hermes-labs.ai. If you just want the code to run it yourself, it's right here.
The Hermes Labs OSS audit stack (public, production-grade, no SaaS):
**Static audit (before deployment)**
- **lintlang** — Static linter for AI agent configs, tool descriptions, system prompts. `pip install lintlang`
- **rule-audit** — Static prompt audit — contradictions, coverage gaps, priority ambiguities
- **scaffold-lint** — Scaffold budget + technique stacking. `pip install scaffold-lint`
- **intent-verify** — Repo intent verification + spec-drift checks
**Runtime observability (while the agent runs)**

- **little-canary** — Prompt injection detection via sacrificial canary-model probes
- **suy-sideguy** — Runtime policy guard — user-space enforcement + forensic reports
- **colony-probe** — Prompt confidentiality audit — detects system-prompt reconstruction
**Regression & scoring (to prove what changed)**

- **hermes-jailbench** — Jailbreak regression benchmark. `pip install hermes-jailbench`
- **agent-convergence-scorer** — Score how similar N agent outputs are. `pip install agent-convergence-scorer`
**Supporting infra**
