From 7358e8954f2cfe276ed5e02fa2a9b06b12a4747b Mon Sep 17 00:00:00 2001 From: Greg Allen Date: Fri, 3 Jul 2026 09:21:01 -0400 Subject: [PATCH] =?UTF-8?q?docs:=20ADR=200063=20=E2=80=94=20GitLab=20cron-?= =?UTF-8?q?polling=20event=20dispatch?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Claude Signed-off-by: Greg Allen --- README.md | 2 +- docs/ADRs/0028-gitlab-support.md | 8 +- ...0063-gitlab-cron-polling-event-dispatch.md | 427 ++++ docs/architecture.md | 2 + docs/glossary.md | 8 +- docs/normative/normalized-event/v1/README.md | 2 +- .../gitlab-cron-polling-implementation.md | 1728 +++++++++++++++++ docs/problems/gitlab-implementation.md | 10 +- docs/roadmap.md | 4 +- 9 files changed, 2179 insertions(+), 12 deletions(-) create mode 100644 docs/ADRs/0063-gitlab-cron-polling-event-dispatch.md create mode 100644 docs/plans/gitlab-cron-polling-implementation.md diff --git a/README.md b/README.md index cd3713b01..efab4f876 100644 --- a/README.md +++ b/README.md @@ -35,7 +35,7 @@ This is not a product spec. 
It's an evolving exploration of a hard problem space - [Performance Verification](docs/problems/performance-verification.md) — Catching agent-introduced performance regressions before they reach production - [Production Feedback](docs/problems/production-feedback.md) — How platform execution signals feed back into what agents work on and how they assess risk - [Testing the Agents](docs/problems/testing-agents.md) — CI for prompts: regression testing, eval frameworks, and behavioral verification for agent instructions - - [GitLab Implementation](docs/problems/gitlab-implementation.md) — Implementation details for GitLab support: webhook security, dispatch pipelines, forge interface evolution + - [GitLab Implementation](docs/problems/gitlab-implementation.md) — Implementation details for GitLab support: cron-polling event dispatch, pipeline scheduling, forge interface evolution - [Operational Observability](docs/problems/operational-observability.md) — How do the humans operating an autonomous software factory understand what it is doing, debug it when it goes wrong, and improve it over time? - [Adaptive Agent Selection](docs/problems/adaptive-agent-selection.md) — Learning which agent/team/workflow configurations work best for which problem classes, using evolutionary algorithms and Thompson Sampling - [Platform Nativeness](docs/problems/platform-nativeness.md) — When the platform you automate is also the one you build on: which problems are inherent vs. self-inflicted diff --git a/docs/ADRs/0028-gitlab-support.md b/docs/ADRs/0028-gitlab-support.md index cbefd24d7..9aec613b4 100644 --- a/docs/ADRs/0028-gitlab-support.md +++ b/docs/ADRs/0028-gitlab-support.md @@ -19,9 +19,11 @@ Date: 2026-04-29 Deprecated — the harness-level forge-specific vs. forge-neutral split is now addressed by [ADR 0045](0045-forge-portable-harness-schema.md) -(forge-portable harness schema). 
The broader GitLab support architecture -(CI/CD pipeline mapping, PAT-based auth, webhook bridging) documented -here remains reference material. +(forge-portable harness schema). The webhook bridge approach described +here is superseded by [ADR 0063](0063-gitlab-cron-polling-event-dispatch.md) +(cron-polling event dispatch), which eliminates webhooks entirely. The +broader GitLab support architecture (CI/CD pipeline mapping, PAT-based +auth) documented here remains reference material. ## Context diff --git a/docs/ADRs/0063-gitlab-cron-polling-event-dispatch.md b/docs/ADRs/0063-gitlab-cron-polling-event-dispatch.md new file mode 100644 index 000000000..7155e4c2f --- /dev/null +++ b/docs/ADRs/0063-gitlab-cron-polling-event-dispatch.md @@ -0,0 +1,427 @@ +--- +title: "63. GitLab cron-polling event dispatch" +status: Accepted +relates_to: + - agent-infrastructure + - gitlab-implementation + - security-threat-model +topics: + - gitlab + - forge + - ci-cd + - per-repo + - polling + - cron +--- + +# 63. GitLab cron-polling event dispatch + +Date: 2026-06-13 + +## Status + +Accepted + + + +## Context + +Fullsend needs to detect and react to GitLab events — new issues, merge +requests, comments, and label changes — so that agent stages (triage, code, +review, fix, retro) can be dispatched automatically. On GitHub, native event +triggers (`pull_request_target`, `issues`, `issue_comment`) handle this within +GitHub Actions. GitLab has no equivalent for most event types. + +GitLab's CI/CD pipeline trigger sources are: `push`, `merge_request_event`, +`schedule`, `trigger`, `web`, `api`, and `parent_pipeline`. Of these, only +`merge_request_event` maps to an agent-relevant event. Issue creation, comment +posting, and label changes have no native CI pipeline trigger. GitLab supports +per-repo installation mode only (no per-org); the pipeline runs inside the +enrolled project on the protected default branch. 
+ +See [ADR 0028](0028-gitlab-support.md) for the original GitLab +support architecture discussion. ADR 0028 documented a webhook bridge approach; +this ADR supersedes that direction based on the operational complexity analysis +in Options 1–3 below. [ADR 0045](0045-forge-portable-harness-schema.md) +defines the forge-portable harness schema that GitLab stage templates must +conform to. + +## Options + +### Option 1: Webhook bridge Cloud Function + +Deploy a GCP Cloud Function that receives GitLab webhook POST requests, +validates the `X-Gitlab-Token` header, and calls the Pipeline Trigger API to +dispatch agent stages. + +**Rejected.** Requires external infrastructure (Cloud Function) that must be +deployed, monitored, and secured. Exposes a public HTTPS endpoint — an inbound +attack surface. Requires three credential types per project (bot PAT, webhook +secret, trigger token). Creates a complex deployment story for self-hosted +GitLab behind corporate firewalls (VPN peering, on-premise containers, or +Cloud Run + VPC Connector). The bridge cannot be eliminated even in a hybrid +model — if any event type uses webhooks, the full bridge must be deployed. + +### Option 2: Webhook-only (all events via bridge) + +Use the webhook bridge for all events, eliminating native CI triggers. + +**Rejected.** Still requires the bridge with all its operational complexity. +The correct response to "if we need webhooks for some events, why not all?" is +to eliminate the bridge entirely, not to double down on it. + +### Option 3: Native merge request (MR) events + webhook bridge for issues/comments + +Use GitLab's native `merge_request_event` for MR events, keep the webhook +bridge only for issues and comments. + +**Rejected.** Still requires the bridge Cloud Function. The bridge's +operational cost is dominated by deployment, monitoring, and credential +management — not by event type count. 
+ +### Option 4: Pure cron polling (no native CI triggers) + +Poll for all events including MR creation and updates. + +**Rejected.** MR events have a viable native CI path (`merge_request_event` + +`include: local:`) with sub-minute latency and zero additional infrastructure. +Polling for MRs adds unnecessary latency to the most frequent, most +latency-sensitive operation (code review). + +## Decision + +GitLab event dispatch uses a **two-path model**: + +1. **Native CI triggers for MR events.** MR creation, update, reopen, and + merge trigger pipelines via GitLab's `merge_request_event` pipeline source. + The dispatch template is loaded via `include: local:` from the protected + default branch, ensuring untrusted MR branches cannot modify dispatch logic. + +2. **Cron-polled events for everything else.** A scheduled pipeline runs every + N minutes (5 minutes on Premium/Ultimate, 60 minutes on Free tier), queries + the GitLab API for new issues, comments, and label changes since the last + poll, and dispatches agent stages via parent-child pipelines. + +No external infrastructure is required for event dispatch — no webhook bridge, +no webhook secrets, no trigger tokens. 
+ +``` +ENROLLED PROJECT GCP (optional, WIF mode only) +──────────────── ──── +.gitlab-ci.yml (root pipeline) WIF pool/provider (validates GitLab OIDC) +.gitlab/ci/fullsend-dispatch.yml (MR routing) Service Account (impersonated by jobs) +.gitlab/ci/fullsend-poll.yml (cron poller) Secret Manager: +.gitlab/ci/fullsend-triage.yml … retro.yml - bot PAT per enrolled project +.fullsend/ (config workspace) + +MR events (native CI): + MR opened/updated → merge_request_event → fullsend-dispatch.yml → review/fix stage + +Issues, comments, labels (cron): + Pipeline schedule (5 min) → fullsend-poll.yml → GitLab API → dispatch agent stage + +Credentials (WIF mode): + Pipeline job → OIDC token → GCP STS → WIF → SA → Secret Manager → bot PAT + +Credentials (variable mode): + Pipeline job → protected CI/CD variable FULLSEND_FORGE_TOKEN → bot PAT +``` + +### Credential model + +A Developer-role project access token with `api` scope, created during +`fullsend admin install`. Two storage modes are supported: + +**Mode 1: OIDC/WIF (recommended).** The bot PAT is stored in GCP Secret +Manager and retrieved at runtime via GitLab OIDC → GCP WIF. No secrets +are stored as CI/CD variables. This is the recommended mode when GCP +infrastructure is available (e.g., projects already using Vertex AI for +inference). + +**Mode 2: Protected CI/CD variable (fallback).** The bot PAT is stored +as a protected, masked CI/CD variable (`FULLSEND_FORGE_TOKEN`). No GCP +infrastructure required. This is the default mode for environments +without GCP access, including self-hosted GitLab instances with no cloud +dependency. + +The install flow selects the mode automatically: if `--gcp-project` is +provided, OIDC/WIF is configured; otherwise, the CI/CD variable path is +used. A `FULLSEND_CREDENTIAL_MODE` protected variable (`wif` or +`variable`) tells pipeline templates which retrieval path to execute. 
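The mode selection described above can be sketched in Go. This is a minimal illustration, not the actual `fullsend admin install` code: `CredentialMode`, `ModeWIF`, `ModeVariable`, and `selectCredentialMode` are hypothetical names, and only the `wif` / `variable` values and the `--gcp-project` rule come from this ADR.

```go
package main

import (
	"fmt"
	"strings"
)

// CredentialMode holds the two FULLSEND_CREDENTIAL_MODE values named in this ADR.
type CredentialMode string

const (
	ModeWIF      CredentialMode = "wif"
	ModeVariable CredentialMode = "variable"
)

// selectCredentialMode is a hypothetical helper, not the real install code.
// A non-empty --gcp-project value selects OIDC/WIF; anything else falls back
// to the protected CI/CD variable path.
func selectCredentialMode(gcpProject string) CredentialMode {
	if strings.TrimSpace(gcpProject) != "" {
		return ModeWIF
	}
	return ModeVariable
}

func main() {
	// "example-gcp-project" is a placeholder project ID.
	fmt.Println(selectCredentialMode("example-gcp-project"))
	fmt.Println(selectCredentialMode(""))
}
```

Keeping the decision in one pure function means the install flow and the pipeline templates can agree on the mode by reading a single value.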
+ +Key properties shared by both modes: + +- **Single credential type.** One bot PAT per project handles all REST and + GraphQL operations. No webhook secrets, trigger tokens, or mint service. +- **Bot identity.** The project access token creates a dedicated bot user, + providing attributable identity equivalent to GitHub Apps. +- **GraphQL support.** Unlike `CI_JOB_TOKEN`, the bot PAT authenticates + GraphQL — required for GitLab's Work Items API. + +OIDC/WIF mode additionally provides: + +- **`CI_DEBUG_TRACE` defense-in-depth.** GitLab logs all CI/CD variables + at job initialization, *before* any script runs. In variable mode, a + Maintainer enabling `CI_DEBUG_TRACE` exposes the PAT in job logs before + the script-level guard can abort. In WIF mode, WIF configuration + metadata (pool IDs, project numbers, service account emails) is logged + but the PAT itself is not — it is retrieved later by `gcloud`, after the + guard has already run. The metadata exposure is an accepted tradeoff: + it reveals infrastructure topology but not credentials. This is the + primary security difference between the two modes. +- **Cryptographic access control.** WIF attribute conditions restrict + token retrieval to the enrolled project on protected branches + (`assertion.project_id` + `assertion.ref_protected == "true"`). +- **Separation of administrative domains.** WIF configuration lives in + GCP IAM, outside the GitLab Maintainer's control. A GitLab Maintainer + cannot modify WIF attribute conditions without GCP IAM access. +- **No token mint.** Standard GCP WIF replaces the custom mint Cloud + Function used for GitHub. + +### Cron poller + +The poller runs as `fullsend poll` inside the fullsend container image, +invoked by a scheduled pipeline on the protected default branch. It reads a +timestamp watermark, queries the GitLab API for events since the last poll, +routes them to agent stages, dispatches via child pipeline YAML, and advances +the watermark. 
See the [companion implementation plan](../plans/gitlab-cron-polling-implementation.md) +for detailed pseudocode and numbered steps. + +Label change detection uses client-side state diffing — the poller tracks +previously-seen labels per issue and triggers only on newly-added labels. This +compensates for the lack of a `changes` object that webhook payloads provide. + +**Multi-frequency polling (Premium/Ultimate):** Two pipeline schedules — a +fast poll (every 5 minutes, slash commands only) and a slow poll (every 15 +minutes, full event scan). On Free tier, a single hourly poll is the only +option. + +### Event routing + +The design goal is **functional event-type parity with GitHub** — users see the +same labels, slash commands, and stage dispatches regardless of forge (latency +differs: cron-polled events have 5–60 minute delay vs sub-second on GitHub). The table below +documents how each event maps to the two-path transport model (native CI vs +cron polling), not a new event specification. + +| Detected Change | Transport | Stage | +|---|---|---| +| Issue label `ready-to-code` added | Cron poll (label state diff) | code | +| Issue label `ready-for-review` added | Cron poll (label state diff) | review | +| Issue note starting with `/fs-{triage,code,review,fix,retro,prioritize}` | Cron poll (note body prefix) | corresponding stage | +| Issue note (non-command) on issue with `needs-info` label | Cron poll (label check) | triage | +| MR opened/updated/reopened | Native CI (`merge_request_event`) | review | +| MR merged | Native CI (`merge_request_event`) | retro | +| MR note with `` | Cron poll (note body marker) | fix (same-project MRs only) | + +Bot-authored comments are skipped to prevent re-triggering loops (exception: +the `changes-requested` marker from the review agent). + +### Slash command latency + +Slash commands (`/fs-*`) are the only latency-sensitive operation. 
Mitigations: + +- **Labels as primary triggers.** Applying `ready-for-review` or + `ready-to-code` labels is discoverable and visible. Labels on issues are + detected via cron poll (5–60 minute latency); labels on MRs can also be + detected via native CI `merge_request_event` when applied alongside an + MR update. +- **Multi-frequency polling** keeps slash command latency to 5 minutes on + Premium/Ultimate. +- **Manual pipeline trigger** via the GitLab UI as a power-user escape hatch. + +**MR note limitation (fast-poll):** `/fs-fix` and `/fs-code` commands on MR +notes are only acted upon during the full-poll cycle (every 15 minutes on +Premium/Ultimate), not the fast poll. The fast-poll path does not fetch MR +source/target project IDs, so the fork MR protection check (deny-by-default +when unknown) blocks these stages. This adds up to 10 minutes of latency +beyond the fast-poll interval. Fetching MR details per note in fast-poll +would add API calls that defeat its lightweight purpose. In practice, fix +stages are typically triggered by the review bot's `changes-requested` +marker (which uses the full-poll path), not human slash commands. + +**Quick Action risk:** GitLab may silently strip unrecognized `/`-prefixed +lines. If confirmed empirically, GitLab should use an alternative prefix +(`fs:triage` or `@fullsend triage`). [ADR 0042](0042-fs-prefix-for-slash-commands.md) +permits forge-specific syntax. 
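As a rough sketch of the note-body prefix match used for slash commands, the following Go snippet recognizes `/fs-<stage>` at the start of a note. The function name `parseSlashCommand` and the `knownStages` map are assumptions made for illustration; the stage list is taken from the event routing table. Tolerating leading whitespace is also an assumption, and the snippet does not cover the alternative prefixes that might be needed if Quick Action stripping is confirmed.

```go
package main

import (
	"fmt"
	"strings"
)

// knownStages lists the stages from the event routing table in this ADR.
var knownStages = map[string]bool{
	"triage": true, "code": true, "review": true,
	"fix": true, "retro": true, "prioritize": true,
}

// parseSlashCommand is a hypothetical helper. It reports the stage named by a
// note whose first whitespace-separated token is /fs-<stage>. Leading
// whitespace is tolerated, which is an assumption rather than documented
// behavior.
func parseSlashCommand(body string) (stage string, ok bool) {
	fields := strings.Fields(body)
	if len(fields) == 0 || !strings.HasPrefix(fields[0], "/fs-") {
		return "", false
	}
	stage = strings.TrimPrefix(fields[0], "/fs-")
	if !knownStages[stage] {
		return "", false
	}
	return stage, true
}

func main() {
	for _, body := range []string{"/fs-triage please look", "/fs-deploy", "see /fs-code"} {
		stage, ok := parseSlashCommand(body)
		fmt.Printf("%q -> stage=%q ok=%v\n", body, stage, ok)
	}
}
```

Matching only the first token keeps quoted or mid-sentence mentions of a command from dispatching a stage.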
+ +### GitLab tier considerations + +| Feature | Free | Premium | Ultimate | +|---|---|---|---| +| Schedule minimum interval | 60 min | 5 min | 5 min | +| Project access tokens (SaaS) | Not available | Available | Available | +| CODEOWNERS enforcement | Not available | Available | Available | +| CI minutes (shared runners) | 400/month | 10,000/month | 50,000/month | + +**Free tier** is functional but degraded: 60-minute poll interval, no project +access tokens on gitlab.com (must use personal access token), no CODEOWNERS +guardrails, and CI minute quota is insufficient for polling on shared runners. +Self-hosted runners are required. As an alternative, Free tier users can run +`fullsend poll` on an external scheduler (cron on a VM, Kubernetes CronJob, +etc.) at any desired interval. This reintroduces external infrastructure but +is architecturally simpler than a webhook bridge — the poller is entirely +outbound (no public endpoint, no inbound payload parsing) and uses the same +code path as the in-CI poller. + +**Premium** (recommended minimum): 5-minute polling, project access tokens, +CODEOWNERS enforcement, adequate CI minutes for a single project. + +`fullsend admin install` adapts poll frequency and interaction model to the +detected tier. + +### Security model + +The security model follows the project's threat priority order (external +injection > insider > drift > supply chain): + +- **No inbound attack surface.** Polling is entirely outbound — no public + endpoint, no webhook parser, no shared-secret authentication. +- **Protected branch enforcement.** `workflow:rules` require + `$CI_COMMIT_REF_PROTECTED == "true"` for scheduled pipelines. +- **Protected CI/CD variables.** All fullsend CI/CD variables are marked + protected — accessible only to pipelines on protected branches. +- **`CI_DEBUG_TRACE` guard.** Install-time validation and runtime abort if + debug tracing is detected. 
In **variable mode**, this guard is the sole + defense against PAT exposure via debug tracing — GitLab logs CI/CD + variables at job init, before any script runs. In **WIF mode**, the guard + is defense-in-depth — even if bypassed, the PAT is not in a CI/CD + variable and is retrieved after the guard runs. **Known limitation:** + install-time validation checks project-level and group-level variables but + cannot query instance-level CI/CD variables (requires admin API access). + On self-hosted GitLab instances where instance admins are outside the + trusted team, WIF mode is recommended. +- **Event data sanitization.** Attacker-controlled content is base64-encoded + before passing to child pipelines. +- **Fork MR protection.** Fix/code stages are skipped when + `source_project_id != target_project_id`. +- **Slash command authorization.** Only users with Developer-level (30+) + project access can trigger agent stages via `/fs-*` commands. + Exception: non-command comments on issues with the `needs-info` label + trigger triage without slash command authorization (any commenter). + +**Security comparison of credential modes:** + +| Threat vector | WIF mode | Variable mode | +|---|---|---| +| `CI_DEBUG_TRACE` by Maintainer | PAT not exposed (defense-in-depth) | PAT exposed at job init before script guard runs (guard limits further damage but cannot prevent initial exposure) | +| Maintainer marks branch as protected | WIF grants token (same risk) | Variable exposed (same risk) | +| GitLab database compromise | PAT not in GitLab (in Secret Manager) | PAT stored in GitLab | +| Admin domain separation | WIF config requires GCP IAM | All within GitLab RBAC | +| Audit trail | GCP Data Access logs | GitLab audit logs (Premium+) | + +WIF mode is recommended for projects where the Maintainer pool extends +beyond trusted team members, or where compliance requires external +secret storage. 
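The slash command authorization and fork MR protection rules above could be combined into a single pre-dispatch check. The Go sketch below is illustrative only: `dispatchRequest`, `allowDispatch`, and the use of `0` for an unknown project ID are assumptions, and the `needs-info` exception is deliberately left out.

```go
package main

import "fmt"

// developerAccessLevel is GitLab's numeric access level for the Developer role.
const developerAccessLevel = 30

// dispatchRequest is a hypothetical view of a slash-command event.
type dispatchRequest struct {
	Stage             string
	AuthorAccessLevel int
	OnMergeRequest    bool
	SourceProjectID   int // 0 means unknown (assumption for this sketch)
	TargetProjectID   int // 0 means unknown (assumption for this sketch)
}

// allowDispatch applies the two guards: the author must hold at least
// Developer access, and fix/code stages on MRs are denied for fork MRs or
// when the project IDs are unknown (deny by default).
func allowDispatch(r dispatchRequest) (bool, string) {
	if r.AuthorAccessLevel < developerAccessLevel {
		return false, "author below Developer access"
	}
	writesCode := r.Stage == "fix" || r.Stage == "code"
	if r.OnMergeRequest && writesCode {
		if r.SourceProjectID == 0 || r.TargetProjectID == 0 {
			return false, "MR project IDs unknown"
		}
		if r.SourceProjectID != r.TargetProjectID {
			return false, "fork MR"
		}
	}
	return true, "ok"
}

func main() {
	fork := dispatchRequest{Stage: "fix", AuthorAccessLevel: 40, OnMergeRequest: true, SourceProjectID: 7, TargetProjectID: 9}
	ok, reason := allowDispatch(fork)
	fmt.Println(ok, reason)
}
```

Returning a reason alongside the decision makes skipped dispatches easier to explain in poller logs.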
+ +### Forge abstraction + +[ADR 0005](0005-forge-abstraction-layer.md) requires new forges to implement +`forge.Client`. This ADR extends the forge interface with new methods (some GitLab-specific, some forge-neutral): + +- `IsProtectedBranch` — maps to GitHub branch protection API and GitLab + protected branches API +- `CreatePipelineSchedule` / `DeletePipelineSchedule` — GitLab-native; GitHub + returns `ErrNotSupported` +- `UpdateVariable` — for poll watermark management + +A new `ErrNotSupported` sentinel (complementing the existing forge +sentinel errors) allows forge +implementations to reject inapplicable operations. GitHub-only methods +(`ListOrgInstallations`, `GetAppClientID`) move to a `GitHubExtensions` +extension interface. This requires interface evolution beyond pure +implementation — adding methods to `forge.Client` and refactoring +GitHub-specific methods into an extension interface. This is anticipated +growth of the abstraction boundary, not a violation of +[ADR 0005](0005-forge-abstraction-layer.md)'s design; the changes to +`appsetup.go` and `admin.go` are limited to calling new forge-neutral +methods rather than adding forge-conditional logic. + +## Consequences + +**What becomes easier:** + +- **No external infrastructure for event dispatch.** No Cloud Function, no + webhook bridge. Self-hosted GitLab requires only outbound HTTPS. +- **Single credential per project.** One bot PAT, stored in either GCP + Secret Manager (WIF mode) or as a protected CI/CD variable (variable + mode). No webhook secrets, trigger tokens, or mint service changes. +- **Stronger event authenticity.** Events read directly from the GitLab API, + not from potentially spoofed webhook payloads. +- **No event loss.** Polling reads from the source of truth. Webhooks can fail + silently or auto-disable after 4 consecutive failures. +- **Simpler emergency shutdown.** Disable the pipeline schedule or revoke the + bot PAT. No bridge to tear down. 
+- **MR review latency is unaffected.** Native `merge_request_event` provides + sub-second triggering for the highest-frequency operation. +- **Tier-adaptive.** Works on all GitLab tiers with graceful degradation. +- **No GCP requirement.** Variable mode allows deployment on self-hosted + GitLab with no cloud dependency. WIF mode reuses GCP infrastructure + already provisioned for Vertex AI inference. + +**What becomes harder or changes:** + +- **Issue/comment event latency.** Up to 5 minutes on Premium, 60 minutes on + Free. Acceptable for asynchronous agent operations, poor for interactive use + on Free tier. +- **CI minute consumption.** Polling runs continuously. At 5-minute intervals + the schedule starts ~8,640 pipelines per month (12 per hour × 24 hours × 30 + days), i.e. ~8,640 min/month on shared runners assuming roughly one minute + per run. Self-hosted runners are not billed. +- **State management.** The poller must track watermarks, deduplicate events + across overlapping windows, and diff label state. This state is internal + to the GitLab forge implementation and does not leak into the + `forge.Client` interface, preserving the forge-neutral contract from + [ADR 0005](0005-forge-abstraction-layer.md). +- **Slash command latency.** Up to 5 minutes vs sub-second with webhooks. + Labels mitigate this for common operations. +- **Quick Action stripping.** GitLab may strip `/fs-*` commands from comments. + Requires testing and potentially alternative syntax. +- **Per-repo only.** No centralized config or credential management across + projects. +- **`api` scope is broad.** Narrower scopes are not available in GitLab today. + +**Risks** (ordered by threat priority): + +1. **YAML injection in child pipeline generation.** Attacker-controlled + issue/MR content could break child pipeline YAML syntax. Mitigated by + base64 encoding of event payloads passed to child pipelines. +2. 
**Prompt injection via polled events.** Attacker-controlled issue/MR + content reaches the agent at inference time. 
This risk is identical + across all forges and is handled by the existing agent harness security + layer, not by the transport mechanism. +3. **Watermark tampering.** A Maintainer could skip or replay events by + modifying the watermark variables. Mitigated by protected variable status + and event deduplication. +4. **Schedule modification.** A Maintainer could retarget the schedule to a + non-protected branch. In WIF mode, mitigated by WIF attribute conditions + rejecting credential retrieval. In variable mode, mitigated by protected + variable status (not exposed on non-protected branches). +5. **Missed events from API quirks.** The Notes API lacks `created_after`; the + Events API `after` parameter is date-only. Mitigated by 30-second watermark + overlap and dual-frequency polling as reconciliation. + +**Comparison with GitHub:** + +| Concern | GitHub | GitLab (this ADR) | +|---|---|---| +| Primary credential | App installation token via mint | Bot PAT (WIF or CI/CD variable) | +| MR/PR event dispatch | `pull_request_target` | `merge_request_event` | +| Issue/comment dispatch | Native events (sub-second) | Cron polling (5 min) | +| External infrastructure | Mint Cloud Function | None for event dispatch | +| Credential types | App key + installation token | Single bot PAT | + +Detailed implementation guidance — including poller pseudocode, forge interface +changes, CI/CD template scaffolding, and install flow — is in the companion +document: [Implementation plan: GitLab cron-polling](../plans/gitlab-cron-polling-implementation.md). 
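For readers who want a concrete picture of the client-side label state diffing mentioned under "Cron poller", here is a minimal Go sketch. The name `newlyAddedLabels` and the slice-based state representation are assumptions; how the poller actually stores previously seen labels is covered in the companion plan, not here.

```go
package main

import "fmt"

// newlyAddedLabels is a hypothetical helper. It returns the labels present in
// current but absent from previous, preserving the order of current and
// reporting each new label once. Removed labels are ignored because only
// newly added labels trigger a stage.
func newlyAddedLabels(previous, current []string) []string {
	seen := make(map[string]bool, len(previous))
	for _, label := range previous {
		seen[label] = true
	}
	var added []string
	for _, label := range current {
		if !seen[label] {
			added = append(added, label)
			seen[label] = true // avoid reporting duplicates within current
		}
	}
	return added
}

func main() {
	previous := []string{"bug", "needs-info"}
	current := []string{"bug", "ready-to-code"}
	fmt.Println(newlyAddedLabels(previous, current))
}
```

Because the diff is computed against stored state rather than a time window, re-reading the same issue during an overlapping poll does not re-trigger a label that was already seen.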
+ +## References + +- [ADR 0002](0002-initial-fullsend-design.md) — initial fullsend design (webhook + dispatch service, label state machine) +- [ADR 0033](0033-per-repo-installation-mode.md) — per-repo installation model (the only supported mode for GitLab) +- [ADR 0054](0054-require-authorization-on-all-agent-dispatch-paths.md) — authorization on all dispatch paths (slash command ACL) +- [ADR 0061](0061-harness-cel-dispatch.md) — harness CEL dispatch and NormalizedEvent schema +- [Implementation plan: GitLab cron-polling](../plans/gitlab-cron-polling-implementation.md) diff --git a/docs/architecture.md b/docs/architecture.md index 26c3d2874..a167313d1 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -47,6 +47,7 @@ Infrastructure platform choice and configuration are specified in the adopting o - Event-driven stage dispatch: eliminate `workflow_dispatch` + `gh workflow run` fan-out from `dispatch.yml` in favor of synchronous `workflow_call` so the dispatched run stays linked to the caller ([ADR 0041](ADRs/0041-synchronous-workflow-call-event-dispatch.md)). - Multi-repo management: a `fullsend repos` subcommand group with a declarative `repos.yaml` manifest for managing per-repo installations at scale — bulk install, status, sync, upgrade, and removal across repos and orgs ([ADR 0057](ADRs/0057-repos-management.md)). - Dispatch version-skew resolution: per-repo `reusable-dispatch.yml` inlines stage workflow jobs directly, eliminating `@v0` references to `reusable-{stage}.yml` ([ADR 0062](ADRs/0062-dispatch-version-skew.md)). +- GitLab event dispatch: two-path model — native CI triggers (`merge_request_event`) for MR events, cron-based polling for issues/comments/labels. No external infrastructure (no webhook bridge). Bot PAT via OIDC/WIF from Secret Manager or protected CI/CD variable. Per-repo only ([ADR 0063](ADRs/0063-gitlab-cron-polling-event-dispatch.md)). 
**Open questions:** @@ -172,6 +173,7 @@ The existing design principle is that [the repo is the coordinator](problems/age evaluated by `fullsend dispatch` with pluggable input/output drivers operating on a `NormalizedEvent` struct ([ADR 0061](ADRs/0061-harness-cel-dispatch.md)). +- GitLab dispatch uses cron-polled scheduled pipelines for issue/comment/label events and native `merge_request_event` for MR events. No webhook bridge required (see [ADR 0063](ADRs/0063-gitlab-cron-polling-event-dispatch.md)). **Open questions:** diff --git a/docs/glossary.md b/docs/glossary.md index b9d96110c..fd6c6d928 100644 --- a/docs/glossary.md +++ b/docs/glossary.md @@ -39,14 +39,14 @@ See [security-threat-model.md](problems/security-threat-model.md) and [architect ### Debouncing -Collapsing rapid-fire events on the same issue or PR into a single agent invocation. Without debouncing, a burst of edits to an issue body could trigger multiple redundant triage runs. The [webhook + dispatch service](ADRs/0002-initial-fullsend-design.md#1-webhook--dispatch-service) is responsible for deduplicating flapping events before dispatching work to agents. +Collapsing rapid-fire events on the same issue or PR into a single agent invocation. Without debouncing, a burst of edits to an issue body could trigger multiple redundant triage runs. The [webhook + dispatch service](ADRs/0002-initial-fullsend-design.md#1-webhook--dispatch-service) (and its GitLab equivalent, the cron poller) is responsible for deduplicating flapping events before dispatching work to agents. On GitHub this uses webhooks; on GitLab it uses cron-polled scheduled pipelines (see [ADR 0063](ADRs/0063-gitlab-cron-polling-event-dispatch.md)). See [architecture.md](architecture.md) (building block 1). ## E ### Entry Point -The single deterministic component that receives GitHub events (webhooks) and decides which agent combination to run. 
Previously called **wrapper** — the rename was adopted to avoid confusion with the sandbox/wrapping layer (see [#101](https://github.com/fullsend-ai/fullsend/issues/101) for the terminology evolution). The entry point is non-AI: it is a conventional program (currently Go) that parses events, enforces ACLs on slash commands, validates label transitions, and dispatches to agent runtimes. It does not make LLM calls. +The single deterministic component that receives forge events and decides which agent combination to run. On GitHub, events arrive via webhooks; on GitLab, via cron-polled scheduled pipelines (see [ADR 0063](ADRs/0063-gitlab-cron-polling-event-dispatch.md)). Previously called **wrapper** — the rename was adopted to avoid confusion with the sandbox/wrapping layer (see [#101](https://github.com/fullsend-ai/fullsend/issues/101) for the terminology evolution). The entry point is non-AI: it is a conventional program (currently Go) that parses events, enforces ACLs on slash commands, validates label transitions, and dispatches to agent runtimes. It does not make LLM calls. See [ADR 0002](ADRs/0002-initial-fullsend-design.md) building block 1 and [#101](https://github.com/fullsend-ai/fullsend/issues/101). ### Escalation @@ -159,7 +159,7 @@ See [ADR 0002](ADRs/0002-initial-fullsend-design.md) building block 2. ### Trigger -What initiates an agent run. Could be a GitHub event (issue filed, label applied, comment posted, PR opened, check completed), a [slash command](#slash-command), or a scheduled action. The term is used loosely in discussions — sometimes meaning the raw GitHub webhook event, sometimes meaning the processed signal that actually starts an agent after debouncing and validation. In fullsend's architecture, triggers flow through the [entry point](#entry-point), which normalizes and dispatches them. +What initiates an agent run. 
Could be a forge event (issue filed, label applied, comment posted, PR/MR opened, check completed), a [slash command](#slash-command), or a scheduled action. The term is used loosely in discussions — sometimes meaning the raw forge event (GitHub webhook or GitLab cron-polled change), sometimes meaning the processed signal that actually starts an agent after debouncing and validation. In fullsend's architecture, triggers flow through the [entry point](#entry-point), which normalizes and dispatches them. See [architecture.md](architecture.md) (building block 1). ### Triage @@ -182,7 +182,7 @@ See [security-threat-model.md](problems/security-threat-model.md) and [agent-arc ### Work Coordinator -The mechanism that assigns work to agents and prevents conflicts. The existing design principle is that the **repo is the coordinator** — branch protection, CODEOWNERS, status checks, and GitHub events provide coordination without a central orchestrator. The work coordinator may be just the glue connecting GitHub webhooks to agent infrastructure, or it may need to be more (e.g., a claim/lock system to prevent two code agents from picking up the same issue). +The mechanism that assigns work to agents and prevents conflicts. The existing design principle is that the **repo is the coordinator** — branch protection, CODEOWNERS, status checks, and forge events provide coordination without a central orchestrator. The work coordinator may be just the glue connecting forge events to agent infrastructure, or it may need to be more (e.g., a claim/lock system to prevent two code agents from picking up the same issue). See [architecture.md](architecture.md) and [#77](https://github.com/fullsend-ai/fullsend/issues/77). 
## Z diff --git a/docs/normative/normalized-event/v1/README.md b/docs/normative/normalized-event/v1/README.md index f7c29cfc6..6e86cd2ef 100644 --- a/docs/normative/normalized-event/v1/README.md +++ b/docs/normative/normalized-event/v1/README.md @@ -259,7 +259,7 @@ Example: **GitLab** ([gitlab-implementation.md](../../../problems/gitlab-impleme | Concern | Illustrative mapping (future) | |---------|-------------------------------| -| Input driver | `gitlab-event` from GitLab webhook payload | +| Input driver | `gitlab-event` from GitLab CI event payload (cron-polled or `merge_request_event`; see [ADR 0063](../../../ADRs/0063-gitlab-cron-polling-event-dispatch.md)) | | `source.system` | `gitlab` (new enum value) | | `repo` slug | Nested group path (`group/subgroup/project`) — requires wider `repo_path` pattern | | MR events | `merge_request_event` → `entity.kind: change_proposal` | diff --git a/docs/plans/gitlab-cron-polling-implementation.md b/docs/plans/gitlab-cron-polling-implementation.md new file mode 100644 index 000000000..eda71a98f --- /dev/null +++ b/docs/plans/gitlab-cron-polling-implementation.md @@ -0,0 +1,1728 @@ +# Implementation Plan: GitLab Cron-Polling Event Dispatch + +**Context:** [ADR 0063](../ADRs/0063-gitlab-cron-polling-event-dispatch.md) decides a two-path event dispatch model for GitLab — native CI for merge request events, cron-polled scheduled pipelines for issues/comments/labels. This document contains the implementation plan and pseudocode for the cron-polling subsystem. + +## Table of Contents + +1. [Dependency Graph](#dependency-graph) +2. [Phase 0: Forge Interface Preparation](#phase-0-forge-interface-preparation) +3. [Phase 1: GitLab Forge Client](#phase-1-gitlab-forge-client) +4. [Phase 2: Cron Poller](#phase-2-cron-poller) +5. [Phase 3: GitLab CI/CD Templates](#phase-3-gitlab-cicd-templates) +6. [Phase 4: CLI Changes](#phase-4-cli-changes) +7. [Phase 5: Integration and Testing](#phase-5-integration-and-testing) +8. 
[Security-Critical Code Paths](#security-critical-code-paths) +9. [Verification Checklist](#verification-checklist) + +## Dependency Graph + +``` +Phase 0 (forge interface) ──┬──> Phase 1 (GitLab forge client) ──> Phase 4 (CLI changes) ──┐ + │ │ + └──> Phase 2 (cron poller) ────────────────────────────────────>├──> Phase 5 + │ +Phase 3 (CI/CD templates) ─────────────────────────────────────────────────────────────────>─┘ +``` + +Phases 1 and 2 depend on Phase 0 (forge interface changes). Phase 3 (CI/CD templates) has no code dependency on Phase 0 and can start immediately. Phase 4 depends on Phase 1. Phase 5 depends on all prior phases. + +## Phase 0: Forge Interface Preparation + +**Goal**: Prepare `forge.Client` for multi-forge support without breaking GitHub. Pure refactoring — no behavioral changes. + +### New methods on `forge.Client` + +Add to `internal/forge/forge.go`: + +```go +IsProtectedBranch(ctx context.Context, owner, repo, branch string) (bool, error) +CreatePipelineSchedule(ctx context.Context, owner, repo, ref, description, cron string, variables map[string]string) (scheduleID string, err error) +DeletePipelineSchedule(ctx context.Context, owner, repo, scheduleID string) error +ListPipelineSchedules(ctx context.Context, owner, repo string) ([]PipelineSchedule, error) +UpdateVariable(ctx context.Context, owner, repo, key, value string) error +CreateProtectedVariable(ctx context.Context, owner, repo, key, value string) error +``` + +These methods are forge-neutral by design. `IsProtectedBranch` maps to GitHub's branch protection API and GitLab's protected branches API. `CreatePipelineSchedule` and `DeletePipelineSchedule` are GitLab-native; the GitHub implementation returns `ErrNotSupported`. `UpdateVariable` maps to GitLab's CI/CD variable API. 
`CreateProtectedVariable` creates a CI/CD variable with `Protected: true, Masked: false` — used for poll state variables (watermark, label state) that must not be accessible on non-protected branches but whose values are not secrets. + +### New sentinel error + +```go +var ErrNotSupported = errors.New("operation not supported by this forge") +``` + +This complements the existing sentinel errors in `forge.go`. + +GitHub returns `ErrNotSupported` for `CreatePipelineSchedule`, `DeletePipelineSchedule`. GitLab returns it for `DispatchWorkflow`, `ListOrgInstallations`, `GetAppClientID`, and org-level secret/variable methods. + +**Decision rule**: Use extension interfaces (`GitHubExtensions`) for methods that conceptually do not exist on the other platform (e.g., `ListOrgInstallations`, `GetAppClientID` — GitHub App concepts with no GitLab analogue). Use `ErrNotSupported` for methods with a forge-neutral contract that one forge does not implement yet (e.g., `CreatePipelineSchedule` on GitHub). Callers of extension-interface methods use a type-assertion gate; callers of `ErrNotSupported` methods handle the error per call site. + +**Caller handling**: Audit all call sites via `grep -rn 'MethodName' internal/` to build a call-site inventory. Expected handling per call site: +- `DispatchWorkflow` callers (enrollment layer, `internal/layers/enrollment.go` `Install` via `dispatchRepoMaintenanceWithRetry` and `Uninstall`): repo-maintenance dispatch after enrollment/unenrollment. Skip with a log warning on `ErrNotSupported` — GitLab per-repo installs do not use cross-repo repo-maintenance workflows; enrollment changes are applied directly +- `DispatchWorkflow` callers (CLI, `internal/cli/admin.go`): repo-maintenance dispatch after enrollment config changes. 
Skip with a log warning on `ErrNotSupported` — same rationale as enrollment layer +- `CreateOrgSecret`/`OrgSecretExists` callers (dispatch layer, `internal/layers/dispatch.go`; CLI, `internal/cli/github.go`): skip with a log warning when `ErrNotSupported` — per-repo GitLab does not use org-level secrets +- `ListOrgInstallations`/`GetAppClientID` callers (appsetup, CLI): already gated behind `GitHubExtensions` type-assertion, so `ErrNotSupported` is never reached +- `GetLatestWorkflowRun`/`ListWorkflowRuns` callers: skip with a log warning — GitLab uses pipeline status via different mechanisms + +### Extension interface + +Move GitHub-only methods to a `GitHubExtensions` interface: + +```go +type GitHubExtensions interface { + ListOrgInstallations(ctx context.Context, org string) ([]Installation, error) + GetAppClientID(ctx context.Context, slug string) (string, error) +} +``` + +Callers type-assert to access these methods. This keeps the core `forge.Client` interface forge-neutral. + +### Forge detection + +New file `internal/forge/detect.go`: + +```go +func DetectForge(remoteURL string) (string, error) { + u, err := url.Parse(remoteURL) + if err != nil { + return "", fmt.Errorf("invalid remote URL: %w", err) + } + host := strings.ToLower(u.Hostname()) + + switch host { + case "github.com": + return "github", nil + case "gitlab.com": + return "gitlab", nil + default: + return "", fmt.Errorf("unknown forge host %q: use --forge flag for self-hosted instances", host) + } +} +``` + +### Files + +| Action | Path | +|--------|------| +| Modify | `internal/forge/forge.go` — add methods, sentinel, extension interface | +| Modify | `internal/forge/github/github.go` — implement new methods (schedule → `ErrNotSupported`; `IsProtectedBranch` → branch protection API); move `ListOrgInstallations`/`GetAppClientID` to `GitHubExtensions` | +| Modify | `internal/forge/fake.go` — implement new methods on FakeClient | +| Modify | `internal/appsetup/appsetup.go` — update 
`ListOrgInstallations`/`GetAppClientID` calls to use `GitHubExtensions` type-assertion | +| Modify | `internal/cli/admin.go` — update `ListOrgInstallations` calls to use `GitHubExtensions` type-assertion | +| Modify | `internal/cli/github.go` — update `GetAppClientID` calls to use `GitHubExtensions` type-assertion | +| Create | `internal/forge/detect.go` | +| Create | `internal/forge/detect_test.go` | + +### Verification + +`make go-test && make go-vet` — all existing tests pass unchanged. + +## Phase 1: GitLab Forge Client + +**Goal**: Implement `internal/forge/gitlab/gitlab.go` with the full `forge.Client` interface. + +### Constructor + +```go +func New(token string, opts ...Option) (*LiveClient, error) +``` + +Single-token constructor for the bot project access token. The token is used for all REST and GraphQL API calls. Options include `WithBaseURL(url)` for self-hosted instances (default: `https://gitlab.com`). + +### Full method mapping + +| `forge.Client` method | GitLab SDK / API | Notes | +|---|---|---| +| `GetRepo` | `Projects.GetProject` | Returns project metadata | +| `GetDefaultBranch` | `Projects.GetProject` → `DefaultBranch` | | +| `GetCommit` | `Commits.GetCommit` | | +| `ListCommits` | `Commits.ListCommits` | | +| `CreateBranch` | `Branches.CreateBranch` | | +| `DeleteBranch` | `Branches.DeleteBranch` | | +| `GetBranchRef` | `Branches.GetBranch` | Returns HEAD commit SHA | +| `GetFileContent` | `RepositoryFiles.GetFile` | Base64 decode content | +| `ListFiles` | `Repositories.ListTree` | Recursive via `Recursive: true` | +| `CreateOrUpdateFile` | `RepositoryFiles.CreateFile` / `UpdateFile` | Check existence first | +| `CreateChangeProposal` | `MergeRequests.CreateMergeRequest` | MR, not PR | +| `GetPR` | `MergeRequests.GetMergeRequest` | | +| `ListRepoPullRequests` | `MergeRequests.ListProjectMergeRequests` | | +| `UpdatePR` | `MergeRequests.UpdateMergeRequest` | | +| `MergePR` | `MergeRequests.AcceptMergeRequest` | | +| `CreatePRComment` | 
`Notes.CreateMergeRequestNote` | Notes, not comments | +| `ListPRComments` | `Notes.ListMergeRequestNotes` | | +| `CreatePRReview` | Synthesized from notes + approvals | No native review object | +| `RequestPRReviewers` | `MergeRequestApprovals.SetApprovers` | Approvers, not reviewers | +| `ListPRReviews` | Synthesized from notes + approvals | | +| `GetPRDiff` | `MergeRequests.GetMergeRequestDiff` | | +| `AddLabels` | `MergeRequests.UpdateMergeRequest` or `Issues.UpdateIssue` | Labels in update payload | +| `RemoveLabel` | Same as above | Full label list minus removed | +| `CreateIssue` | `Issues.CreateIssue` | | +| `GetIssue` | `Issues.GetIssue` | | +| `ListIssues` | `Issues.ListProjectIssues` | | +| `UpdateIssue` | `Issues.UpdateIssue` | | +| `CreateIssueComment` | `Notes.CreateIssueNote` | | +| `ListIssueComments` | `Notes.ListIssueNotes` | | +| `CreateRepoSecret` | `ProjectVariables.CreateVariable` | With `Protected: true`, `Masked: true` | +| `DeleteRepoSecret` | `ProjectVariables.RemoveVariable` | | +| `CreateOrUpdateRepoVariable` | `ProjectVariables.CreateVariable` / `UpdateVariable` | | +| `IsProtectedBranch` | `ProtectedBranches.GetProtectedBranch` | 404 → not protected | +| `CreatePipelineSchedule` | `PipelineSchedules.CreatePipelineSchedule` | GitLab-specific | +| `DeletePipelineSchedule` | `PipelineSchedules.DeletePipelineSchedule` | GitLab-specific | +| `ListPipelineSchedules` | `PipelineSchedules.ListProjectPipelineSchedules` | For uninstall cleanup | +| `UpdateVariable` | `ProjectVariables.UpdateVariable` | For poll watermark | +| `CreateProtectedVariable` | `ProjectVariables.CreateVariable` | With `Protected: true`, `Masked: false` — for poll state | +| `DispatchWorkflow` | → `ErrNotSupported` | GitHub-only | +| `ListOrgInstallations` | → `GitHubExtensions` (not on base interface) | GitHub-only | +| `GetAppClientID` | → `GitHubExtensions` (not on base interface) | GitHub-only | +| `CreateOrgSecret` | → `ErrNotSupported` | Per-repo only | +| 
`OrgSecretExists` | → `ErrNotSupported` | Per-repo only | +| `GetLatestWorkflowRun` | → `ErrNotSupported` | GitHub Actions concept | +| `ListWorkflowRuns` | → `ErrNotSupported` | GitHub Actions concept | +| `CommitFiles` | `Commits.CreateCommit` | Multi-file commit | + +### Review synthesis + +GitLab has no native "review" object like GitHub's pull request review. Reviews are synthesized from: +- **Notes** with suggestion blocks → "changes requested" +- **Approval status** via `MergeRequestApprovals.GetConfiguration` → "approved" +- **Discussion resolution status** → tracks whether feedback has been addressed + +The `CreatePRReview` method posts a note and optionally approves/unapproves the MR. + +### Additional polling-support methods + +These are internal methods on the client struct (not on `forge.Client`), used by the poller: + +```go +func (c *LiveClient) ListIssuesUpdatedSince(ctx context.Context, owner, repo string, since time.Time) ([]Issue, error) +func (c *LiveClient) ListMergeRequestsUpdatedSince(ctx context.Context, owner, repo string, since time.Time) ([]MergeRequest, error) +func (c *LiveClient) ListProjectEvents(ctx context.Context, owner, repo string, targetType string, after time.Time) ([]Event, error) +func (c *LiveClient) ListIssueNotes(ctx context.Context, owner, repo string, issueIID int) ([]Note, error) +func (c *LiveClient) ListMergeRequestNotes(ctx context.Context, owner, repo string, mrIID int) ([]Note, error) +func (c *LiveClient) GetVariable(ctx context.Context, owner, repo, key string) (string, error) +func (c *LiveClient) GetAuthenticatedUser(ctx context.Context) (*User, error) // GET /user +func (c *LiveClient) CreateNoteAwardEmoji(ctx context.Context, owner, repo string, issueIID, noteID int, emoji string) error +``` + +### Subgroup path handling + +GitLab supports deeply nested namespaces (`org/sub1/sub2/project`). The client must URL-encode the full project path for API calls, or use numeric project IDs. 
The `GetRepo` method resolves `owner/repo` to a project ID, and subsequent calls use the numeric ID. + +### Files + +| Action | Path | +|--------|------| +| Create | `internal/forge/gitlab/gitlab.go` (~1500-2000 lines) | +| Create | `internal/forge/gitlab/gitlab_test.go` | + +## Phase 2: Cron Poller + +**Goal**: Implement the event polling logic that runs inside scheduled GitLab CI/CD pipelines. The poller is a Go package compiled into the `fullsend` binary and invoked via `fullsend poll`. No external infrastructure is required — no Cloud Function, no webhook bridge, no separate deployment. + +### Architecture + +``` +fullsend poll +├── Read FULLSEND_LAST_POLL_AT_{FAST,FULL} from CI variable +├── Query GitLab API for changes since last poll +│ ├── GET /projects/:id/issues?updated_after=T +│ ├── GET /projects/:id/merge_requests?updated_after=T +│ └── GET /projects/:id/events?target_type=note&after=D +├── For each changed item with new notes: +│ └── GET /projects/:id/issues/:iid/notes (or merge_requests/:iid/notes) +├── Apply event routing rules → list of (stage, event) pairs +├── Dispatch each via parent-child pipeline trigger +│ └── Create child pipeline with STAGE, EVENT_PAYLOAD_B64, RESOURCE_KEY +├── Update FULLSEND_LAST_POLL_AT_{FAST,FULL} via API +└── Exit +``` + +### Package structure + +``` +internal/poll/ +├── poll.go # Main poll loop +├── poll_test.go # Unit tests +├── events.go # Event detection and deduplication +├── events_test.go # Event detection tests +├── dispatch.go # Child pipeline triggering +└── state.go # Watermark state management +``` + +### CLI command + +New subcommand `fullsend poll` added to `internal/cli/`: + +```go +func newPollCmd() *cobra.Command { + return &cobra.Command{ + Use: "poll", + Short: "Poll GitLab API for new events and dispatch agent stages", + RunE: func(cmd *cobra.Command, args []string) error { + forgeToken := os.Getenv("FULLSEND_FORGE_TOKEN") + projectPath := os.Getenv("CI_PROJECT_PATH") + gcpProjectID := 
os.Getenv("FULLSEND_GCP_PROJECT_ID") + + // CI_SERVER_URL is GitLab's predefined CI variable holding the instance + // base URL; reading it here is an assumption so that self-hosted + // instances resolve correctly. + client, err := gitlab.New(forgeToken, gitlab.WithBaseURL(os.Getenv("CI_SERVER_URL"))) + if err != nil { + return err + } + + botUser, err := client.GetAuthenticatedUser(cmd.Context()) + if err != nil { + return fmt.Errorf("identify bot user: %w", err) + } + + poller := poll.New(client, projectPath, poll.Options{ + SlashCommandsOnly: os.Getenv("FULLSEND_POLL_MODE") == "fast", + BotUserID: botUser.ID, + }) + + return poller.Run(cmd.Context()) + }, + } +} +``` + +### Poll loop (`poll.go`) + +```go +type Poller struct { + client *gitlab.LiveClient + projectPath string + owner string + repo string + botUserID int // GitLab user ID of the enrolled fullsend bot + opts Options + accessCache map[int]int // userID → access level, reset per poll cycle +} + +type Options struct { + SlashCommandsOnly bool // fast-poll mode: only check for /fs-* commands + BotUserID int // GitLab user ID of the enrolled fullsend bot +} + +func (p *Poller) Run(ctx context.Context) error { + p.owner, p.repo = splitOwnerRepo(p.projectPath) + p.botUserID = p.opts.BotUserID + p.accessCache = make(map[int]int) + + // 1. Read watermark + lastPollAt, err := p.readWatermark(ctx, p.owner, p.repo) + if err != nil { + return fmt.Errorf("read watermark: %w", err) + } + + // 2. Discover events + var events []RoutableEvent + var labelState LabelState // non-nil only for full polls + var minSkippedAt time.Time // earliest issue skipped due to note-fetch failure + if p.opts.SlashCommandsOnly { + events, err = p.discoverSlashCommands(ctx, p.owner, p.repo, lastPollAt) + } else { + events, labelState, minSkippedAt, err = p.discoverAllEvents(ctx, p.owner, p.repo, lastPollAt) + } + if err != nil { + return fmt.Errorf("discover events: %w", err) + } + + // 3. Deduplicate + events = p.deduplicate(events) + + // 4. Route and dispatch. + // Track maxUpdatedAt for successfully dispatched and unroutable events. 
+ // Separately track minFailedAt — the earliest UpdatedAt among failed + // dispatches — so the watermark never advances past unprocessed events. + // Also incorporate minSkippedAt from discovery-time note-fetch failures. + dispatched := 0 + var maxUpdatedAt time.Time + var minFailedAt time.Time + failedLabelEvents := make(map[int]map[string]bool) // IID → labels whose dispatch failed + for _, event := range events { + stage := p.routeEvent(ctx, event) + if stage == "" { + if event.UpdatedAt.After(maxUpdatedAt) { + maxUpdatedAt = event.UpdatedAt + } + continue + } + + if err := p.dispatch(ctx, p.owner, p.repo, stage, event); err != nil { + log.Printf("dispatch %s for %s failed: %v", stage, event.Key(), err) + if minFailedAt.IsZero() || event.UpdatedAt.Before(minFailedAt) { + minFailedAt = event.UpdatedAt + } + if event.Type == "issue_label" { + if failedLabelEvents[event.IID] == nil { + failedLabelEvents[event.IID] = make(map[string]bool) + } + for _, label := range event.Labels { + failedLabelEvents[event.IID][label] = true + } + } + continue + } + dispatched++ + if event.UpdatedAt.After(maxUpdatedAt) { + maxUpdatedAt = event.UpdatedAt + } + // Acknowledge slash commands with a reaction so users know the + // command was picked up (avoids blind 5–60 min wait). + if event.NoteID != 0 && strings.HasPrefix(strings.TrimSpace(event.NoteBody), "/fs-") { + _ = p.client.CreateNoteAwardEmoji(ctx, p.owner, p.repo, event.IID, event.NoteID, "eyes") + } + } + + // 5. Update watermark (with 30s overlap for clock skew). + // Only fall back to time.Now() on a truly empty poll (no events + // discovered). When events exist but all dispatches failed, + // maxUpdatedAt stays zero and the watermark is not advanced — + // those events remain in the next poll's lookback window. + // In the mixed success/failure case, cap maxUpdatedAt at minFailedAt + // so the window always includes unprocessed failed events. 
+ if maxUpdatedAt.IsZero() && len(events) == 0 { + maxUpdatedAt = time.Now() + } + if maxUpdatedAt.IsZero() { + log.Printf("WARNING: all %d dispatches failed, watermark not advanced", len(events)) + return nil + } + if !minFailedAt.IsZero() && minFailedAt.Before(maxUpdatedAt) { + maxUpdatedAt = minFailedAt + } + if !minSkippedAt.IsZero() && minSkippedAt.Before(maxUpdatedAt) { + maxUpdatedAt = minSkippedAt + } + newWatermark := maxUpdatedAt.Add(-30 * time.Second) + if err := p.updateWatermark(ctx, p.owner, p.repo, newWatermark); err != nil { + log.Printf("WARNING: failed to update watermark: %v", err) + } + + // 6. Persist label state after dispatch. + // Remove labels from failed dispatches so they remain "unseen" and + // are re-detected on the next poll cycle. + if labelState != nil { + for iid, failedLabels := range failedLabelEvents { + if current, ok := labelState[iid]; ok { + var kept []string + for _, label := range current { + if !failedLabels[label] { + kept = append(kept, label) + } + } + labelState[iid] = kept + } + } + p.persistLabelState(ctx, p.owner, p.repo, labelState) + } + + log.Printf("poll complete: %d events discovered, %d dispatched", len(events), dispatched) + return nil +} +``` + +### Event discovery (`events.go`) + +```go +type RoutableEvent struct { + Type string // "issue_label", "issue_note", "mr_note", "mr_event" + IID int // issue or MR IID + UpdatedAt time.Time + Labels []string // newly-added labels for issue_label; current labels for issue_note + NoteBody string // comment body (for slash command routing) + NoteID int // note ID (for dedup) + NoteAuthorID int // note author user ID (for authorization checks) + IsBot bool // whether the note author is a bot + MRSource int // source project ID (for fork MR protection) + MRTarget int // target project ID (for fork MR protection) +} + +// discoverAllEvents returns: +// - events: all routable events found since the given time +// - labelState: updated label state for persistence (with 
skipped issues restored) +// - minSkippedAt: earliest UpdatedAt among issues skipped due to note-fetch +// failures (zero if none skipped); the caller must cap the watermark at this +// value so skipped events are retried on the next poll +// - error +func (p *Poller) discoverAllEvents(ctx context.Context, owner, repo string, since time.Time) ([]RoutableEvent, LabelState, time.Time, error) { + var events []RoutableEvent + + // 1. Issues updated since last poll + issues, err := p.client.ListIssuesUpdatedSince(ctx, owner, repo, since) + if err != nil { + return nil, nil, time.Time{}, fmt.Errorf("list issues: %w", err) + } + + // Detect newly-added labels (state diff against previous poll). + // On error, abort — continuing with nil newLabels would silently + // drop all label-based events while the watermark advances past them. + // Label state is NOT persisted here — the caller persists after + // dispatch so that failed dispatches are re-detected next poll. + newLabels, updatedLabelState, previousLabels, err := p.detectNewLabels(ctx, owner, repo, issues) + if err != nil { + return nil, nil, time.Time{}, fmt.Errorf("detect new labels: %w", err) + } + + var minSkippedAt time.Time // earliest UpdatedAt among skipped issues + for _, issue := range issues { + // Fetch notes first — if this fails, skip the entire issue + // (including label events) so that neither notes nor labels + // advance maxUpdatedAt past events we couldn't fully discover. + notes, err := p.client.ListIssueNotes(ctx, owner, repo, issue.IID) + if err != nil { + log.Printf("list notes for issue %d: %v (skipping issue entirely)", issue.IID, err) + // Restore this issue's previous label state so its labels + // remain "unseen" — detectNewLabels already marked them as + // seen in updatedLabelState, but we never emitted events. 
+ if prev, ok := previousLabels[issue.IID]; ok { + updatedLabelState[issue.IID] = prev + } else { + delete(updatedLabelState, issue.IID) + } + if minSkippedAt.IsZero() || issue.UpdatedAt.Before(minSkippedAt) { + minSkippedAt = issue.UpdatedAt + } + continue + } + + // Check for label-based triggers — one event per newly-added + // routable label so that multiple labels in the same poll window + // each dispatch independently. + if added, ok := newLabels[issue.IID]; ok { + for _, label := range added { + events = append(events, RoutableEvent{ + Type: "issue_label", + IID: issue.IID, + UpdatedAt: issue.UpdatedAt, + Labels: []string{label}, + }) + } + } + for _, note := range notes { + if note.CreatedAt.Before(since) { + continue // skip old notes (client-side filtering) + } + events = append(events, RoutableEvent{ + Type: "issue_note", + IID: issue.IID, + UpdatedAt: note.CreatedAt, + NoteBody: note.Body, + NoteID: note.ID, + NoteAuthorID: note.Author.ID, + IsBot: note.Author.Bot, + Labels: issue.Labels, + }) + } + } + + // 2. MRs updated since last poll (for MR comment-triggered events only — + // MR open/update/merge are handled by native CI, not the poller). + // A persistent MR API failure must not block issue event processing, + // so we log and continue with issue-only events rather than aborting. 
+ mrs, err := p.client.ListMergeRequestsUpdatedSince(ctx, owner, repo, since) + if err != nil { + log.Printf("list merge requests: %v (continuing with issue events only)", err) + if minSkippedAt.IsZero() || since.Before(minSkippedAt) { + minSkippedAt = since + } + return events, updatedLabelState, minSkippedAt, nil + } + + for _, mr := range mrs { + notes, err := p.client.ListMergeRequestNotes(ctx, owner, repo, mr.IID) + if err != nil { + log.Printf("list notes for MR %d: %v (skipping MR entirely)", mr.IID, err) + if minSkippedAt.IsZero() || mr.UpdatedAt.Before(minSkippedAt) { + minSkippedAt = mr.UpdatedAt + } + continue + } + for _, note := range notes { + if note.CreatedAt.Before(since) { + continue + } + events = append(events, RoutableEvent{ + Type: "mr_note", + IID: mr.IID, + UpdatedAt: note.CreatedAt, + NoteBody: note.Body, + NoteID: note.ID, + NoteAuthorID: note.Author.ID, + IsBot: note.Author.Bot, + MRSource: mr.SourceProjectID, + MRTarget: mr.TargetProjectID, + }) + } + } + + return events, updatedLabelState, minSkippedAt, nil +} + +// isProjectAccessTokenBot detects GitLab project access token bot users. +// GitLab's Events API author object does not include a `bot` field, so +// fast-poll mode uses this username heuristic. Full-poll mode uses the +// Notes API `Author.Bot` field instead (more reliable). This inconsistency +// is accepted: fast-poll only handles slash commands, not changes-requested +// markers, limiting the blast radius of a false negative. +func isProjectAccessTokenBot(username string) bool { + return strings.HasPrefix(username, "project_") && strings.Contains(username, "_bot_") +} + +func (p *Poller) discoverSlashCommands(ctx context.Context, owner, repo string, since time.Time) ([]RoutableEvent, error) { + // Fast-poll mode: use the Events API to find new notes only. + // This avoids querying all issues/MRs — just look for note-type events. 
+ // + // GitLab Events API response fields used: + // evt.Note.NoteableType → "Issue" | "MergeRequest" (mapped to internal event types) + // evt.Note.NoteableIID → issue/MR IID + // evt.Note.Body → comment text (checked for /fs-* prefix) + // evt.Note.ID → note ID + // evt.Author.ID → author user ID (for authorization check) + // evt.Author.Username → username (for bot detection via pattern match) + // evt.CreatedAt → event timestamp + projectEvents, err := p.client.ListProjectEvents(ctx, owner, repo, "Note", since) + if err != nil { + return nil, fmt.Errorf("list note events: %w", err) + } + + var events []RoutableEvent + for _, evt := range projectEvents { + if evt.CreatedAt.Before(since) { + continue // client-side filtering (Events API after= is date-only) + } + // Only include notes that look like slash commands + if !strings.HasPrefix(strings.TrimSpace(evt.Note.Body), "/fs-") { + continue + } + // Normalize NoteableType to internal event type constants. + // GitLab returns capitalized values ("Issue", "MergeRequest"). + var eventType string + switch evt.Note.NoteableType { + case "Issue": + eventType = "issue_note" + case "MergeRequest": + eventType = "mr_note" + default: + continue + } + events = append(events, RoutableEvent{ + Type: eventType, + IID: evt.Note.NoteableIID, + UpdatedAt: evt.CreatedAt, + NoteBody: evt.Note.Body, + NoteID: evt.Note.ID, + NoteAuthorID: evt.Author.ID, + IsBot: isProjectAccessTokenBot(evt.Author.Username), + }) + } + + return events, nil +} +``` + +### Event routing + +```go +func (p *Poller) routeEvent(ctx context.Context, event RoutableEvent) string { + switch event.Type { + case "issue_label": + return p.routeIssueLabel(event) + case "issue_note": + return p.routeIssueNote(ctx, event) + case "mr_note": + return p.routeMRNote(ctx, event) + default: + return "" + } +} + +// routeIssueLabel maps label additions to stages. 
+// No per-user authorization check: adding labels requires Reporter+ +// access in GitLab, which is sufficient authorization for triggering +// agent stages. This is intentionally less restrictive than slash +// commands (which require Developer+) because label management is +// a structured workflow action, not free-form command execution. +func (p *Poller) routeIssueLabel(event RoutableEvent) string { + for _, label := range event.Labels { + switch label { + case "ready-to-code": + return "code" + case "ready-for-review": + return "review" + } + } + return "" +} + +var routableLabels = map[string]bool{ + "ready-to-code": true, + "ready-for-review": true, + "needs-info": true, +} + +func filterRoutableLabels(labels []string) []string { + var out []string + for _, l := range labels { + if routableLabels[l] { + out = append(out, l) + } + } + return out +} + +// commandToken extracts the first whitespace-delimited token from body. +// Used for exact slash command matching — prevents "/fs-fix" from matching +// "/fs-fixed" or "/fs-fixer". +func commandToken(body string) string { + if i := strings.IndexFunc(body, unicode.IsSpace); i > 0 { + return body[:i] + } + return body +} + +func (p *Poller) routeIssueNote(ctx context.Context, event RoutableEvent) string { + if event.IsBot { + return "" // skip bot comments to prevent re-triggering + } + + // Slash commands require Developer-level (30+) access to prevent + // Guest/Reporter users from triggering agent stages. 
+ body := strings.TrimSpace(event.NoteBody) + cmd := commandToken(body) + if strings.HasPrefix(cmd, "/fs-") { + if !p.hasWriteAccess(ctx, event.NoteAuthorID) { + log.Printf("slash command from user %d denied: insufficient permissions", event.NoteAuthorID) + return "" + } + } + switch cmd { + case "/fs-triage": + return "triage" + case "/fs-code": + return "code" + case "/fs-review": + return "review" + case "/fs-fix": + return "fix" + case "/fs-retro": + return "retro" + case "/fs-prioritize": + return "prioritize" + default: + // Unrecognized /fs-* commands are no-ops — don't fall through + // to the needs-info check, which would trigger triage for what + // the user intended as a (non-existent) slash command. + if strings.HasPrefix(cmd, "/fs-") { + return "" + } + // Non-command comment on issue with needs-info label → triage. + // No authorization check: this path is intentionally open to all + // commenters (Guest+). The user is providing information that was + // explicitly requested via the needs-info label, and triage is a + // read-only assessment — it does not modify repository contents. + // This is less restrictive than slash commands (Developer+) because + // the trigger is a structured workflow response, not free-form + // command execution. See Security-Critical Code Path #5. + for _, label := range event.Labels { + if label == "needs-info" { + return "triage" + } + } + return "" + } +} + +func (p *Poller) routeMRNote(ctx context.Context, event RoutableEvent) string { + if event.IsBot { + // Only the enrolled fullsend bot's changes-requested markers + // trigger fix. Verify the author ID matches the project's + // configured bot user to prevent other project access token + // bots from triggering fix stage. 
+ if strings.Contains(event.NoteBody, "") { + if event.NoteAuthorID != p.botUserID { + return "" + } + if p.isForkMR(event) { + return "" // skip fork MRs + } + return "fix" + } + return "" + } + + // Slash commands require Developer-level (30+) access to prevent + // Guest/Reporter users from triggering agent stages. + body := strings.TrimSpace(event.NoteBody) + cmd := commandToken(body) + if strings.HasPrefix(cmd, "/fs-") { + if !p.hasWriteAccess(ctx, event.NoteAuthorID) { + log.Printf("slash command from user %d denied: insufficient permissions", event.NoteAuthorID) + return "" + } + } + var stage string + switch cmd { + case "/fs-triage": + stage = "triage" + case "/fs-code": + stage = "code" + case "/fs-review": + stage = "review" + case "/fs-fix": + stage = "fix" + case "/fs-retro": + stage = "retro" + case "/fs-prioritize": + stage = "prioritize" + default: + return "" + } + + // Fork MR protection: deny fix/code on fork MRs (or when fork + // status is unknown, e.g. fast-poll path where MRSource/MRTarget + // are not populated). + if (stage == "fix" || stage == "code") && p.isForkMR(event) { + return "" + } + return stage +} + +// isForkMR returns true if the MR is a fork (source != target) OR if +// fork status is unknown (zero-valued fields). Deny-by-default: when +// the fast-poll path omits MRSource/MRTarget, fork-sensitive stages +// (fix, code) are blocked rather than silently allowed. +func (p *Poller) isForkMR(event RoutableEvent) bool { + if event.MRSource == 0 || event.MRTarget == 0 { + return true // unknown — deny by default + } + return event.MRSource != event.MRTarget +} +``` + +### Authorization + +```go +// hasWriteAccess checks whether a user has Developer-level (30+) access +// to the project. Results are cached per poll cycle. 
+func (p *Poller) hasWriteAccess(ctx context.Context, userID int) bool { + if access, ok := p.accessCache[userID]; ok { + return access >= 30 // Developer = 30, Maintainer = 40, Owner = 50 + } + + // Use /members/all/ to include inherited group members, not just + // direct project members. + member, err := p.client.GetProjectMemberAll(ctx, p.owner, p.repo, userID) + if err != nil { + log.Printf("check member access for user %d: %v (denying)", userID, err) + p.accessCache[userID] = 0 + return false + } + p.accessCache[userID] = member.AccessLevel + return member.AccessLevel >= 30 +} +``` + +### Deduplication + +```go +func (p *Poller) deduplicate(events []RoutableEvent) []RoutableEvent { + seen := make(map[string]bool) + var unique []RoutableEvent + + for _, event := range events { + key := event.Key() + if seen[key] { + continue + } + seen[key] = true + unique = append(unique, event) + } + + return unique +} + +func (e RoutableEvent) Key() string { + if e.NoteID != 0 { + return fmt.Sprintf("note-%d", e.NoteID) + } + return fmt.Sprintf("%s-%d-%s", e.Type, e.IID, strings.Join(e.Labels, ",")) +} +``` + +### Label state tracking + +The poller needs to distinguish "label was just added" from "label was already present". Since polling sees only current state (no `changes` object like webhook payloads provide), label change detection is implemented client-side via state comparison. + +**Approach**: Store the set of previously-seen labels per issue in a CI/CD variable (`FULLSEND_LABEL_STATE`), encoded as JSON. On each poll, diff current labels against stored state. Only newly-appearing labels trigger routing. 
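To make the state comparison concrete, here is a minimal, self-contained sketch of the diff for a single issue. The helper name `newlyAdded` is hypothetical and not part of the plan; it only illustrates the set comparison, while state loading, routable-label filtering, and pruning are left to the plan's own `detectNewLabels`.

```go
package main

import "fmt"

// newlyAdded returns the labels present in current but absent from previous,
// preserving the order of current. Hypothetical helper for illustration only.
func newlyAdded(previous, current []string) []string {
	seen := make(map[string]bool, len(previous))
	for _, label := range previous {
		seen[label] = true
	}
	var added []string
	for _, label := range current {
		if !seen[label] {
			added = append(added, label)
			seen[label] = true // avoid reporting a duplicate in current twice
		}
	}
	return added
}

func main() {
	previous := []string{"needs-info"}                 // stored state from the last poll
	current := []string{"needs-info", "ready-to-code"} // labels seen in this poll
	fmt.Println(newlyAdded(previous, current))         // prints [ready-to-code]
}
```

A label that was already stored produces no event, so an issue that merely reappears in the `updated_after` window with unchanged labels does not trigger routing again.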
+ +```go +type LabelState map[int][]string // issue IID → label list + +// detectNewLabels returns: +// - newLabels: map of issue IID → newly-added labels +// - updatedState: label state with all current labels marked as "seen" +// - previousLabels: snapshot of each issue's previous labels (before update), +// so the caller can restore entries for issues that couldn't be fully processed +// - error +func (p *Poller) detectNewLabels(ctx context.Context, owner, repo string, issues []Issue) (map[int][]string, LabelState, map[int][]string, error) { + // Read stored state + stateJSON, err := p.client.GetVariable(ctx, owner, repo, "FULLSEND_LABEL_STATE") + if err != nil { + if errors.Is(err, forge.ErrNotFound) { + stateJSON = "{}" // first run — all labels are "new" + } else { + return nil, nil, nil, fmt.Errorf("read label state: %w", err) + } + } + + var previousState LabelState + if err := json.Unmarshal([]byte(stateJSON), &previousState); err != nil { + // Graceful degradation: if stored JSON is corrupt or truncated + // (e.g., exceeding GitLab's 10,000-char variable limit), fall back + // to empty state — all current labels will be treated as "new," + // causing duplicate dispatches mitigated by resource_group. + log.Warn("unmarshal label state failed, resetting to empty", "error", err) + previousState = make(LabelState) + } + + newLabels := make(map[int][]string) + previousLabels := make(map[int][]string) // snapshot before update + + // Merge into previous state rather than replacing — only update entries + // for issues present in the current poll, retaining entries for issues + // not in the current result set. This prevents spurious "new label" + // detections when a previously-tracked issue reappears after being + // absent from the updated_after window. 
+ for _, issue := range issues { + prev := previousState[issue.IID] + previousLabels[issue.IID] = prev // snapshot for rollback + prevSet := toSet(prev) + + // Only track fullsend-routable labels to keep state bounded + // within GitLab's 10,000-character CI/CD variable limit. + routable := filterRoutableLabels(issue.Labels) + for _, label := range routable { + if !prevSet[label] { + newLabels[issue.IID] = append(newLabels[issue.IID], label) + } + } + + // Update this issue's entry with only routable labels + previousState[issue.IID] = routable + } + + // Prune closed issues to keep state bounded. + // Skip IIDs in the current poll set — their state was just updated + // and should not be pruned even if newly closed. + polledIIDs := make(map[int]bool, len(issues)) + for _, issue := range issues { + polledIIDs[issue.IID] = true + } + for iid := range previousState { + if !polledIIDs[iid] && p.isIssueClosed(ctx, owner, repo, iid) { + delete(previousState, iid) + } + } + + // Return newLabels, updated state, and previous labels WITHOUT persisting. + // The caller filters out labels from failed dispatches and restores + // entries for skipped issues before persisting. + return newLabels, previousState, previousLabels, nil +} + +func (p *Poller) persistLabelState(ctx context.Context, owner, repo string, state LabelState) { + stateBytes, err := json.Marshal(state) + if err != nil { + log.Printf("WARNING: failed to marshal label state: %v", err) + return + } + if err := p.client.UpdateVariable(ctx, owner, repo, "FULLSEND_LABEL_STATE", string(stateBytes)); err != nil { + log.Printf("WARNING: failed to persist label state: %v", err) + } +} +``` + +**CI/CD variable size limit**: GitLab CI/CD variables have a 10,000-character limit. For projects with many issues, the label state JSON may exceed this. Mitigation: only track issues with fullsend-relevant labels (`fullsend:*`), and prune entries for closed issues on each poll. 
If the state exceeds the limit, fall back to treating all matching labels as "new" (which may cause duplicate dispatches, handled by `resource_group` concurrency control). + +### Watermark state management (`state.go`) + +```go +func (p *Poller) readWatermark(ctx context.Context, owner, repo string) (time.Time, error) { + varName := p.watermarkVarName() + value, err := p.client.GetVariable(ctx, owner, repo, varName) + if err != nil { + if errors.Is(err, forge.ErrNotFound) { + return time.Now().Add(-1 * time.Hour), nil + } + return time.Time{}, fmt.Errorf("read watermark %s: %w", varName, err) + } + return time.Parse(time.RFC3339, value) +} + +func (p *Poller) watermarkVarName() string { + if p.opts.SlashCommandsOnly { + return "FULLSEND_LAST_POLL_AT_FAST" + } + return "FULLSEND_LAST_POLL_AT_FULL" +} + +func (p *Poller) updateWatermark(ctx context.Context, owner, repo string, t time.Time) error { + return p.client.UpdateVariable(ctx, owner, repo, p.watermarkVarName(), t.Format(time.RFC3339)) +} +``` + +### Child pipeline dispatch (`dispatch.go`) + +The poller dispatches agent stages by generating a child pipeline YAML file. The parent pipeline (poll.yml) uses `trigger: include: artifact:` to start child pipelines from the generated YAML. This keeps everything within GitLab's native pipeline hierarchy without requiring trigger tokens. + +**Retry coverage boundary**: The watermark and label-state retry mechanisms (steps 5–6 in the poll loop) protect against poll-time failures — specifically, file I/O errors when writing `dispatches.json` via `appendDispatch`. They do NOT cover child pipeline runtime failures (agent crash, credential issue, transient API error), because the watermark advances as soon as the poll job completes successfully, before child pipelines execute. 
For child pipeline failures, the retry strategy is: (1) GitLab's native `retry:` keyword on child pipeline jobs for transient errors, (2) manual re-trigger via the GitLab UI or `/fs-*` slash command for persistent failures, (3) `resource_group` concurrency control ensures re-triggered stages don't conflict with in-progress runs. + +```go +type Dispatch struct { + Stage string `json:"stage"` + EventType string `json:"event_type"` + EventPayloadB64 string `json:"event_payload_b64"` + ResourceKey string `json:"resource_key"` +} + +func (p *Poller) dispatch(ctx context.Context, owner, repo, stage string, event RoutableEvent) error { + // Build minimal event payload + payload := p.buildEventPayload(event) + payloadB64 := base64.StdEncoding.EncodeToString(payload) + + dispatch := Dispatch{ + Stage: stage, + EventType: event.Type, + EventPayloadB64: payloadB64, + ResourceKey: fmt.Sprintf("%s-%d", event.Type, event.IID), + } + + // Append to dispatches list. The --output flag writes all accumulated + // dispatches as a JSON array (not NDJSON) so that downstream jq + // commands like `jq 'length'` work correctly. 
+ if err := p.appendDispatch(dispatch); err != nil { + return fmt.Errorf("append dispatch: %w", err) + } + return nil +} +``` + +**Child pipeline YAML generation:** + +```go +func (p *Poller) generateChildPipelineYAML(dispatches []Dispatch) string { + var buf bytes.Buffer + for i, d := range dispatches { + fmt.Fprintf(&buf, "agent-%d:\n", i) + fmt.Fprintf(&buf, " trigger:\n") + fmt.Fprintf(&buf, " include: .gitlab/ci/fullsend-%s.yml\n", d.Stage) + fmt.Fprintf(&buf, " strategy: depend\n") + fmt.Fprintf(&buf, " variables:\n") + fmt.Fprintf(&buf, " STAGE: %q\n", d.Stage) + fmt.Fprintf(&buf, " EVENT_TYPE: %q\n", d.EventType) + fmt.Fprintf(&buf, " EVENT_PAYLOAD_B64: %q\n", d.EventPayloadB64) + fmt.Fprintf(&buf, " RESOURCE_KEY: %q\n", d.ResourceKey) + fmt.Fprintf(&buf, " rules:\n") + fmt.Fprintf(&buf, " - when: always\n") + } + return buf.String() +} +``` + +### Files + +| Action | Path | +|--------|------| +| Create | `internal/poll/poll.go` (~300 lines) | +| Create | `internal/poll/poll_test.go` | +| Create | `internal/poll/events.go` (~250 lines) | +| Create | `internal/poll/events_test.go` | +| Create | `internal/poll/dispatch.go` (~150 lines) | +| Create | `internal/poll/state.go` (~80 lines) | +| Modify | `internal/cli/root.go` — add `poll` subcommand | + +## Phase 3: GitLab CI/CD Templates + +**Goal**: Create pipeline YAML templates that are committed to enrolled projects during install. 
+ +### Directory structure + +``` +internal/scaffold/fullsend-repo-gitlab/ +├── .gitlab-ci.yml +├── .gitlab/ +│ └── ci/ +│ ├── fullsend-dispatch.yml ← MR event routing (native CI path) +│ ├── fullsend-poll.yml ← cron poller (scheduled pipeline) +│ ├── fullsend-triage.yml +│ ├── fullsend-code.yml +│ ├── fullsend-review.yml +│ ├── fullsend-fix.yml +│ ├── fullsend-retro.yml +│ └── fullsend-prioritize.yml +└── .fullsend/ + ├── config.yaml + └── customized/ + ├── agents/.gitkeep + ├── harness/.gitkeep + ├── policies/.gitkeep + ├── skills/.gitkeep + └── scripts/.gitkeep +``` + +### Root pipeline (`.gitlab-ci.yml`) + +```yaml +include: + - local: '.gitlab/ci/fullsend-dispatch.yml' + rules: + - if: $CI_PIPELINE_SOURCE == "merge_request_event" + - local: '.gitlab/ci/fullsend-poll.yml' + rules: + - if: $CI_PIPELINE_SOURCE == "schedule" + - local: '.gitlab/ci/fullsend-triage.yml' + - local: '.gitlab/ci/fullsend-code.yml' + - local: '.gitlab/ci/fullsend-review.yml' + - local: '.gitlab/ci/fullsend-fix.yml' + - local: '.gitlab/ci/fullsend-retro.yml' + - local: '.gitlab/ci/fullsend-prioritize.yml' + +stages: + - dispatch + - poll + - generate + - agent + +workflow: + rules: + # Native MR events (review, retro) + - if: $CI_PIPELINE_SOURCE == "merge_request_event" + # Scheduled polling (triage, code, slash commands) + - if: $CI_PIPELINE_SOURCE == "schedule" && $CI_COMMIT_REF_PROTECTED == "true" + # Child pipelines dispatched by the poller + - if: $CI_PIPELINE_SOURCE == "parent_pipeline" +``` + +### MR dispatch (`.gitlab/ci/fullsend-dispatch.yml`) + +Handles native MR events — routes `merge_request_event` pipelines to the appropriate agent stage: + +```yaml +# fullsend-stage: dispatch (MR events only) + +dispatch: + stage: dispatch + image: ghcr.io/fullsend-ai/fullsend-sandbox:latest + rules: + - if: $CI_PIPELINE_SOURCE == "merge_request_event" + script: + - | + set -euo pipefail + + # CI_DEBUG_TRACE guard + if [ "${CI_DEBUG_TRACE:-}" = "true" ]; then + echo "ERROR: CI_DEBUG_TRACE 
enabled — aborting to protect secrets" + exit 1 + fi + + # GitLab has no CI_MERGE_REQUEST_EVENT_TYPE predefined variable. + # Determine the MR action by querying its state via the API. + MR_STATE=$(curl -sf "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}" \ + -H "JOB-TOKEN: ${CI_JOB_TOKEN}" | jq -r '.state') + + case "${MR_STATE}" in + merged) + echo "STAGE=retro" >> dispatch.env + echo "RESOURCE_KEY=mr-${CI_MERGE_REQUEST_IID}" >> dispatch.env + ;; + opened) + echo "STAGE=review" >> dispatch.env + echo "RESOURCE_KEY=mr-${CI_MERGE_REQUEST_IID}" >> dispatch.env + ;; + *) + echo "Unhandled MR state: ${MR_STATE}" + touch dispatch.env + exit 0 + ;; + esac + artifacts: + reports: + dotenv: dispatch.env +``` + +### Cron poller pipeline (`.gitlab/ci/fullsend-poll.yml`) + +```yaml +# fullsend-stage: poll + +poll-events: + stage: poll + image: ghcr.io/fullsend-ai/fullsend-sandbox:latest + resource_group: fullsend-poll + rules: + - if: $CI_PIPELINE_SOURCE == "schedule" && $CI_COMMIT_REF_PROTECTED == "true" + id_tokens: + FULLSEND_ID_TOKEN: + aud: "fullsend" + variables: + FULLSEND_FORGE: "gitlab" + script: + - | + set -euo pipefail + + # CI_DEBUG_TRACE guard — critical in variable mode (sole defense + # against PAT exposure at job init), defense-in-depth in WIF mode. 
+ if [ "${CI_DEBUG_TRACE:-}" = "true" ]; then + echo "ERROR: CI_DEBUG_TRACE enabled — aborting to protect secrets" + exit 1 + fi + + # Credential retrieval — mode selected at install time + if [ "${FULLSEND_CREDENTIAL_MODE}" = "wif" ]; then + # WIF mode: exchange OIDC token for GCP credentials, then + # retrieve bot PAT from Secret Manager + gcloud auth login --cred-file=<(cat < child-pipeline.yml + exit 0 + fi + + # Generate child pipeline YAML from dispatches + fullsend poll generate-child-pipeline \ + --dispatches dispatches.json \ + --output child-pipeline.yml + artifacts: + paths: + - child-pipeline.yml + expire_in: 1 hour + +# Trigger child pipelines for each dispatched event +dispatch-agents: + stage: agent + rules: + - if: $CI_PIPELINE_SOURCE == "schedule" && $CI_COMMIT_REF_PROTECTED == "true" + needs: + - job: generate-child-pipelines + artifacts: true + trigger: + include: + - artifact: child-pipeline.yml + job: generate-child-pipelines + strategy: depend +``` + +### Stage pipeline template (`.gitlab/ci/fullsend-code.yml`) + +All stages use the same credential retrieval flow (WIF or variable mode). Events arrive via parent pipeline variables (from the poller's child pipeline) or via native MR event dispatch. 
+ +```yaml +# fullsend-stage: code + +code: + stage: agent + image: ghcr.io/fullsend-ai/fullsend-code:latest + id_tokens: + FULLSEND_ID_TOKEN: + aud: "fullsend" + variables: + FULLSEND_FORGE: "gitlab" + script: + - | + set -euo pipefail + + # CI_DEBUG_TRACE guard + if [[ "${CI_DEBUG_TRACE:-}" == "true" ]]; then + echo "ERROR: CI_DEBUG_TRACE enabled — aborting to protect secrets" + exit 1 + fi + + # Credential retrieval — mode selected at install time + if [ "${FULLSEND_CREDENTIAL_MODE}" = "wif" ]; then + gcloud auth login --cred-file=<(cat < "${EVENT_PAYLOAD_FILE}" + + # Prepare workspace (layered content resolution) + fullsend workspace prepare \ + --forge gitlab \ + --root .fullsend + + # Run the agent + fullsend run \ + --stage code \ + --source-project "${CI_PROJECT_PATH}" \ + --event-type "${EVENT_TYPE}" \ + --event-payload-file "${EVENT_PAYLOAD_FILE}" \ + --forge gitlab \ + --fullsend-dir .fullsend + resource_group: "fullsend-code-${RESOURCE_KEY}" + rules: + - if: $STAGE == "code" +``` + +### Stage-specific notes + +**fix**: Adds fork MR protection: +```yaml + - | + # Fork MR protection + SOURCE_PROJECT=$(echo "${EVENT_PAYLOAD_B64}" | base64 -d | jq -r '.mr_source_project_id // empty') + TARGET_PROJECT=$(echo "${EVENT_PAYLOAD_B64}" | base64 -d | jq -r '.mr_target_project_id // empty') + if [ -n "${SOURCE_PROJECT}" ] && [ -n "${TARGET_PROJECT}" ] && [ "${SOURCE_PROJECT}" != "${TARGET_PROJECT}" ]; then + echo "Fork MR detected — skipping fix stage" + exit 0 + fi +``` + +**review** (via native MR event): When triggered by `merge_request_event`, `CI_MERGE_REQUEST_IID` and other MR variables are available directly from GitLab — no event payload decoding needed. 
The stage template detects the source and adapts: +```yaml + - | + if [ "${CI_PIPELINE_SOURCE}" = "merge_request_event" ]; then + # Native MR event — build payload from CI variables + # Query MR state since GitLab has no CI_MERGE_REQUEST_EVENT_TYPE variable + MR_STATE=$(curl -sf "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/merge_requests/${CI_MERGE_REQUEST_IID}" \ + -H "JOB-TOKEN: ${CI_JOB_TOKEN}" | jq -r '.state') + EVENT_PAYLOAD_FILE=$(mktemp) + trap 'rm -f "${EVENT_PAYLOAD_FILE}"' EXIT + jq -n \ + --arg iid "${CI_MERGE_REQUEST_IID}" \ + --arg state "${MR_STATE}" \ + --arg source "${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME}" \ + --arg target "${CI_MERGE_REQUEST_TARGET_BRANCH_NAME}" \ + '{iid: ($iid|tonumber), state: $state, source_branch: $source, target_branch: $target}' \ + > "${EVENT_PAYLOAD_FILE}" + EVENT_TYPE="merge_request" + else + # Polled event — decode from base64 variable + EVENT_PAYLOAD_FILE=$(mktemp) + trap 'rm -f "${EVENT_PAYLOAD_FILE}"' EXIT + echo "${EVENT_PAYLOAD_B64}" | base64 -d > "${EVENT_PAYLOAD_FILE}" + fi +``` + +### Files + +| Action | Path | +|--------|------| +| Create | `internal/scaffold/fullsend-repo-gitlab/` (entire tree) | +| Modify | `internal/scaffold/scaffold.go` — add `GitLabPerRepoScaffold()` function | + +## Phase 4: CLI Changes + +**Goal**: `fullsend admin install group/project --forge gitlab` works end-to-end. 
+ +### New flags + +On `fullsend admin install`: +- `--forge {github|gitlab}` — auto-detected from remote URL, overridable +- `--gitlab-url` — GitLab instance URL (default: `https://gitlab.com`) +- `--poll-interval` — cron schedule for polling (default: auto-detect from tier) +- `--skip-schedule-create` — skip pipeline schedule creation (for externally managed schedules) + +### Token resolution + +```go +func resolveGitLabToken() (string, error) { + if token := os.Getenv("GL_TOKEN"); token != "" { + return token, nil + } + if token := os.Getenv("GITLAB_TOKEN"); token != "" { + return token, nil + } + out, err := exec.Command("glab", "auth", "token").Output() + if err == nil { + token := strings.TrimSpace(string(out)) + if token != "" { + return token, nil + } + } + return "", fmt.Errorf("no GitLab token found: set GL_TOKEN, GITLAB_TOKEN, or run 'glab auth login'") +} +``` + +### Per-repo enforcement + +`fullsend admin install testgroup --forge gitlab` returns an error: "GitLab installation supports per-repo mode only. Provide a group/project path." + +### GitLab per-repo install flow + +```go +func runGitLabPerRepoInstall(ctx context.Context, target string, opts installOpts) error { + // 1. Parse group/project + owner, repo := splitOwnerRepo(target) + + // 2. Resolve token + token, err := resolveGitLabToken() + + // 3. Create forge client (admin token for setup operations) + client, err := gitlab.New(token, opts.gitlabURL) + + // 4. Validate project + project, err := client.GetRepo(ctx, owner, repo) + // Check user has Maintainer access + // Check default branch exists + + // 5. Validate default branch is protected + protected, err := client.IsProtectedBranch(ctx, owner, repo, project.DefaultBranch) + if !protected { + return fmt.Errorf("default branch %q is not protected — protect it before installing fullsend", project.DefaultBranch) + } + + // 6. Check CI_DEBUG_TRACE is not enabled at project or group level. 
+ // GET /projects/:id/variables/CI_DEBUG_TRACE — if exists and value == "true", fail. + // Also check group-level: GET /groups/:id/variables for each ancestor group. + // In variable mode, the script-level guard cannot prevent PAT exposure + // because GitLab logs CI/CD variables at job init before any script runs. + // Document that a Maintainer re-adding CI_DEBUG_TRACE after install (at + // any level) bypasses the guard in variable mode. + + // 7. Create Project Access Token (Developer, api scope) + // POST /projects/:id/access_tokens + botPAT := createProjectAccessToken(ctx, client, owner, repo) + + // 8. Store bot PAT — mode depends on --gcp-project flag + credentialMode := "variable" // default: no GCP required + if opts.gcpProject != "" { + credentialMode = "wif" + // Store PAT in GCP Secret Manager + storePATInSecretManager(ctx, opts.gcpProject, owner, repo, botPAT) + } else { + // Store PAT as a protected, masked CI/CD variable + client.CreateRepoSecret(ctx, owner, repo, "FULLSEND_FORGE_TOKEN", botPAT) + maintainerCount := countMaintainers(ctx, client, owner, repo) + if maintainerCount > 1 { + log.Warn("Variable mode selected with %d Maintainers. Any Maintainer can "+ + "enable CI_DEBUG_TRACE after install, exposing the bot PAT in job logs. "+ + "Consider using --gcp-project for WIF mode instead.", maintainerCount) + } + } + + // 9. Detect GitLab tier for poll interval configuration + tier := detectGitLabTier(ctx, client, owner, repo) + pollInterval := determinePollInterval(tier, opts.pollInterval) + // Free tier: "0 * * * *" (hourly) + // Premium+: "*/5 * * * *" (every 5 minutes) + + // 10. 
Create pipeline schedule(s) + if !opts.skipScheduleCreate { + if tier == "premium" || tier == "ultimate" { + // Fast poll: every 5 minutes, slash commands only + client.CreatePipelineSchedule(ctx, owner, repo, project.DefaultBranch, + "fullsend fast poll", "*/5 * * * *", + map[string]string{"FULLSEND_POLL_MODE": "fast"}) + // Slow poll: every 15 minutes, full event scan + client.CreatePipelineSchedule(ctx, owner, repo, project.DefaultBranch, + "fullsend full poll", "*/15 * * * *", + map[string]string{"FULLSEND_POLL_MODE": "full"}) + } else { + // Free tier: single hourly poll + client.CreatePipelineSchedule(ctx, owner, repo, project.DefaultBranch, + "fullsend poll", "0 * * * *", nil) + } + } + + // 11. Commit CI/CD template files + scaffoldFiles := scaffold.GitLabPerRepoScaffold() + client.CommitFilesToBranch(ctx, owner, repo, project.DefaultBranch, + "chore: add fullsend CI/CD pipeline", scaffoldFiles) + + // 12. Set protected CI/CD variables. + // Use CreateProtectedVariable (Protected: true, Masked: false) for + // configuration identifiers — CreateRepoSecret (Protected + Masked) + // requires values >= 8 characters (e.g. "wif" would fail) and masks + // GCP resource names in logs, hindering debugging. + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_CREDENTIAL_MODE", credentialMode) + if credentialMode == "wif" { + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_WIF_PROVIDER", wifProviderResourceName) + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_SA", serviceAccountEmail) + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_BOT_TOKEN_SECRET", secretManagerSecretName) + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_GCP_PROJECT_ID", opts.gcpProject) + } + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_FORGE", "gitlab") + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_PER_REPO_INSTALL", "true") + + // 13. 
Initialize poll watermarks (protected — must not be accessible + // to pipelines on non-protected branches to prevent tampering). + // Separate watermarks for fast-poll (slash commands only) and + // full-poll (all events) to prevent fast polls from advancing + // the watermark past unprocessed label/note events. + initTime := time.Now().Format(time.RFC3339) + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_LAST_POLL_AT_FAST", initTime) + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_LAST_POLL_AT_FULL", initTime) + client.CreateProtectedVariable(ctx, owner, repo, "FULLSEND_LABEL_STATE", "{}") + + // 14. Set up inference WIF (if --inference-project provided) + + // 15. Print CI minute warning for shared runners + if tier == "free" { + log.Warn("Free tier detected. Polling will consume CI minutes on shared runners. " + + "Consider using self-hosted runners. See ADR 0063 for details.") + } +} +``` + +### Tier detection + +```go +func detectGitLabTier(ctx context.Context, client *gitlab.LiveClient, owner, repo string) string { + // Try to create a test pipeline schedule with 5-min interval. + // If it fails with "is too frequent", we're on Free tier. + // This is a heuristic — GitLab doesn't expose the tier via API. + // + // Alternative: check if project access tokens are available + // (Premium+ on gitlab.com, all tiers on self-managed). + // + // For self-managed instances, assume Premium capabilities + // (admins can configure any schedule interval). +} +``` + +### Uninstall flow + +```go +func runGitLabPerRepoUninstall(ctx context.Context, target string, opts uninstallOpts) error { + owner, repo := splitOwnerRepo(target) + + // 1. Delete pipeline schedules + schedules, _ := client.ListPipelineSchedules(ctx, owner, repo) + for _, s := range schedules { + if strings.HasPrefix(s.Description, "fullsend") { + client.DeletePipelineSchedule(ctx, owner, repo, s.ID) + } + } + + // 2. Revoke project access token + // 3. 
Clean up credential storage (mode-dependent) + // - WIF mode: delete Secret Manager secret, remove WIF attribute condition + // - Variable mode: delete FULLSEND_FORGE_TOKEN CI/CD variable + // 4. Remove CI/CD template files + // 5. Remove CI/CD variables (FULLSEND_LAST_POLL_AT_FAST, FULLSEND_LAST_POLL_AT_FULL, + // FULLSEND_LABEL_STATE, FULLSEND_CREDENTIAL_MODE, FULLSEND_FORGE, + // FULLSEND_PER_REPO_INSTALL) +} +``` + +### Files + +| Action | Path | +|--------|------| +| Modify | `internal/cli/admin.go` — add flags, `runGitLabPerRepoInstall()`, token resolution | +| Modify | `internal/cli/root.go` — add `poll` subcommand | +| Create | `internal/cli/poll.go` — `fullsend poll` command | +| Modify | `internal/config/config.go` — add `Forge` field, validation | + +## Phase 5: Integration and Testing + +### Integration wiring + +- `fullsend run --forge gitlab` constructs a GitLab forge client with bot PAT from `FULLSEND_FORGE_TOKEN` +- `fullsend poll --forge gitlab` runs the polling loop +- Config schema accepts `forge: gitlab` in `config.yaml` +- Forge detection integrated into CLI argument parsing + +### Unit tests + +| Component | Test focus | +|-----------|-----------| +| GitLab forge client | Mock HTTP responses via `httptest.NewServer`. Cover: MR creation, comment posting, label operations. Review synthesis from notes + approvals. Error handling. Subgroup paths. Polling query methods (`ListIssuesUpdatedSince`, etc.). | +| Poller | Event discovery with mock API responses. Slash command detection. Label state diffing. Event routing. Deduplication. Watermark management. Fast-poll vs full-poll modes. | +| Forge detection | GitHub URL → `"github"`. GitLab URL → `"gitlab"`. SSH remote → error. Self-hosted → error with flag suggestion. `--forge` override. | +| CLI | GitLab argument parsing. Per-repo enforcement for GitLab. Token resolution chain. Poll interval selection by tier. | +| Config | `forge: gitlab` validation. Unknown forge rejection. 
| + +### Integration tests + +Mock GitLab API → poller → child pipeline generation: +1. Poller discovers new issue with `ready-to-code` label → dispatches code stage +2. Poller discovers `/fs-triage` comment → dispatches triage stage +3. Poller discovers MR comment with changes-requested marker (same project) → dispatches fix stage +4. Poller discovers MR comment with changes-requested marker (fork MR) → skips fix stage +5. Poller skips bot-authored comments → no dispatch +6. Poller handles empty poll (no events since last watermark) → no dispatch, watermark advances to current time +7. Poller deduplicates events across overlapping windows → single dispatch +8. Label state tracking: newly-added label triggers dispatch, pre-existing label does not +9. Full install flow with mock GitLab API (no real GitLab instance) + +### E2E tests + +Against GitLab.com: +1. Create a test project +2. Run `fullsend admin install group/project --forge gitlab` +3. Verify pipeline schedule(s) created with correct intervals +4. Create an issue with `/fs-triage` comment +5. Wait for next poll cycle → verify triage pipeline fires and triage agent runs +6. Add `ready-to-code` label to issue +7. Wait for next poll cycle → verify code pipeline fires and code agent creates MR +8. Verify review pipeline fires immediately on MR open (native CI path) +9. Run `fullsend admin uninstall group/project --forge gitlab` +10. Verify cleanup (schedule deleted, project access token revoked, variables deleted) + +Self-hosted testing: Docker-based GitLab CE instance for version compatibility testing. Minimum GitLab version: 17.0+ (stable trigger API, CI/CD variable protection, pipeline schedules). 
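
The mock-API integration tests listed above could be built on `httptest.NewServer`. The sketch below is one possible shape, under stated assumptions: the project ID `1`, the `issue` struct, and the free function `listIssuesUpdatedSince` are hypothetical stand-ins (the real forge client method is not shown in this plan), and the mock rejects requests without `updated_after` only so the test can confirm the poller sends its watermark.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// issue is a hypothetical subset of the GitLab issue JSON.
type issue struct {
	IID    int      `json:"iid"`
	Labels []string `json:"labels"`
}

// newMockGitLab serves a canned issue list on the project issues endpoint.
// Requiring updated_after is a test-only assumption, not GitLab behavior.
func newMockGitLab(issues []issue) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v4/projects/1/issues", func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Query().Get("updated_after") == "" {
			http.Error(w, "updated_after required", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(issues)
	})
	return httptest.NewServer(mux)
}

// listIssuesUpdatedSince is a hypothetical stand-in for the poller's query.
func listIssuesUpdatedSince(baseURL, since string) ([]issue, error) {
	q := url.Values{"updated_after": {since}}
	resp, err := http.Get(baseURL + "/api/v4/projects/1/issues?" + q.Encode())
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var out []issue
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	srv := newMockGitLab([]issue{{IID: 42, Labels: []string{"ready-to-code"}}})
	defer srv.Close()

	issues, err := listIssuesUpdatedSince(srv.URL, "2026-07-03T00:00:00Z")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(issues), issues[0].IID, issues[0].Labels)
}
```

Because the server is local and in-process, this style of test needs no real GitLab instance, which is the property integration test 9 relies on.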
+ +### FakeClient updates + +Add implementations to `internal/forge/fake.go` for: +- `IsProtectedBranch` — configurable return value +- `CreatePipelineSchedule` — record call, return fake schedule ID +- `DeletePipelineSchedule` — record call +- `UpdateVariable` — record call + +## Security-Critical Code Paths + +These paths require extra review attention. A bug here is a security vulnerability, not just a functional failure. + +### 1. Pipeline schedule targets protected default branch only + +**File**: `internal/cli/admin.go` (install flow), `.gitlab-ci.yml` (workflow rules) + +The pipeline schedule MUST target the protected default branch. The `workflow:rules` enforce `$CI_COMMIT_REF_PROTECTED == "true"` for scheduled pipelines. + +```yaml +# CORRECT — schedule always targets default branch +workflow: + rules: + - if: $CI_PIPELINE_SOURCE == "schedule" && $CI_COMMIT_REF_PROTECTED == "true" +``` + +**Consequence of bug**: Pipeline runs on a non-protected branch. In WIF mode, WIF attribute conditions (requiring `ref_protected == "true"`) provide defense-in-depth — the OIDC token exchange fails. In variable mode, protected variable status prevents exposure on non-protected branches. + +### 2. Protected variable creation + +**File**: `internal/forge/gitlab/gitlab.go` + +When creating CI/CD variables for secrets, the `Protected` flag MUST be `true`. Protected variables are only exposed to pipelines running on protected branches. + +**Consequence of bug**: Any pipeline (including on MR branches with attacker-modified `.gitlab-ci.yml`) can see credentials. In WIF mode, this exposes WIF configuration (OIDC token replay within ~5 minute TTL). In variable mode, this directly exposes the bot PAT. + +### 3. `CI_DEBUG_TRACE` guard + +**Files**: All CI/CD template YAML files, `internal/cli/admin.go` + +Every stage pipeline must exit early if debug tracing is detected. This prevents credential leakage through verbose job logs. 
**In variable mode, this guard is the sole defense** — GitLab logs all CI/CD variables at job initialization, before any script runs. In WIF mode, the guard is defense-in-depth — even if bypassed, the PAT is not in a CI/CD variable and is retrieved after the guard runs. + +### 4. Fork MR blocking + +**File**: `internal/poll/events.go` (poller routing), `.gitlab/ci/fullsend-fix.yml` (pipeline template) + +Fork MR protection in three places: +- `isForkMR` helper denies when `source_project_id != target_project_id` +- `isForkMR` also denies when source/target are unknown (zero-valued) — this covers the fast-poll path where MR details are not fetched, ensuring deny-by-default +- Fix pipeline template checks `source_project_id != target_project_id` (defense-in-depth) + +**Consequence of bug**: Fork MR triggers fix/code pipeline that pushes commits to the target project. + +### 5. Slash command authorization + +**File**: `internal/poll/events.go` + +The poller MUST verify that slash command authors have Developer-level (30+) project access before dispatching agent stages. The `hasWriteAccess` method queries the GitLab Members API and caches results per poll cycle. Without this check, Guest/Reporter users could post `/fs-code` or `/fs-fix` commands to trigger agent stages. + +**Exception — needs-info triage**: Comments on issues with the `needs-info` label trigger triage without an authorization check (Guest+ can trigger). This is intentional: the user is providing information that was explicitly requested, and triage is a read-only assessment that does not modify repository contents. This exception applies only to the triage stage — all other agent stages require Developer+ authorization via slash commands. + +**Consequence of bug**: Unauthorized users trigger code generation or fix stages, potentially modifying repository contents. + +### 6. 
Event payload base64 encoding + +**File**: `internal/poll/dispatch.go` + +Event payloads MUST be base64-encoded before being passed as child pipeline variables. + +**Consequence of bug**: YAML injection via issue titles or MR descriptions containing YAML metacharacters. + +### 7. Bot comment filtering + +**File**: `internal/poll/events.go` + +The poller MUST skip bot-authored comments to prevent the agent's own replies from re-triggering agent stages. Exception: bot-authored comments containing `` markers must trigger the fix stage. + +**Consequence of bug**: Infinite loop: the agent posts a comment, the poller detects it as a new event, and dispatches the stage again. + +### 8. Poll state variable protection + +**File**: `internal/poll/state.go` + +All three of `FULLSEND_LAST_POLL_AT_FAST`, `FULLSEND_LAST_POLL_AT_FULL`, and `FULLSEND_LABEL_STATE` MUST be protected (created as protected variables during install). Tampering with any of them requires Maintainer access, the same privilege level as modifying the pipeline. Separate watermarks prevent fast polls (slash commands only) from advancing past unprocessed label/note events that the full poll handles. + +**Consequence of bug**: For a watermark, an attacker could set it far in the future (skipping events) or far in the past (reprocessing old events). Reprocessing is handled by deduplication and `resource_group` concurrency control. Skipping is the higher risk, but it requires Maintainer access, which is already within the insider threat model. For the label state, an attacker could clear it so all existing labels re-fire as "new," causing spurious agent stage dispatches. 
+ +## Verification Checklist + +- [ ] `make go-test` — all unit tests pass (existing + new) +- [ ] `make go-vet` — no issues +- [ ] `make lint` — passes +- [ ] Poller unit test covers: event discovery, slash command detection, label state diffing, routing, dedup +- [ ] Poller unit test verifies bot comment filtering (both skip and changes-requested exception) +- [ ] Poller unit test verifies fork MR protection in routing +- [ ] GitLab client unit test asserts `Protected: true` on secret variable creation +- [ ] All stage YAML files contain `CI_DEBUG_TRACE` guard +- [ ] Fix stage YAML contains fork MR protection +- [ ] `workflow:rules` require `$CI_COMMIT_REF_PROTECTED == "true"` for scheduled pipelines +- [ ] Poll watermark variable created as protected during install +- [ ] Event payloads base64-encoded before passing to child pipelines +- [ ] Child pipeline YAML generation produces valid GitLab CI syntax +- [ ] `fullsend admin install --dry-run testgroup/testproject --forge gitlab` shows correct plan +- [ ] `fullsend admin install testgroup --forge gitlab` returns per-repo enforcement error +- [ ] E2E: Install on GitLab.com test project → pipeline schedules created → issue events detected → agent pipelines fire → uninstall cleans up diff --git a/docs/problems/gitlab-implementation.md b/docs/problems/gitlab-implementation.md index a92083df3..ff4bbf679 100644 --- a/docs/problems/gitlab-implementation.md +++ b/docs/problems/gitlab-implementation.md @@ -1,5 +1,13 @@ # GitLab Support Implementation Details +> **Note:** The webhook-based dispatch approach described in this document is +> superseded by [ADR 0063](../ADRs/0063-gitlab-cron-polling-event-dispatch.md) +> (cron-polling event dispatch), which eliminates webhooks entirely. For the +> current implementation approach, see +> [docs/plans/gitlab-cron-polling-implementation.md](../plans/gitlab-cron-polling-implementation.md). 
+> The sections below on CI/CD pipeline mapping, PAT-based auth, and forge +> interface evolution remain valid reference material. + This document contains implementation details for GitLab support in fullsend. For the architectural decision and rationale, see [ADR-0028](../ADRs/0028-gitlab-support.md) (status: Deprecated — CI/CD pipeline mapping, PAT-based auth, and webhook bridging sections remain valid reference material; harness-level forge abstraction is now covered by [ADR-0045](../ADRs/0045-forge-portable-harness-schema.md)). ## Table of Contents @@ -36,7 +44,7 @@ This document contains implementation details for GitLab support in fullsend. Fo 2. **GitLab serverless functions**: Use GitLab's serverless integration to deploy a function that receives webhooks and translates to trigger API calls. Maintains compute-platform agnosticism (runs within GitLab infrastructure) but requires GitLab Premium/Ultimate tier. 3. **Minimal bridge service**: Deploy a lightweight translation service (e.g., Cloud Run, Lambda) that receives webhooks and POSTs to the trigger API. This reintroduces the "hosted webhook receiver" concern from ADR-0009 but may be acceptable given GitLab's lack of a direct webhook-to-pipeline primitive. -**Open question**: The webhook-to-trigger translation requirement creates an architectural tension. Options 2 and 3 both introduce additional infrastructure (serverless functions or hosted bridge), while option 1 reintroduces the security concern that webhooks were meant to solve. For GitLab Free tier deployments, option 3 (minimal bridge) is likely the only viable path. For Premium/Ultimate, option 2 (serverless) keeps compute within GitLab infrastructure. See ADR-0028 "Open Questions" for full analysis. +**Open question**: The webhook-to-trigger translation requirement creates an architectural tension. 
Options 2 and 3 both introduce additional infrastructure (serverless functions or hosted bridge), while option 1 reintroduces the security concern that webhooks were meant to solve. For GitLab Free tier deployments, option 3 (minimal bridge) is likely the only viable path. For Premium/Ultimate, option 2 (serverless) keeps compute within GitLab infrastructure. See ADR-0028 "Open Questions" for full analysis. **Decided:** [ADR 0063](../ADRs/0063-gitlab-cron-polling-event-dispatch.md) eliminates webhooks entirely — cron-based polling via scheduled GitLab CI/CD pipelines replaces the webhook bridge, removing the need for any translation intermediary. **Security requirements for webhook translation intermediary**: diff --git a/docs/roadmap.md b/docs/roadmap.md index 9f85c06af..beb952cfb 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -79,13 +79,13 @@ Examples of work that could move this forward: ### Forge portability -GitHub is the starting point, not the boundary. GitLab support requires solving webhook-to-pipeline translation, MR-event security models, and forge interface abstraction. This work continues incrementally alongside higher-priority items. +GitHub is the starting point, not the boundary. GitLab support uses a two-path model: native CI triggers (`merge_request_event`) for MR events and cron-based polling via scheduled pipelines for issues, comments, and labels — no external infrastructure required. Forge interface abstraction and MR-event security models continue incrementally alongside higher-priority items. See [ADR 0063](ADRs/0063-gitlab-cron-polling-event-dispatch.md). 
Related: [gitlab-implementation](problems/gitlab-implementation.md) Examples of work that could move this forward: -- GitLab webhook bridge ([#1964](https://github.com/fullsend-ai/fullsend/issues/1964), [#1816](https://github.com/fullsend-ai/fullsend/pull/1816)) +- ~~GitLab webhook bridge~~ — superseded by cron-polling ([ADR 0063](ADRs/0063-gitlab-cron-polling-event-dispatch.md), [#1964](https://github.com/fullsend-ai/fullsend/issues/1964)) - Forge-portable harness schema ([#1605](https://github.com/fullsend-ai/fullsend/issues/1605), [#1848](https://github.com/fullsend-ai/fullsend/pull/1848)) ### Feature refinement