Skip to content

feat(repos): add manifest parser and validation for repos.yaml#3002

Open
ggallen wants to merge 1 commit into
fullsend-ai:mainfrom
ggallen:feat/repos-manifest-parser
Open

feat(repos): add manifest parser and validation for repos.yaml#3002
ggallen wants to merge 1 commit into
fullsend-ai:mainfrom
ggallen:feat/repos-manifest-parser

Conversation

@ggallen

@ggallen ggallen commented Jul 4, 2026

Copy link
Copy Markdown
Member

Summary

  • Adds internal/repos/manifest.go with full manifest schema types (Manifest, MintConfig, DefaultsConfig, RepoEntry, NullableString, ResolvedConfig)
  • LoadManifest supports local file and HTTPS URL sources
  • Validate enforces schema rules (version, mint URL, repo format, duplicates, glob patterns)
  • ExpandGlobs resolves wildcard repo patterns via forge API
  • ResolveConfig implements three-level field resolution (per-repo override > defaults > built-in)
  • NullableString distinguishes omitted/null/set YAML states for correct fallback chain behavior
  • Foundation for all fullsend repos subcommands (ADR 0057, PR 2 of 8)

Test plan

  • Unit tests covering all parser paths, validation rules, and resolution logic
  • Glob expansion tested with fake forge client
  • URL loading tested with httptest server
  • Coverage >85% on new code (92.3% achieved)

🤖 Generated with Claude Code

@ggallen ggallen requested a review from a team as a code owner July 4, 2026 00:00
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 12:01 AM UTC · Completed 12:14 AM UTC
Commit: e62477c · View workflow run →

@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Add repos.yaml manifest parsing, validation, and glob expansion

✨ Enhancement 🧪 Tests 🕐 40+ Minutes

Grey Divider

AI Description

• Introduces repos.yaml schema types plus load/validate/marshal support.
• Adds glob expansion via forge API and deterministic resolved repo ordering.
• Implements per-repo/default/built-in config resolution with explicit-null overrides.
Diagram

graph TD
  S["Manifest source (file/URL)"] --> L(["LoadManifest"]) --> Y["YAML unmarshal"] --> M["Manifest"] --> V(["Validate"])
  M --> G(["ExpandGlobs"]) --> F{{"Forge API client"}}
  M --> R(["ResolveConfig"]) --> C["ResolvedConfig"]

  subgraph Legend
    direction LR
    _doc["Input/Output"] ~~~ _proc(["Function"]) ~~~ _ext{{"External system"}}
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Make LoadManifest context-aware and HTTPS-only
  • ➕ Allows cancellation/timeouts and avoids leaking long-lived HTTP requests
  • ➕ Enforces the documented security posture (HTTPS) at the boundary
  • ➖ Requires signature changes (ctx, optional injected http.Client) and updating call sites
  • ➖ Slightly more plumbing for a foundational PR
2. Use sort.Slice instead of manual O(n²) sorting
  • ➕ Simpler and idiomatic; less error-prone for larger repo lists
  • ➖ Low impact given typical manifest sizes; current behavior is deterministic
3. Model nullable overrides with *string + custom decoding wrapper
  • ➕ Pointer semantics can feel more idiomatic than custom tri-state types
  • ➖ Harder to preserve omitted vs explicit null behavior cleanly with yaml.v3
  • ➖ Tri-state NullableString is explicit and already well-tested here

Recommendation: The overall approach (explicit schema types + explicit tri-state override semantics + dedicated Validate/ExpandGlobs/ResolveConfig steps) is appropriate for a manifest that will drive multiple subcommands. Consider a follow-up to make LoadManifest accept a context and restrict to HTTPS at the I/O boundary; this would align the implementation with security and operability expectations without complicating this foundational PR.

Files changed (2) +1299 / -0

Enhancement (1) +416 / -0
manifest.goAdd repos.yaml schema, loader, validation, glob expansion, and resolution +416/-0

Add repos.yaml schema, loader, validation, glob expansion, and resolution

• Introduces the manifest schema types (Manifest, MintConfig, DefaultsConfig, RepoEntry) and implements LoadManifest (file/URL), Validate, ExpandGlobs (forge-backed wildcard resolution), ResolveConfig (override/default merge), and Marshal. Adds NullableString plus custom RepoEntry unmarshalling to correctly distinguish omitted vs explicit null vs explicit string values.

internal/repos/manifest.go

Tests (1) +883 / -0
manifest_test.goComprehensive unit tests for manifest parsing, validation, and resolution +883/-0

Comprehensive unit tests for manifest parsing, validation, and resolution

• Adds unit tests covering YAML parsing (string and object repo entries), NullableString edge cases, manifest validation failures, HTTP/file loading behavior, glob expansion behavior with a fake forge client, and config resolution semantics including explicit null overrides and marshal round-trips.

internal/repos/manifest_test.go

@github-actions

github-actions Bot commented Jul 4, 2026

Copy link
Copy Markdown

Site preview

Preview: https://d7a1714f-site.fullsend-ai.workers.dev

Commit: 49fe9cd0b53632eeb5ca4abd4292b850c5be140a

@qodo-code-review

qodo-code-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📜 Skill insights (0)

Context used
✅ Compliance rules (platform): 54 rules

Grey Divider


Action required

1. Glob config not resolved ✓ Resolved 🐞 Bug ≡ Correctness
Description
ResolveConfig only selects a RepoEntry whose Repo exactly equals owner/repo, so repositories
produced by ExpandGlobs never apply the glob entry’s overrides and instead fall back to manifest
defaults. This yields incorrect resolved settings (e.g., inference_project/fullsend_ref) for any
glob-matched repo.
Code

internal/repos/manifest.go[R372-396]

+func (m *Manifest) ResolveConfig(owner, repo string) ResolvedConfig {
+	fullName := owner + "/" + repo
+
+	// Find the matching entry.
+	var entry RepoEntry
+	for _, e := range m.Repos {
+		if e.Repo == fullName {
+			entry = e
+			break
+		}
+	}
+
+	return ResolvedConfig{
+		Owner:                  owner,
+		Repo:                   repo,
+		MintURL:                m.Mint.URL,
+		MintProject:            m.Mint.Project,
+		MintRegion:             m.Mint.Region,
+		InferenceProject:       resolveField(entry.InferenceProject, m.Defaults.InferenceProject, ""),
+		InferenceRegion:        resolveField(entry.InferenceRegion, m.Defaults.InferenceRegion, ""),
+		FullsendRef:            resolveField(entry.FullsendRef, m.Defaults.FullsendRef, ""),
+		BaseHarness:            resolveField(entry.BaseHarness, m.Defaults.BaseHarness, ""),
+		AllowedRemoteResources: m.Defaults.AllowedRemoteResources,
+	}
+}
Relevance

⭐⭐⭐ High

Team usually fixes manifest/config resolution correctness bugs (e.g., three-state semantics/guard
handling) when found.

PR-#967

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
ExpandGlobs creates concrete per-repo entries for glob matches, but ResolveConfig ignores those
results and only scans the original manifest entries, which contain the pattern (e.g.,
org/service-*), not the expanded repo name.

internal/repos/manifest.go[262-342]
internal/repos/manifest.go[364-396]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`ResolveConfig(owner, repo)` searches only `m.Repos` for an exact `owner/repo` entry. When a repo is included via a glob (expanded by `ExpandGlobs`), the matching overrides live in the returned `ResolvedRepo.Entry`, not in `m.Repos`, so `ResolveConfig` cannot see them.

### Issue Context
`ExpandGlobs` constructs a concrete `RepoEntry` for each match by copying the glob entry and setting `entry.Repo = fullName`, but this is returned only in `[]ResolvedRepo`.

### Fix Focus Areas
- internal/repos/manifest.go[262-342]
- internal/repos/manifest.go[364-396]

### Suggested fix direction
- Option A: Add `ResolveConfigForEntry(owner, repo string, entry RepoEntry) ResolvedConfig` (or `ResolveConfigFromResolved(rr ResolvedRepo) ResolvedConfig`) and ensure callers pass the `ResolvedRepo.Entry` from `ExpandGlobs`.
- Option B: Change `ExpandGlobs` to return an updated manifest (or a map `fullName -> RepoEntry`) that `ResolveConfig` consults.
- Add a unit test covering `glob override -> resolved config` to prevent regressions.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Unbounded manifest URL fetch ✓ Resolved 🐞 Bug ⛨ Security
Description
LoadManifest fetches remote manifests using http.Get with no context/timeout and reads the body via
io.ReadAll with no size limit, so a slow or large response can hang or exhaust memory. It also
accepts http:// URLs despite the comment stating HTTPS.
Code

internal/repos/manifest.go[R173-192]

+// LoadManifest reads and parses a repos.yaml manifest from a local
+// file path or an HTTPS URL.
+func LoadManifest(pathOrURL string) (*Manifest, error) {
+	var data []byte
+	var err error
+
+	if strings.HasPrefix(pathOrURL, "https://") || strings.HasPrefix(pathOrURL, "http://") {
+		resp, httpErr := http.Get(pathOrURL) //nolint:gosec,noctx // URL is validated by caller
+		if httpErr != nil {
+			return nil, fmt.Errorf("fetching manifest from %s: %w", pathOrURL, httpErr)
+		}
+		defer resp.Body.Close()
+		if resp.StatusCode != http.StatusOK {
+			return nil, fmt.Errorf("fetching manifest from %s: HTTP %d", pathOrURL, resp.StatusCode)
+		}
+		data, err = io.ReadAll(resp.Body)
+		if err != nil {
+			return nil, fmt.Errorf("reading manifest body from %s: %w", pathOrURL, err)
+		}
+	} else {
Relevance

⭐⭐⭐ High

Team previously accepted adding HTTP timeouts and bounded reads; internal/fetch enforces these
patterns.

PR-#1690
PR-#1612
PR-#1096

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
LoadManifest currently does a raw http.Get + unbounded io.ReadAll. Elsewhere in the repo,
remote fetches go through fetch.FetchURL, which uses request context, enforces timeouts, blocks
redirects, and applies a size limit via io.LimitReader.

internal/repos/manifest.go[173-192]
internal/fetch/fetch.go[174-207]
internal/cli/agent.go[381-386]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`LoadManifest` uses `http.Get` (default client) and `io.ReadAll` for URL sources. This makes remote manifest loading non-cancellable, potentially unbounded in time, and unbounded in memory.

### Issue Context
The repo already contains `internal/fetch.FetchURL`, which implements timeouts, redirect blocking, and response size limits.

### Fix Focus Areas
- internal/repos/manifest.go[173-204]
- internal/fetch/fetch.go[174-207]

### Suggested fix direction
- Prefer adding `LoadManifest(ctx context.Context, pathOrURL string, policy fetch.FetchPolicy)` (or at least `LoadManifestWithClient(ctx, pathOrURL string, httpClient *http.Client, maxBytes int64)`).
- Enforce `https://` only (unless you explicitly allow `http://localhost` for tests), and use `fetch.FetchURL` (or replicate its core guarantees: timeout + size cap + context).
- Replace `io.ReadAll(resp.Body)` with `io.ReadAll(io.LimitReader(resp.Body, maxBytes+1))` and error on overflow.
- Add a test asserting oversized/slow responses are rejected/cancelled.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. Owner wildcard passes validation ✓ Resolved 🐞 Bug ≡ Correctness
Description
Validate only checks glob metacharacters in the repo segment (parts[1]) and allows them in the
owner/org segment, but ExpandGlobs passes parts[0] directly to forge.Client.ListOrgRepos. Manifests
like '*/service-*' can pass validation and then fail at runtime due to invalid org API calls.
Code

internal/repos/manifest.go[R240-307]

+		parts := strings.SplitN(entry.Repo, "/", 2)
+		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
+			return fmt.Errorf("repos[%d]: invalid repo format %q (expected owner/repo)", i, entry.Repo)
+		}
+
+		// Validate glob patterns.
+		if strings.ContainsAny(parts[1], "*?[") {
+			if _, err := filepath.Match(parts[1], "test"); err != nil {
+				return fmt.Errorf("repos[%d]: invalid glob pattern %q: %w", i, entry.Repo, err)
+			}
+		}
+
+		// Check for duplicates.
+		if seen[entry.Repo] {
+			return fmt.Errorf("repos[%d]: duplicate repo %q", i, entry.Repo)
+		}
+		seen[entry.Repo] = true
+	}
+
+	return nil
+}
+
+// ExpandGlobs resolves wildcard repo entries by listing org repos
+// via the forge API. Explicit entries always win over glob-matched
+// entries. The returned list is deduplicated and sorted.
+func (m *Manifest) ExpandGlobs(ctx context.Context, client forge.Client) ([]ResolvedRepo, error) {
+	// First pass: separate explicit entries from glob patterns.
+	explicit := make(map[string]RepoEntry)
+	type globEntry struct {
+		org     string
+		pattern string
+		entry   RepoEntry
+	}
+	var globs []globEntry
+
+	for _, entry := range m.Repos {
+		parts := strings.SplitN(entry.Repo, "/", 2)
+		org := parts[0]
+		name := parts[1]
+
+		if strings.ContainsAny(name, "*?[") {
+			globs = append(globs, globEntry{org: org, pattern: name, entry: entry})
+		} else {
+			explicit[entry.Repo] = entry
+		}
+	}
+
+	// Second pass: expand globs.
+	resolved := make(map[string]ResolvedRepo)
+
+	// Add explicit entries first (they take priority).
+	for fullName, entry := range explicit {
+		parts := strings.SplitN(fullName, "/", 2)
+		resolved[fullName] = ResolvedRepo{
+			Owner: parts[0],
+			Repo:  parts[1],
+			Entry: entry,
+		}
+	}
+
+	// Expand each glob pattern.
+	orgRepoCache := make(map[string][]forge.Repository)
+	for _, g := range globs {
+		repos, ok := orgRepoCache[g.org]
+		if !ok {
+			var err error
+			repos, err = client.ListOrgRepos(ctx, g.org)
+			if err != nil {
Relevance

⭐⭐⭐ High

Repo/owner format validation is commonly tightened for defense-in-depth; similar repo-format
validation suggestions were accepted.

PR-#390
PR-#2277

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Validate inspects glob metacharacters only in parts[1], while ExpandGlobs uses parts[0] as
org for the forge API call. The codebase already contains org-name validation logic that could be
reused to prevent invalid expansion calls.

internal/repos/manifest.go[233-257]
internal/repos/manifest.go[275-307]
internal/cli/admin.go[86-115]
internal/forge/forge.go[206-226]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The manifest grammar appears to intend `owner/<repo-or-glob>`, but `Validate` doesn’t reject glob characters in the owner segment. `ExpandGlobs` then treats that segment as a concrete org name and calls `ListOrgRepos`.

### Issue Context
The CLI already has a GitHub org validation helper (`validateOrgName`) and owner/repo regex patterns.

### Fix Focus Areas
- internal/repos/manifest.go[233-257]
- internal/repos/manifest.go[275-307]
- internal/cli/admin.go[86-115]

### Suggested fix direction
- In `Validate`, reject `*?[` in `parts[0]` (owner) and/or validate it using a shared helper/pattern (e.g., reuse `validateOrgName` or `githubOwnerPattern`).
- Optionally validate the non-glob repo name segment against the repo naming rules used elsewhere.
- Add a unit test ensuring `repos: ["*/service-*"]` fails validation with a clear error.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. NullableString reuse retains Null ✓ Resolved 🐞 Bug ☼ Reliability
Description
decodeNullable and NullableString.UnmarshalYAML set Null=true for !!null but never clear Null when
decoding a non-null scalar, so unmarshalling into a non-zero struct can retain stale Null=true and
force resolveField to return "". This makes the fallback chain stateful across decodes.
Code

internal/repos/manifest.go[R98-129]

+func decodeNullable(node *yaml.Node, ns *NullableString) error {
+	if node.Tag == "!!null" {
+		ns.Set = true
+		ns.Null = true
+		return nil
+	}
+	ns.Set = true
+	return node.Decode(&ns.Value)
+}
+
+// NullableString distinguishes three YAML states: omitted (zero value),
+// explicit null (Set=true, Null=true), and an explicit string value
+// (Set=true, Null=false, Value holds the string). This three-state
+// design lets per-repo overrides explicitly clear a default with
+// "field: null" rather than inheriting it.
+type NullableString struct {
+	Value string
+	Set   bool
+	Null  bool
+}
+
+// UnmarshalYAML decodes a YAML scalar into a NullableString, treating
+// the !!null tag as an explicit null.
+func (n *NullableString) UnmarshalYAML(node *yaml.Node) error {
+	if node.Tag == "!!null" {
+		n.Set = true
+		n.Null = true
+		return nil
+	}
+	n.Set = true
+	return node.Decode(&n.Value)
+}
Relevance

⭐⭐ Medium

No close precedent; team adds YAML edge-case tests, but no history on resetting custom Unmarshal
state.

PR-#2129

ⓘ Recommendations generated based on similar findings in past PRs

Evidence
Both decoding helpers only assign Null in the !!null branch; on non-null nodes they set Set=true
and decode Value but never clear a previously true Null. Because resolveField treats Null as
authoritative, stale state changes resolution outcomes.

internal/repos/manifest.go[98-106]
internal/repos/manifest.go[121-129]
internal/repos/manifest.go[400-406]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`NullableString` decoding is stateful: once `Null` becomes true, decoding a subsequent non-null value does not clear `Null` back to false. The same applies to `decodeNullable`.

### Issue Context
This typically won’t show up on the first parse (zero-values), but it can break callers that reuse the same `Manifest`/`RepoEntry`/`NullableString` structs across multiple unmarshals.

### Fix Focus Areas
- internal/repos/manifest.go[98-106]
- internal/repos/manifest.go[121-129]
- internal/repos/manifest.go[400-406]

### Suggested fix direction
- In both `decodeNullable` and `NullableString.UnmarshalYAML`, ensure the non-null path explicitly sets `Null = false` (and consider clearing `Value` before decode).
- Consider resetting `RepoEntry` at the start of `RepoEntry.UnmarshalYAML` if you want full robustness for reuse.
- Add a small unit test that unmarshals `field: null` then unmarshals `field: value` into the same struct and asserts `Null` becomes false.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread internal/repos/manifest.go Outdated
Comment thread internal/repos/manifest.go
Comment thread internal/repos/manifest.go
Comment thread internal/repos/manifest.go
@codecov

codecov Bot commented Jul 4, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 83.80567% with 40 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/repos/manifest.go 84.29% 25 Missing and 13 partials ⚠️
internal/forge/fake.go 60.00% 1 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

Review

Findings

Low

  • [error-handling] internal/repos/manifest.go:252 — TOCTOU race in LoadManifest for local files: os.Stat checks file size, then os.ReadFile reads. Between the two calls, the file could be replaced with a larger one, bypassing the maxManifestBytes guard. Practical risk is negligible since this is a CLI tool reading local config.

  • [edge-case] internal/forge/fake.go:240 — When OrgRepos is set but does not contain a key for the requested org, source will be nil, causing ListOrgRepos to return an empty list with no error. This differs subtly from pre-existing behavior where f.Repos is always iterated. Tests use this correctly.

  • [edge-case] internal/repos/manifest.go:294safeDialContext resolves all IPs for a host but only connects to ips[0]. If the first IP is unreachable, the connection fails even though other safe IPs exist. Safe from SSRF perspective since all IPs are checked.

  • [SSRF] internal/repos/manifest.go:302fetchManifestURL accepts a skipIPCheck parameter that completely bypasses SSRF IP-address validation when true. The only production call site passes false, and true is only used in tests. The function is unexported, limiting misuse surface.

  • [path-traversal] internal/repos/manifest.go:250LoadManifest passes pathOrURL directly to os.Stat and os.ReadFile without path sanitization. The code documents the caller contract ("Callers must ensure the path is safe"). No production callers exist yet.

Previous run

Review

Findings

Low

  • [missing-authorization] — No linked GitHub issue. This PR adds 1,602 lines of new implementation without a linked issue. The PR description references ADR 0057 and claims to be "PR 2 of 8" from the implementation plan, which provides implicit scope authorization.

  • [edge-case] internal/repos/manifest.go:117RepoEntry.UnmarshalYAML silently ignores unknown YAML keys in the mapping node (the switch has no default case). An unrecognized field like inferencce_project (typo) would be silently dropped with no warning, which could lead to user confusion when an override appears to have no effect.

  • [SSRF] internal/repos/manifest.go:258fetchManifestURL accepts a skipIPCheck parameter that completely bypasses SSRF IP-address validation when true. The only production call site passes false, and true is only used in tests. The function is unexported, limiting misuse surface.

  • [path-traversal] internal/repos/manifest.go:206LoadManifest passes pathOrURL directly to os.Stat and os.ReadFile without path sanitization. The code documents the caller contract ("Callers must ensure the path is safe"). When production callers are added, they should validate paths.

  • [error-handling] internal/repos/manifest.go:208 — TOCTOU race in LoadManifest for local files: os.Stat checks file size, then os.ReadFile reads. Between the two calls, the file could be replaced with a larger one, bypassing the maxManifestBytes guard. Practical risk is negligible since this is a CLI tool reading local config.

Previous run (2)

Review

Findings

Medium

  • [architectural-alignment] internal/repos/manifest.go:415ExpandGlobs delegates to forge.Client.ListOrgRepos, which excludes private, archived, and forked repos (confirmed via the FakeClient implementation filtering at fake.go:242). This filtering contract is not documented in the ExpandGlobs godoc. Callers cannot discover this limitation without reading the forge implementation. The implementation plan (docs/plans/repos-management.md) identifies this as a known limitation requiring a new forge method in a subsequent PR, but the code itself does not surface this.
    Remediation: Add a godoc comment to ExpandGlobs documenting that ListOrgRepos excludes private, archived, and forked repos, and that private repos must be listed as explicit entries until the forge interface is extended.

Low

  • [missing-authorization] No linked issue — This PR adds 1592 lines of new implementation without a linked GitHub issue. The PR description references ADR 0057 and claims to be "PR 2 of 8" from the implementation plan, which provides implicit scope authorization, but a tracking issue would improve traceability.

  • [error-handling] internal/repos/manifest.go:250 — TOCTOU race in LoadManifest for local files: os.Stat checks the file size at line 250, then os.ReadFile reads at line 257. Between the two calls, the file could be replaced with a larger one, bypassing the maxManifestBytes guard. Practical risk is negligible since this is a CLI tool reading local config.

  • [edge-case] internal/repos/manifest.go:566resolveField checks override.Null (line 566) before override.Set (line 569). The invalid state NullableString{Set: false, Null: true} would stop the fallback chain and return empty string rather than falling through to defaults. All deserialization paths maintain the invariant that Null implies Set, so this only affects hand-constructed values.

  • [edge-case] internal/repos/manifest.go:194MarshalYAML checks !n.Set before n.Null, so the invalid state NullableString{Set: false, Null: true} would serialize as omitted (nil), while resolveField would treat it as null (stop chain). Inconsistent handling of the same invalid state across the two functions.

  • [SSRF] internal/repos/manifest.go:300fetchManifestURL accepts a skipIPCheck parameter that completely bypasses SSRF IP-address validation when true. The only production call site (LoadManifest, line 241) passes false, and true is only used in tests. The function is unexported, limiting misuse surface, but the boolean toggle creates a latent risk if future callers within the package misuse it.

  • [path-traversal] internal/repos/manifest.go:248LoadManifest passes pathOrURL directly to os.Stat and os.ReadFile without path sanitization. The code documents the caller contract ("Callers must ensure the path is safe"), and there are no callers yet outside tests.

  • [error-message-format] internal/repos/manifest.go:313"too many redirects" does not include the limit value. The limit is 3 (line 312). Including it (e.g., "exceeded redirect limit (3)") would improve debuggability.

  • [error-wrapping-consistency] internal/repos/manifest.go:246 — Error message "insecure http:// not supported; use https://" uses semicolon before the suggestion. The codebase uses both semicolon and colon patterns, so this is a minor inconsistency.

  • [comment-placement] internal/repos/manifest.go:162 — The NullableString type doc comment is thorough but verbose (10 lines including the fourth-state explanation). Consider tightening the doc to focus on the three intentional states and moving the fourth-state note to resolveField's doc.

  • [exported-function-doc] internal/repos/manifest.go:507 — The ResolveConfig doc comment explains the glob limitation in its last paragraph. Consider leading with a brief usage note distinguishing ResolveConfig (exact match) from ResolveConfigForEntry (glob-expanded repos) for faster discoverability.

Previous run (3)

Review

Findings

Medium

  • [nil/null-handling] internal/repos/manifest.go:247safeDialContext accesses ips[0] without checking that the ips slice returned by LookupIPAddr is non-empty. While Go's resolver typically returns an error for unresolvable hosts, a zero-length slice with nil error is not contractually ruled out. This would cause a panic (index out of range).
    Remediation: Add a guard before accessing ips[0]: if len(ips) == 0 { return nil, fmt.Errorf("no addresses found for %q", host) }.

  • [architectural-alignment] internal/repos/manifest.goExpandGlobs delegates to forge.Client.ListOrgRepos, which excludes private, archived, and forked repos. The implementation plan (docs/plans/repos-management.md) assigns the forge interface extension for private repo inclusion to PR 3. This is by design for PR 2's scope but worth noting as a known limitation.

Low

  • [api-contract] internal/repos/manifest.go:478ResolveConfig searches m.Repos for an exact match on owner/repo, but after ExpandGlobs the manifest's Repos slice still contains the original glob patterns. The found bool return value and doc comment mitigate this, but callers who miss the documentation could silently get only defaults.

  • [error-handling] internal/repos/manifest.go:208LoadManifest enforces maxManifestBytes for local files via os.Stat followed by os.ReadFile, but there is a TOCTOU (time-of-check-time-of-use) race between the two calls. The practical risk is negligible since manifest files are not typically modified concurrently.

  • [edge-case] internal/repos/manifest.go:151MarshalYAML checks n.Null before n.Set, so the invalid state NullableString{Set: false, Null: true} would serialize as YAML null rather than being omitted. All current code paths maintain the invariant that Null implies Set, but the marshal method does not enforce it.

  • [path-traversal] internal/repos/manifest.go:206 — The file path branch of LoadManifest passes pathOrURL directly to os.Stat and os.ReadFile without sanitization. The code documents the contract ("Callers must ensure the path is safe"). Since this is a new internal library with no callers yet, the risk is deferred to future caller implementations.

  • [scope-creep] internal/forge/fake.go — The PR modifies FakeClient to add an OrgRepos field and per-org lookup logic in ListOrgRepos. This is a minimal test infrastructure change (one field, ~5 lines of logic) required for TestExpandGlobs_MultiOrg and does not modify the forge.Client interface.

  • [naming-consistency] PR title uses feat prefix, but this PR adds internal infrastructure with no user-facing functionality until later PRs wire it into the CLI. The build or chore prefix may be more accurate, though feat is defensible if the manifest parser is considered the feature itself.

Previous run (4)

Review

Findings

Medium

  • [SSRF] internal/repos/manifest.go:220fetchManifestURL does not restrict the destination host or IP address. The CheckRedirect callback (line 230) limits redirect count to 3 but does not validate the redirect target scheme or host, so an initial https:// URL can redirect to http://169.254.169.254/latest/meta-data/. The Transport uses a plain net.Dialer without IP filtering, and inherits environment proxy settings (HTTP_PROXY). Exploitability depends on attacker control of the manifest URL configuration.
    Remediation: Add a custom DialContext that resolves the target hostname and rejects connections to loopback (127.0.0.0/8, ::1), link-local (169.254.0.0/16, fe80::/10), and private (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) IP ranges. In CheckRedirect, verify each redirect target uses the https scheme. Set Transport.Proxy to nil to disable environment proxy variables.

Low

  • [api-contract] internal/repos/manifest.go:436ResolveConfig searches m.Repos for an exact match on owner/repo, but after ExpandGlobs the manifest's Repos slice still contains the original glob patterns. Callers who expand globs and then call ResolveConfig instead of ResolveConfigForEntry will silently get only defaults with no indication that the lookup failed. The doc comment documents this pitfall, but the API does not signal a miss.
    Remediation: Consider returning (ResolvedConfig, bool) so callers can detect a miss.

  • [error-handling] internal/repos/manifest.go:205LoadManifest does not enforce maxManifestBytes when reading local files via os.ReadFile, unlike the HTTPS code path which enforces a 1 MB limit via LimitReader. A very large local file would be loaded entirely into memory.

  • [path-traversal] internal/repos/manifest.go:205 — The file path branch of LoadManifest passes pathOrURL directly to os.ReadFile without sanitization. If a future caller passes user-controlled input, any readable path will be read. Currently no callers outside this package and the test file.

Previous run (5)

Review

Findings

Medium

  • [logic-error] internal/repos/manifest.go:410resolveField treats an explicitly set empty string (Set=true, Value="") identically to an omitted field, falling through to the fallback. This contradicts the three-state NullableString design: YAML inference_project: "" parses as {Set: true, Null: false, Value: ""} but resolveField ignores it because of the override.Value != "" check. The test at line 1249 documents this as intentional behavior, but it creates a confusing fourth semantic state that is not clearly communicated to callers.
    Remediation: Change the condition from if override.Set && override.Value != "" to if override.Set to respect explicitly-set empty strings, or document this fourth semantic state clearly in the NullableString and resolveField doc comments.

  • [api-contract] internal/repos/manifest.go:378ResolveConfig searches m.Repos for an exact match on owner/repo, but after ExpandGlobs the manifest's Repos slice still contains glob patterns (e.g., acme/service-*). Calling ResolveConfig("acme", "service-api") after glob expansion fails to find a match and silently returns only defaults, discarding per-glob overrides. Since this is PR 2 of 8, downstream callers built in later PRs could hit this API gap.
    Remediation: Either have ResolveConfig accept a ResolvedRepo/RepoEntry directly, or document that ResolveConfig must not be called for glob-matched repos, or have ExpandGlobs mutate m.Repos to contain expanded entries.

  • [input-validation] internal/repos/manifest.go:185LoadManifest accepts http:// URLs in addition to https://, but the function's doc comment says "HTTPS URL" and Validate enforces HTTPS for mint.url. This inconsistency allows insecure manifest fetching.
    Remediation: Remove http:// from the prefix check or add explicit rejection: if strings.HasPrefix(pathOrURL, "http://") { return nil, fmt.Errorf("insecure http:// not supported; use https://") }.

  • [SSRF] internal/repos/manifest.go:186LoadManifest() performs http.Get with no timeout, no response size limit, and no host restriction. In a CLI context the attack surface is limited to the local operator, but defense-in-depth hardening is appropriate for foundation code.
    Remediation: Set an explicit timeout on the HTTP client and limit response body size with io.LimitReader.

  • [resource-exhaustion] internal/repos/manifest.go:194io.ReadAll(resp.Body) reads the entire HTTP response into memory without any size limit. A misconfigured or compromised server could cause OOM.
    Remediation: Replace with io.ReadAll(io.LimitReader(resp.Body, maxManifestSize)) where maxManifestSize is a reasonable constant (e.g., 1 MB).

Low

  • [sorting-implementation] internal/repos/manifest.go:357 — Uses nested-loop bubble sort O(n2) instead of sort.Slice from the standard library. The codebase uses sort.Strings and sort.Slice elsewhere.

  • [error-handling-pattern] internal/repos/manifest.go:314 — Error wrapping "listing repos for org %q: %w" could include the glob pattern being expanded for better debugging context.

  • [error-message-format] internal/repos/manifest.go:248 — Error message uses parenthetical format "invalid repo format %q (expected owner/repo)" while the codebase pattern uses "must be in owner/repo format" style (e.g., internal/cli/run.go, internal/cli/admin.go).

  • [test-inadequate] internal/repos/manifest_test.go:1263TestExpandGlobs_MultiOrg doesn't meaningfully test multi-org behavior because FakeClient.ListOrgRepos ignores the org parameter. Assertions use assert.True(t, len(resolved) > 0) which passes even if the function returns completely wrong results.

  • [comment-style] internal/repos/manifest.go:186 — nolint directive says "URL is validated by caller" but Validate() is a separate public API not guaranteed to be called before LoadManifest. More accurate: // URL scheme is checked above.

  • [SSRF-redirect] internal/repos/manifest.go:186http.Get uses the default HTTP client which follows redirects automatically, potentially bypassing any pre-request URL validation.

  • [path-traversal] internal/repos/manifest.go:199 — File path passed directly to os.ReadFile with no sanitization. In CLI context this is expected behavior (user controls the path), but the caller contract should be documented.

  • [insecure-deserialization] internal/repos/manifest.go:206yaml.Unmarshal on untrusted input from a remote URL. Go's yaml.v3 is safe against code execution but crafted YAML with deeply nested structures could cause excessive memory/CPU usage. Defense-in-depth: pair with the response size limit.

  • [method-documentation] internal/repos/manifest.go:270ExpandGlobs godoc already mentions "via the forge API" but could note network access requirement more explicitly.

  • [test-organization] internal/repos/manifest_test.go:445 — Package-level test constant validManifest differs from the codebase pattern of inlining YAML in individual tests.


Labels: New Go package implementing repos manifest feature (ADR 0057).

Previous run (6)

Review

Findings

Medium

  • [nil/null-handling] internal/repos/manifest.go:247safeDialContext accesses ips[0] without checking that the ips slice returned by LookupIPAddr is non-empty. While Go's resolver typically returns an error for unresolvable hosts, a zero-length slice with nil error is not contractually ruled out. This would cause a panic (index out of range).
    Remediation: Add a guard before accessing ips[0]: if len(ips) == 0 { return nil, fmt.Errorf("no addresses found for %q", host) }.

  • [architectural-alignment] internal/repos/manifest.goExpandGlobs delegates to forge.Client.ListOrgRepos, which excludes private, archived, and forked repos. The implementation plan (docs/plans/repos-management.md) assigns the forge interface extension for private repo inclusion to PR 3. This is by design for PR 2's scope but worth noting as a known limitation.

Low

  • [api-contract] internal/repos/manifest.go:478ResolveConfig searches m.Repos for an exact match on owner/repo, but after ExpandGlobs the manifest's Repos slice still contains the original glob patterns. The found bool return value and doc comment mitigate this, but callers who miss the documentation could silently get only defaults.

  • [error-handling] internal/repos/manifest.go:208LoadManifest enforces maxManifestBytes for local files via os.Stat followed by os.ReadFile, but there is a TOCTOU (time-of-check-time-of-use) race between the two calls. The practical risk is negligible since manifest files are not typically modified concurrently.

  • [edge-case] internal/repos/manifest.go:151MarshalYAML checks n.Null before n.Set, so the invalid state NullableString{Set: false, Null: true} would serialize as YAML null rather than being omitted. All current code paths maintain the invariant that Null implies Set, but the marshal method does not enforce it.

  • [path-traversal] internal/repos/manifest.go:206 — The file path branch of LoadManifest passes pathOrURL directly to os.Stat and os.ReadFile without sanitization. The code documents the contract ("Callers must ensure the path is safe"). Since this is a new internal library with no callers yet, the risk is deferred to future caller implementations.

  • [scope-creep] internal/forge/fake.go — The PR modifies FakeClient to add an OrgRepos field and per-org lookup logic in ListOrgRepos. This is a minimal test infrastructure change (one field, ~5 lines of logic) required for TestExpandGlobs_MultiOrg and does not modify the forge.Client interface.

  • [naming-consistency] PR title uses feat prefix, but this PR adds internal infrastructure with no user-facing functionality until later PRs wire it into the CLI. The build or chore prefix may be more accurate, though feat is defensible if the manifest parser is considered the feature itself.

Previous run (7)

Review

Findings

Medium

  • [SSRF] internal/repos/manifest.go:220fetchManifestURL does not restrict the destination host or IP address. The CheckRedirect callback (line 230) limits redirect count to 3 but does not validate the redirect target scheme or host, so an initial https:// URL can redirect to http://169.254.169.254/latest/meta-data/. The Transport uses a plain net.Dialer without IP filtering, and inherits environment proxy settings (HTTP_PROXY). Exploitability depends on attacker control of the manifest URL configuration.
    Remediation: Add a custom DialContext that resolves the target hostname and rejects connections to loopback (127.0.0.0/8, ::1), link-local (169.254.0.0/16, fe80::/10), and private (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) IP ranges. In CheckRedirect, verify each redirect target uses the https scheme. Set Transport.Proxy to nil to disable environment proxy variables.

Low

  • [api-contract] internal/repos/manifest.go:436ResolveConfig searches m.Repos for an exact match on owner/repo, but after ExpandGlobs the manifest's Repos slice still contains the original glob patterns. Callers who expand globs and then call ResolveConfig instead of ResolveConfigForEntry will silently get only defaults with no indication that the lookup failed. The doc comment documents this pitfall, but the API does not signal a miss.
    Remediation: Consider returning (ResolvedConfig, bool) so callers can detect a miss.

  • [error-handling] internal/repos/manifest.go:205LoadManifest does not enforce maxManifestBytes when reading local files via os.ReadFile, unlike the HTTPS code path which enforces a 1 MB limit via LimitReader. A very large local file would be loaded entirely into memory.

  • [path-traversal] internal/repos/manifest.go:205 — The file path branch of LoadManifest passes pathOrURL directly to os.ReadFile without sanitization. If a future caller passes user-controlled input, any readable path will be read. Currently no callers outside this package and the test file.

Previous run (8)

Review

Findings

Medium

  • [logic-error] internal/repos/manifest.go:410resolveField treats an explicitly set empty string (Set=true, Value="") identically to an omitted field, falling through to the fallback. This contradicts the three-state NullableString design: YAML inference_project: "" parses as {Set: true, Null: false, Value: ""} but resolveField ignores it because of the override.Value != "" check. The test at line 1249 documents this as intentional behavior, but it creates a confusing fourth semantic state that is not clearly communicated to callers.
    Remediation: Change the condition from if override.Set && override.Value != "" to if override.Set to respect explicitly-set empty strings, or document this fourth semantic state clearly in the NullableString and resolveField doc comments.

  • [api-contract] internal/repos/manifest.go:378ResolveConfig searches m.Repos for an exact match on owner/repo, but after ExpandGlobs the manifest's Repos slice still contains glob patterns (e.g., acme/service-*). Calling ResolveConfig("acme", "service-api") after glob expansion fails to find a match and silently returns only defaults, discarding per-glob overrides. Since this is PR 2 of 8, downstream callers built in later PRs could hit this API gap.
    Remediation: Either have ResolveConfig accept a ResolvedRepo/RepoEntry directly, or document that ResolveConfig must not be called for glob-matched repos, or have ExpandGlobs mutate m.Repos to contain expanded entries.

  • [input-validation] internal/repos/manifest.go:185LoadManifest accepts http:// URLs in addition to https://, but the function's doc comment says "HTTPS URL" and Validate enforces HTTPS for mint.url. This inconsistency allows insecure manifest fetching.
    Remediation: Remove http:// from the prefix check or add explicit rejection: if strings.HasPrefix(pathOrURL, "http://") { return nil, fmt.Errorf("insecure http:// not supported; use https://") }.

  • [SSRF] internal/repos/manifest.go:186LoadManifest() performs http.Get with no timeout, no response size limit, and no host restriction. In a CLI context the attack surface is limited to the local operator, but defense-in-depth hardening is appropriate for foundation code.
    Remediation: Set an explicit timeout on the HTTP client and limit response body size with io.LimitReader.

  • [resource-exhaustion] internal/repos/manifest.go:194io.ReadAll(resp.Body) reads the entire HTTP response into memory without any size limit. A misconfigured or compromised server could cause OOM.
    Remediation: Replace with io.ReadAll(io.LimitReader(resp.Body, maxManifestSize)) where maxManifestSize is a reasonable constant (e.g., 1 MB).

Low

  • [sorting-implementation] internal/repos/manifest.go:357 — Uses nested-loop bubble sort O(n2) instead of sort.Slice from the standard library. The codebase uses sort.Strings and sort.Slice elsewhere.

  • [error-handling-pattern] internal/repos/manifest.go:314 — Error wrapping "listing repos for org %q: %w" could include the glob pattern being expanded for better debugging context.

  • [error-message-format] internal/repos/manifest.go:248 — Error message uses parenthetical format "invalid repo format %q (expected owner/repo)" while the codebase pattern uses "must be in owner/repo format" style (e.g., internal/cli/run.go, internal/cli/admin.go).

  • [test-inadequate] internal/repos/manifest_test.go:1263TestExpandGlobs_MultiOrg doesn't meaningfully test multi-org behavior because FakeClient.ListOrgRepos ignores the org parameter. Assertions use assert.True(t, len(resolved) > 0) which passes even if the function returns completely wrong results.

  • [comment-style] internal/repos/manifest.go:186 — nolint directive says "URL is validated by caller" but Validate() is a separate public API not guaranteed to be called before LoadManifest. More accurate: // URL scheme is checked above.

  • [SSRF-redirect] internal/repos/manifest.go:186http.Get uses the default HTTP client which follows redirects automatically, potentially bypassing any pre-request URL validation.

  • [path-traversal] internal/repos/manifest.go:199 — File path passed directly to os.ReadFile with no sanitization. In CLI context this is expected behavior (user controls the path), but the caller contract should be documented.

  • [insecure-deserialization] internal/repos/manifest.go:206yaml.Unmarshal on untrusted input from a remote URL. Go's yaml.v3 is safe against code execution but crafted YAML with deeply nested structures could cause excessive memory/CPU usage. Defense-in-depth: pair with the response size limit.

  • [method-documentation] internal/repos/manifest.go:270ExpandGlobs godoc already mentions "via the forge API" but could note network access requirement more explicitly.

  • [test-organization] internal/repos/manifest_test.go:445 — Package-level test constant validManifest differs from the codebase pattern of inlining YAML in individual tests.


Labels: New Go package implementing repos manifest feature (ADR 0057).

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment go Pull requests that update go code type/feature New capability request labels Jul 4, 2026
@ggallen ggallen force-pushed the feat/repos-manifest-parser branch from e62477c to 3c23433 Compare July 4, 2026 00:32
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 12:33 AM UTC · Completed 12:46 AM UTC
Commit: 3c23433 · View workflow run →

fullsend-ai-review[bot]

This comment was marked as outdated.

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jul 4, 2026
@ggallen ggallen force-pushed the feat/repos-manifest-parser branch from 3c23433 to 9da84af Compare July 4, 2026 00:54
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 12:55 AM UTC · Completed 1:08 AM UTC
Commit: 9da84af · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jul 4, 2026
@ggallen ggallen force-pushed the feat/repos-manifest-parser branch from 9da84af to 3bfad51 Compare July 4, 2026 01:11
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 1:11 AM UTC · Completed 1:23 AM UTC
Commit: 3bfad51 · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added requires-manual-review Review requires human judgment and removed requires-manual-review Review requires human judgment labels Jul 4, 2026
@ggallen ggallen force-pushed the feat/repos-manifest-parser branch from 3bfad51 to f95df1d Compare July 4, 2026 01:35
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 1:36 AM UTC · Completed 1:48 AM UTC
Commit: f95df1d · View workflow run →

fullsend-ai-review[bot]

This comment was marked as outdated.

@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed requires-manual-review Review requires human judgment labels Jul 4, 2026
Signed-off-by: Greg Allen <gallen@redhat.com>
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Greg Allen <gallen@redhat.com>
@ggallen ggallen force-pushed the feat/repos-manifest-parser branch from f95df1d to 49fe9cd Compare July 4, 2026 01:54
@fullsend-ai-review

fullsend-ai-review Bot commented Jul 4, 2026

Copy link
Copy Markdown

🤖 Finished Review · ✅ Success · Started 1:54 AM UTC · Completed 2:05 AM UTC
Commit: 49fe9cd · View workflow run →

@fullsend-ai-review fullsend-ai-review Bot added ready-for-merge All reviewers approved — ready to merge and removed ready-for-merge All reviewers approved — ready to merge labels Jul 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go Pull requests that update go code ready-for-merge All reviewers approved — ready to merge type/feature New capability request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant