Skip to content

feat(df): add df builtin for disk space usage reporting#205

Draft
julesmcrt wants to merge 18 commits intomainfrom
jules.macret/host-remediation/df
Draft

feat(df): add df builtin for disk space usage reporting#205
julesmcrt wants to merge 18 commits intomainfrom
jules.macret/host-remediation/df

Conversation

@julesmcrt
Copy link
Copy Markdown
Collaborator

Summary

Adds df as a sandboxed builtin so AI-agent scripts can inspect mounted filesystem usage without invoking the host df binary. v1 supports Linux + macOS; Windows returns "not supported" (mirroring uname).

  • Mount enumeration is delegated to a new internal package, builtins/internal/diskstats, that reads /proc/self/mountinfo on Linux and calls getfsstat(2) on macOS. The /proc read is exempt from AllowedPaths for the same reason ss and ip route are — the path is hardcoded, never derived from user input.
  • The dangerous --sync flag (which would invoke sync(2) and mutate kernel buffer state) is unregistered and rejected by pflag as unknown. GTFOBins has no df entry.

Flag set

Implemented: -h, -H, -k, -P, -T, -i, -a, -l, -t TYPE (repeatable), -x TYPE (repeatable), --total, --no-sync, --help.

Deferred to v2: [FILE]... operands, -B/--block-size, --output[=FIELDS].

Rejected (unknown to pflag): --sync, -v, --version.

Safety bounds (per docs/RULES.md)

  • Mount table capped at 100 000 entries (ErrMaxMounts returned when truncated; both Linux and Darwin honour this for parity).
  • /proc/self/mountinfo line cap of 1 MiB (errLineTooLong).
  • Scan total cap of 1 M lines (CPU-time guard against pathological all-malformed inputs).
  • percentUsed uses paired right-shifts to avoid used*100 overflow at extreme magnitudes.
  • saturatingAdd is used for grand-total accumulation so a rogue mount cannot wrap the running totals.
  • formatCount uses floor + remainder bump to avoid wraparound when grand totals saturate to MaxUint64.
  • ctx.Err() is checked at the top of every per-mount loop on both backends.

Test plan

  • go test ./... (full suite passes locally on darwin/arm64).
  • make fmt clean.
  • go vet ./... cross-platform (GOOS=linux, GOOS=windows) clean.
  • Internal parser tests with synthetic mountinfo: malformed lines, octal escapes (\040, \011, \012, \134), MaxMounts truncation, line-too-long, context cancellation.
  • End-to-end Go tests: every supported flag exercised; rejected flags produce exit 1 + stderr.
  • Live-host invariant tests (unix-only): POSIX rows are numeric, capacity column ends with %, --total row equals saturated sum of per-mount columns, -t filter keeps only matching rows, -x filter removes them.
  • GNU coreutils 9.10 byte-for-byte header fixtures (POSIX, default, -h, -i, -T, --total).
  • Pentest: 20+ subtests covering rejected flags, file-operand traversal, end-of-flags separator, many operands, very long type names, 500 repeated -t flags, weird type values (empty/whitespace/comma/UTF-8/non-UTF-8/Unicode NFD), shell-metacharacter values, attempted config overrides (--proc-path, --mountinfo, --root, --prefix).
  • Fuzz: FuzzParseMountInfo, FuzzUnescapeMountField, FuzzDfFlagCombinator with seed corpora drawn from implementation edge cases, CVE-class inputs (NUL bytes, CRLF, invalid UTF-8, ELF/PE/ZIP magic prefixes), and replays of every existing test input. Added to .github/workflows/fuzz.yml so they run in CI.
  • Help integration: help lists df with the right description; help df shows full usage.

🤖 Generated with Claude Code

Adds df as a sandboxed builtin (Linux + macOS; Windows returns "not
supported"). Mount enumeration goes through a new internal package,
builtins/internal/diskstats, which reads /proc/self/mountinfo on Linux
(documented sandbox-bypass mirroring ss/ip route — the path is hardcoded
and never derived from user input) and calls getfsstat(2) on macOS.

Supported flags: -h, -H, -k, -P, -T, -i, -a, -l, -t TYPE (repeatable),
-x TYPE (repeatable), --total, --no-sync, --help. The dangerous --sync
flag (which would invoke sync(2) and mutate kernel buffer state) is
unregistered and rejected by pflag as unknown. -B, --output, and
positional FILE operands are deferred to a future version.

Memory bounds: mount table capped at 100k entries, mountinfo line
length capped at 1 MiB, scan total capped at 1M lines. Integer arithmetic
uses paired right-shifts in percentUsed and saturating addition in totals
to avoid overflow on extreme magnitudes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6b2ec9136f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/internal/diskstats/diskstats_linux.go Outdated
Comment thread builtins/df/df.go Outdated
Comment thread builtins/df/df.go Outdated
julesmcrt and others added 3 commits April 30, 2026 13:15
CI fixes:
- Fuzz: dfRunFuzz no longer routes through testutil.RunScriptCtx, which
  fatals on shell-parse errors. The fuzzer routinely mutates inputs into
  malformed shell syntax (unclosed quotes, etc.); we now treat parse
  failures as expected and skip the iteration.
- Windows: pentest tests that exercised df's actual code path
  (TestDfPentestVeryLongTypeName, TestDfPentestManyTypeFilters,
  TestDfPentestTypeFilterEdgeValues, TestDfPentestNonUTF8FlagValue,
  TestDfPentestUnicodeNFD, TestDfPentestQuotedValues) now skip via
  requireSupported — df returns "not supported" on Windows, which made
  the asserted code==0 invariant unreachable.

Codex feedback:
- P1: Remove "overlay" from the Linux pseudo-FS table. Container hosts
  use overlay as the default root filesystem; classifying it as pseudo
  hid the real root from the default listing.
- P2: An explicit -t TYPE filter now overrides the default pseudo-FS
  suppression. `df -t tmpfs` lists tmpfs mounts without requiring -a,
  matching GNU df.
- P2: humanBytes rounds up via math.Ceil instead of fmt.Sprintf's
  round-to-nearest, matching GNU df's "never under-report" rule.
  Example: 1,576,960 bytes is now "1.6M" (was "1.5M").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Windows test failures: TestDfPentestNonUTF8FlagValue, TestDfPentestUnicodeNFD,
and TestDfPentestQuotedValues were missing requireSupported(t) so they ran on
Windows where df returns "not supported" (code 1), making the asserted code==0
unreachable. Add the skip alongside the seven other Windows-skipped tests.

Fuzz contract: the previous version asserted exit codes in {0, 1, 127} only,
which broke on legitimate runner behaviour like "df 0&" → code 2 (background
job). The fuzz target's real contract is "no panic and no hang inside df";
both are enforced by Go's testing framework and the helper's 5-second timeout
respectively. Drop the exit-code assertion entirely. Also stop fataling on
non-ExitStatus runner errors (glob expansion failures, "internal error" on
adversarial inputs) — those are runner behaviour, not df defects. Verified by
running the fuzzer locally for 20s (670k execs) without failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
13 scenario tests under tests/scenarios/cmd/df/ exercised df's actual
code path and asserted exit code 0 with output rows. On Windows, df
returns "not supported" with exit 1. The scenario framework has no
platform-skip mechanism (per AGENTS.md note about ip route), so happy-
path scenarios cannot be added for platform-restricted commands.

The deleted scenarios were redundant with builtins/df/df_unix_test.go,
which uses //go:build unix to exercise the same flag wiring with
structural assertions on the live mount table — a richer check than
the stdout_contains substring match the YAML scenarios provided.

Retained 6 scenarios that are platform-agnostic (they short-circuit
before diskstats.List): --help, extra operand, unknown flag,
--sync rejection, -B/--block-size rejection, --output rejection.

Deleted: basic/default_succeeds.yaml, flags/all.yaml, exclude_type.yaml,
human_readable.yaml, inodes.yaml, k_is_default.yaml, local.yaml,
no_sync.yaml, posix_format.yaml, print_type.yaml, si.yaml, total.yaml,
type_filter.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5392e09b8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/internal/diskstats/diskstats_linux.go Outdated
Comment thread builtins/df/df.go Outdated
Comment thread builtins/df/df.go
Comment thread builtins/df/df.go Outdated
julesmcrt and others added 2 commits April 30, 2026 14:33
Four P2 items, all verified against gdf 9.10 locally:

1. tmpfs / devtmpfs are no longer in pseudoTypes. They report real
   storage (/dev/shm, /run) and GNU lists nonzero tmpfs mounts in the
   default output. Default df under-reported them previously.

2. df -t TYPE / -x TYPE now exit 1 with "no file systems processed" on
   stderr when filtering leaves zero rows. GNU df returns 1 in this
   case and scripts test the exit status to detect missing filesystem
   types — silent exit-0 with empty body was a regression.

3. df -ih / df -iH now scales inode counts via humanBytes (e.g. 4.0G,
   381K). Previously inode mode bypassed human formatting and always
   printed raw integers, which broke a common usage even though both
   -i and -h are documented as supported.

4. df -P with -h or -H now uses the "Size" column header instead of
   "1024-blocks". GNU's -P -h emits "Size" — keeping the fixed-block
   label under human-suffixed values would mislead parsers about units.

Updated tests:
- TestDfTypeFilter_NoMatches: now asserts exit 1 + stderr message.
- TestFormatCount: covers the -ih and -iH cases.
- TestBuildHeader: pins -P -h and -P -H to "Size".
- TestDfPentest{VeryLongTypeName, ManyTypeFilters, TypeFilterEdgeValues,
  NonUTF8FlagValue, UnicodeNFD, QuotedValues}: relaxed to accept exit 0
  or 1 (no-match values are common in pentest inputs).
- TestDfPentestTypeIncludeAndExcludeSameType: now asserts the new
  exit-1 + "no file systems processed" contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-on test fixes after the round-2 GNU-compat changes (commit
7b62dfc):

1. TestParseMountInfo_HappyPath: asserted devtmpfs.Pseudo == true, but
   devtmpfs was just removed from pseudoTypes. Flip the assertion to
   match the new (correct) classification, and add an analogous check
   for the /run tmpfs entry to lock in tmpfs.Pseudo == false.

2. TestDfPentestAllFlagsAtOnce: asserted exit 0, but on Linux the -t
   apfs filter matches no rows so df now correctly emits "no file
   systems processed" and exits 1 (the new GNU-compat path). Relax to
   accept either 0 or 1; the pentest's contract is "stacking every
   flag does not crash", not "succeeds with this specific argv on
   every host".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1a7d813ef6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/internal/diskstats/diskstats_linux.go
Comment thread builtins/df/df.go
Comment thread builtins/df/df.go Outdated
…ception

Three Codex round-3 items:

1. P1 — pre-stat filter (real bug). statfs(2) on a stale NFS / CIFS
   mount can hang indefinitely and is not interrupted by ctx
   cancellation. Previously diskstats.List statfs'd every mount in
   /proc/self/mountinfo before df.go could apply -l / -x nfs filters,
   so `df -l` could hang on a dead remote even though the user had
   explicitly asked to skip remote mounts.
   Fix: diskstats.List now takes a FilterFunc parameter. df.go's
   makePreStatFilter encodes -t/-x/-a/-l in a closure that runs
   between mountinfo parsing and the statfs syscall on Linux. Darwin
   already uses MNT_NOWAIT so the filter is cosmetic there.

2. P1 — AllowedPaths and statfs (documentation). statfs returns
   metadata only (block/inode counts, fs type, block size); no file
   content is read. The mount-point paths are kernel-controlled,
   never user-derived. This is the same exception class as ss reading
   all sockets and ip route reading the full routing table — gating
   it on AllowedPaths would produce a misleading partial listing.
   Documented under "Security Design Decisions" in AGENTS.md so the
   trade-off is explicit.

3. P2 — dedup bind-mounts. GNU df hides mounts that share a Source
   with an already-emitted mount unless -a is given. On container
   hosts with overlay bind-mounts of /etc/hosts, /etc/hostname,
   /etc/resolv.conf, default df was printing three duplicate rows
   and --total was double-counting them.
   Fix: filterMounts now keeps a seen-Source set; only the first mount
   for each Source is emitted unless -a is set. Empty Source values
   (rare; some pseudo filesystems) are not collapsed onto each other.

Tests:
- New TestPreStatFilter_* coverage replaces TestFilterMounts_* for the
  type/pseudo/local logic that moved into makePreStatFilter.
- New TestFilterMounts_DedupBySourceWithoutAll, _AllPreservesDuplicates,
  _EmptySourceNotDeduped lock in the dedup behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc77c89e8a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/internal/diskstats/diskstats_linux.go
Comment thread AGENTS.md
Comment thread builtins/df/df.go Outdated
… promotion

Three Codex round-4 items, all real:

1. P1 — FUSE remote subtypes (hang protection for sshfs etc.). Linux
   mountinfo reports FUSE mounts as fuse.<subtype>, e.g. fuse.sshfs
   or fuse.smbnetfs. The bare prefix list ("sshfs", "smb", …) did not
   match these strings, so a stale fuse.sshfs mount was still statfs'd
   and could hang despite -l / -x. Added explicit fuse.<remote-backend>
   entries: fuse.sshfs, fuse.smb, fuse.cifs, fuse.davfs, fuse.glusterfs,
   fuse.cephfs, fuse.nfs, fuse.s3, fuse.rclone. TestIsRemoteType extended
   with positive cases for each subtype and negative cases for the
   local FUSE backends (gvfsd-fuse, portal, archivemount).

2. P1 — Document df bypass in README.md. The "Security Model" note
   already covered ss and ip route but not df; operators reading the
   README would miss that AllowedPaths cannot hide mount metadata
   from df. Updated the bullet to list df alongside ss/ip route and
   to spell out that Statfs(2) returns metadata only.

3. P2 — humanBytes promotes after rounding. humanBytes(1048575, 1024)
   was returning "1024K" instead of "1.0M" because the suffix was
   chosen before the ceiling. Refactored: one rounding pass with
   granularity decided by pre-rounded magnitude (one decimal if
   scaled < 10, integer otherwise), then a single promotion step when
   the rounded value reaches base. df -h now matches gdf -h
   byte-for-byte on my host (927G/512G/415G — previously 926G off-by-one).
   Restored the 1<<20-1 → 1.0M test case I had dropped, plus 1<<30-1 →
   1.0G, 1<<40-1 → 1.0T, 10485759 → 10M.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d94c7f3d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go Outdated
Comment on lines +282 to +283
if _, dup := seen[m.Source]; dup {
continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Deduplicate by device and shortest mount point

GNU df documents that, without -a, bind-mount duplicates are elided by device and the shortest mount point is kept. This source-key check keeps the first same-source mount instead; on hosts like this container, kataShared appears as /etc/resolv.conf, /etc/hostname, and /etc/hosts, so rshell reports /etc/resolv.conf while GNU reports /etc/hosts, and unrelated filesystems that share a source string can be dropped. Use device identity and shortest target rather than Source alone.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Added a DevID field to Mount (parsed from /proc/self/mountinfo field 2 on Linux, formatted from Statfs_t.Fsid on Darwin) and rewrote filterMounts as a two-pass dedup: first pass picks the index of the shortest-mountpoint entry per DevID, second pass emits in original order. Two unrelated overlay mounts with distinct DevIDs are no longer collapsed (the kataShared bug). Tests: TestFilterMounts_DedupByDevicePicksShortestMountpoint (kataShared scenario verbatim) and TestFilterMounts_DistinctDeviceSameSourceNotDeduped.

Comment thread builtins/df/df.go
Comment on lines +250 to +251
if local && !m.Local {
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve pseudo filesystems for df -al

When both -a and -l are supplied, -a explicitly asks to include dummy/pseudo filesystems while -l only limits out remote filesystems; GNU df -al includes local pseudo mounts such as proc, sysfs, and cgroup2. Because Local is false for every pseudo mount, this filter drops them anyway whenever -l is set, so df -al silently omits the entries -a was meant to restore. Track remote-ness separately or let pseudo mounts pass when all is set.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Local was conflating "not pseudo" with "not remote". Per GNU df, pseudo and remote are independent classifications — pseudo mounts (proc, sysfs, cgroup2) are local. Changed the Linux and Darwin classifications to Local := !isRemoteType(...) so pseudo mounts pass the -l filter. Verified df -al now lists more entries than default (pseudo mounts re-enabled by -a still survive -l). Added TestPreStatFilter_AllPlusLocalKeepsPseudo to lock this in.

Comment thread builtins/df/df.go Outdated
Comment on lines +298 to +300
// Allow comma-separated lists, matching GNU df.
for p := range strings.SplitSeq(v, ",") {
if p == "" {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Keep comma-separated fstype arguments literal

GNU df specifies that multiple filesystem types are selected by giving multiple -t options, not by comma-splitting one argument. This split makes df -t overlay,tmpfs match tmpfs mounts in rshell, while GNU treats overlay,tmpfs as one literal type and exits with no file systems processed; scripts that rely on that exit status will get the wrong result.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Verified empirically: gdf -t apfs,ext4 exits 1 with "no file systems processed" — GNU treats apfs,ext4 as a single literal type. Removed comma-splitting from stringSet; -t / -x values are now stored as keyed by the literal argv string. Updated TestStringSet to pin the new behaviour and removed strings.SplitSeq from the symbol allowlist (no longer used).

…split -t

Three Codex round-5 items, all real:

1. P2 — pseudo + -l. Mount.Local was defined as "not pseudo and not
   remote", so pseudo mounts (proc, sysfs, cgroup2) had Local=false
   and were silently dropped by `df -al`. GNU df treats pseudo and
   remote as independent classifications: -l drops only remote
   filesystems, and pseudo passes when -a is set. Redefined
   Local := !isRemoteType(...) on both Linux and Darwin so pseudo
   mounts are local (they live in kernel memory, not on a remote
   server). New TestPreStatFilter_AllPlusLocalKeepsPseudo locks this
   in. Verified locally: `df -al` now lists more entries than `df`,
   pseudo mounts re-enabled by -a survive -l.

2. P2 — dedup by device + shortest mount point. The previous
   Source-string dedup was brittle: two unrelated overlay mounts
   sharing a literal source name were collapsed (the kataShared bug
   Codex flagged), and the chosen representative depended on input
   order rather than mount-point length.
   Added a DevID field to Mount (parsed from /proc/self/mountinfo
   field index 2 on Linux, formatted from Statfs_t.Fsid on Darwin)
   and rewrote filterMounts as a two-pass dedup: first pass picks the
   index of the shortest-mountpoint entry per DevID, second pass
   emits in original order. Distinct DevIDs are preserved even when
   Source matches.
   New tests: TestFilterMounts_DedupByDevicePicksShortestMountpoint
   (kataShared scenario verbatim) and
   TestFilterMounts_DistinctDeviceSameSourceNotDeduped.

3. P3 — drop comma-split in -t / -x. Verified empirically:
   `gdf -t apfs,ext4` exits 1 with "no file systems processed" — GNU
   treats the entire string as a single literal type. The previous
   code split on commas and matched either side, which broke scripts
   that used the no-match exit code as a presence test.
   Updated stringSet to store the verbatim argv values; removed
   strings.SplitSeq from the symbol allowlist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

@julesmcrt julesmcrt added the verified/analysis Human-reviewed static analysis changes label Apr 30, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05b3504ff3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +146 to +150
m.Total = uint64(st.Blocks) * bsize
m.Free = uint64(st.Bavail) * bsize
// Used is computed from f_blocks - f_bfree (root-reserved
// blocks are counted as used), which differs from Total - Free.
m.Used = subSat(uint64(st.Blocks), uint64(st.Bfree)) * bsize
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Saturate statfs block multiplications

When a FUSE or otherwise buggy filesystem reports Blocks, Bavail, or Bfree above math.MaxUint64 / bsize, these products wrap before the later saturatingAdd total logic sees them, so a single rogue mount can print tiny/zero sizes and corrupt --total. This also violates docs/RULES.md's requirement that commands check integer overflow in arithmetic; use a checked/saturating multiply for Total, Free, and Used before formatting or accumulating.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Added a mulSat(a, b uint64) helper in diskstats_unix.go that clamps to MaxUint64 when a*b would wrap. The Linux and Darwin backends now use mulSat for Total, Free, and Used instead of plain *. A buggy/malicious FUSE filesystem reporting block counts above MaxUint64/bsize can no longer wrap a single mount to a tiny size or corrupt --total. Locked in by TestMulSat covering boundary, just-over-boundary, and the realistic FUSE-rogue (maxU * 4096) case.

Comment thread builtins/df/df.go Outdated
Comment on lines +215 to +219
if *f.human {
return unitsHuman1024
}
if *f.si {
return unitsHuman1000
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor the last size-unit flag

When users pass both unit flags, e.g. df -hH, pflag records both booleans but this priority always returns IEC/1024 units. GNU df documents -h as powers of 1024 and -H/--si as powers of 1000 (https://www.gnu.org/software/coreutils/df), and the later option determines the requested mode (df -hH / should be SI while df -Hh / should be IEC), so scripts that append -H to override -h get the wrong sizes.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — the previous code unconditionally preferred -h over -H. pflag.FlagSet.Visit walks set flags in lexicographical order, not argv order, so it could not be used to honor input order. Refactored to a custom unitFlag (a pflag.Value) where -h and -H share a single *unitMode target and each Set call overwrites it — the LAST one wins by construction. Confirmed locally: df -hH now prints SI (995G on my host) while df -Hh prints IEC (927G). Locked in by TestUnitFlag_LastFlagWins covering 10 argv interleavings (separate flags, combined short, long-form, with non-unit flags between).

Two Codex round-6 items, both real:

1. P1 — saturating block multiply. unix.Statfs_t.Blocks * bsize could
   wrap a uint64 if a buggy/malicious FUSE filesystem reports counts
   above MaxUint64/bsize, producing tiny/zero Total/Used/Free for that
   mount and corrupting the --total accumulation. Added a mulSat(a, b)
   helper to diskstats_unix.go that clamps to MaxUint64 on overflow.
   Linux and Darwin backends now use mulSat for Total, Free, and Used.
   Locked in by TestMulSat covering boundary, just-over-boundary, and
   the realistic FUSE-rogue (maxU * 4096) case.

2. P2 — last-flag-wins for -h / -H. The previous code unconditionally
   preferred -h, so `df -hH` (intended SI override) and shell aliases
   that append -H to a default got the wrong size column. pflag.Visit
   walks set flags in lexicographical order, not argv order, so it
   cannot be used to honor input order. Refactored to a custom
   unitFlag (a pflag.Value) where -h and -H share a single *unitMode
   target and each Set call overwrites it — the LAST one wins by
   construction. registerUnitFlag wraps fs.VarPF + NoOptDefVal="true"
   so the flags accept no argument, matching pflag's bool convention.
   Confirmed locally: `df -hH` prints SI (995G), `df -Hh` prints IEC
   (927G). New TestUnitFlag_LastFlagWins covers 10 argv interleavings
   including combined short flags (-hH, -Hh).

Side effect of #2: dropped the unused `human`/`si` *bool fields from
the flags struct and the resolveUnitMode helper; mode is now read
directly from a *unitMode field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc63258ac6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go Outdated
Comment on lines +172 to +175
// -k is registered separately because it has no long form. It is a
// no-op in this v1 implementation — 1024-byte blocks are already
// the default — but POSIX scripts pass it explicitly.
fs.BoolP("kibibytes", "k", false, "use 1024-byte blocks (POSIX default)")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Let -k override earlier human-size flags

When -k appears after a human-readable size flag, this no-op registration leaves mode set to unitsHuman1024/unitsHuman1000, so df -h -k, df -H -k, and compact forms like df -hk still print Size with human units. GNU documents -k as “equivalent to --block-size=1K” (and local df -h -k prints a 1K-blocks header), so scripts that rely on option-order compatibility get the wrong columns and values.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified empirically: gdf -h -k prints 1K-blocks and gdf -k -h prints Size. Promoted -k into the same last-flag-wins group as -h / -H by registering it via registerUnitFlag with unitsK value. Now all three share the *unitMode target and the LAST one in argv wins. Locked in by extending TestUnitFlag_LastFlagWins with six new cases (-h -k, -H -k, -k -h, -k -H, -hk, -kh).

Comment thread builtins/df/df.go
Comment on lines +254 to +259
if _, ok := excludeSet[m.FSType]; ok {
return false
}
if len(includeSet) > 0 {
if _, ok := includeSet[m.FSType]; !ok {
return false
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject overlapping -t and -x filters

When the same filesystem type is passed to both -t and -x, this branch makes exclusion silently win; for example df -t tmpfs -t ext4 -x tmpfs succeeds here and lists ext4, while GNU df exits 1 with “file system type ‘tmpfs’ both selected and excluded”. That hides an invalid invocation and diverges from the bash-compatible behavior required for these builtins whenever configs/scripts accidentally specify contradictory type filters.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified gdf -t apfs -x apfs exits 1 with "file system type 'apfs' both selected and excluded". Added an overlappingType helper that scans the include / exclude argv for any conflicting type and returns it (or ""). The handler now checks this before any other work and emits the same GNU-format error, so contradictory configs fail loudly instead of silently picking exclude. TestOverlappingType covers nil/empty/disjoint/single/multiple-overlap inputs. Updated TestDfPentestTypeIncludeAndExcludeSameType to assert the new error. The lower-level TestPreStatFilter_TypeExcludeWinsOverIncludeOnPseudo is kept and re-doc'd as testing the filter layer's exclude-precedence in isolation, since the top-level handler now blocks that input from ever reaching the filter.

Two Codex round-7 items, both verified empirically against gdf 9.10:

1. P2 — -k participates in the unit-mode last-flag-wins group. Before,
   -k was a no-op bool flag that ran alongside -h / -H without ever
   updating the shared mode, so `df -h -k` left mode = unitsHuman1024
   even though gdf -h -k prints "1K-blocks". Promoted -k into the same
   registerUnitFlag group with unitsK value: all three flags now share
   the *unitMode target and the LAST argv entry wins.
   Locked in: TestUnitFlag_LastFlagWins gains six new cases (-h -k,
   -H -k, -k -h, -k -H, -hk, -kh). Smoke verified: df -h -k now
   prints "1K-blocks", df -k -h prints "Size".

2. P2 — reject overlapping -t / -x. gdf -t apfs -x apfs exits 1 with
   "file system type 'apfs' both selected and excluded". The previous
   code silently let exclusion win. Added an overlappingType helper
   and an early-out check in the handler that emits the GNU-format
   error before any mount listing runs. Updated
   TestDfPentestTypeIncludeAndExcludeSameType to assert the new
   error; kept TestPreStatFilter_TypeExcludeWinsOverIncludeOnPseudo
   to lock in the filter's lower-level exclude-precedence behaviour
   in isolation. Added TestOverlappingType for the helper itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 755c2fe837

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go Outdated
if withType {
cols = append(cols, "Type")
}
cols = append(cols, col1, "Used", "Available", capacity, last)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use GNU's Avail header for human-readable output

When users run df -h or df -H, GNU coreutils labels the available-space column Avail (for example, GNU df 9.4 prints Filesystem Size Used Avail Use% Mounted on), but this unconditional Available makes rshell diverge in one of the most common df modes. Any bash/coreutils comparison scenario for human-readable df output will fail; keep Available for fixed-block output and switch to Avail when the selected unit mode is human-readable.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified gdf -h prints Filesystem Size Used Avail Use% Mounted on (no 'e' on Avail). buildHeader now switches AvailableAvail when the unit mode is unitsHuman1024 or unitsHuman1000; fixed-block modes (default, -k, -P) keep the full Available. TestBuildHeader covers both human cases and TestGNUCompatHeaderHuman was strengthened to assert Avail is present and Available is not.

Comment thread builtins/df/df.go Outdated
Comment on lines +582 to +584
if posix {
// POSIX output format for inodes still uses "Capacity".
cols[len(cols)-2] = "Capacity"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep IUse% for inode output with -P

When -P is combined with inode mode, GNU df keeps the inode percentage header as IUse% (df -iP prints Filesystem Inodes IUsed IFree IUse% Mounted on), but this replacement changes it to Capacity. Scripts or bash-comparison tests that inspect the inode header will see a different column name; only the block POSIX format should use Capacity.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified gdf -iP prints Filesystem Inodes IUsed IFree IUse% Mounted on (the inode percentage stays IUse%, not Capacity). Removed the conditional that rewrote IUse%Capacity when -P was set in inode mode. Block POSIX (df -P without -i) still uses Capacity as required. Updated TestBuildHeader to assert that -iP keeps IUse% and does NOT contain Capacity.

Two Codex round-8 items, both verified empirically against gdf 9.10:

1. P2 — GNU compresses "Available" to "Avail" in human modes
   (`gdf -h /` → "Filesystem Size Used Avail Use% Mounted on"). My
   buildHeader unconditionally emitted "Available", so any
   bash-comparison scenario for `df -h` / `df -H` would diverge.
   buildHeader now switches to "Avail" when the unit mode is
   unitsHuman1024 or unitsHuman1000; fixed-block modes (default, -k,
   -P) keep the full "Available" string.

2. P2 — `gdf -iP` keeps "IUse%" as the percentage header, not
   "Capacity". Only the *block* POSIX format substitutes "Capacity"
   for "Use%". My code unconditionally replaced "IUse%" with
   "Capacity" when posix was set in inode mode, which diverged from
   GNU. Removed the conditional; the inode header is now always
   "IUse%" regardless of -P.

Tests: TestBuildHeader expanded to pin both behaviours (Avail in -h /
-H, IUse% preserved with -iP, IUse% not replaced by Capacity). Also
strengthened TestGNUCompatHeaderHuman to explicitly check that "Avail"
is present and "Available" is NOT.

Smoke verified locally:
  df -h  → Filesystem ... Size Used Avail Use% Mounted on
  df -iP → Filesystem ... Inodes IUsed IFree IUse% Mounted on
  df -P  → Filesystem ... 1024-blocks Used Available Capacity Mounted on

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b3d483519

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go
Comment on lines +379 to +381
posix := *f.posix
withType := *f.printType
inodeMode := *f.inodes
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Limit POSIX layout to fixed block output

When -P is combined with -T, -i, -h, or -H, GNU df uses the normal aligned layout for these GNU-extended modes (for example, df -Pi and df -PT have padded columns, and df -hP keeps the Use% header). Because posix stays true here, printRows emits single-space rows and buildHeader can rename the percent column to Capacity, so supported flag combinations diverge from bash/coreutils output.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified empirically against gdf 9.10:

  • gdf -hP / → "Use%" header (not "Capacity")
  • gdf -PT / → aligned column layout (not single-space)
  • gdf -Pi / → aligned, IUse% preserved

Two fixes:

  1. buildHeader: capacity := "Capacity" only when posix && !human. Human modes (-h / -H) keep "Use%" even with -P.
  2. writeOutput: the single-space POSIX row layout is now gated on posix && !withType && !inodeMode && !human — i.e. only when -P is the sole format-affecting flag. -PT, -Pi, -hP, -HP all use the default aligned layout.

Tests strengthened: TestBuildHeader pins -hP/-HP → Use%/Avail (no Capacity), and -PT keeps Capacity. Smoke verified locally; all four scenarios match gdf.

Comment thread builtins/df/df.go Outdated
// -h, and `df -k -h` prints "Size" for the reverse reason.
registerUnitFlag(fs, &mode, unitsHuman1024, "human-readable", "h", "print sizes in powers of 1024 (e.g. 1023M)")
registerUnitFlag(fs, &mode, unitsHuman1000, "si", "H", "print sizes in powers of 1000 (e.g. 1.1G)")
registerUnitFlag(fs, &mode, unitsK, "kibibytes", "k", "use 1024-byte blocks (POSIX default)")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3 Badge Reject the nonstandard --kibibytes flag

This long name makes df --kibibytes succeed even though the documented/GNU-compatible interface only has short -k (GNU documents -k as equivalent to --block-size=1K, not --kibibytes). Since unsupported flags are supposed to fail, scripts can accidentally rely on rshell-only behavior and help advertises a flag that real df rejects.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified gdf --kibibytes returns "unrecognized option '--kibibytes'". GNU only documents the short -k (per the manual: "equivalent to --block-size=1K", not a long form named --kibibytes). Re-registered -k via fs.VarPF with an empty long name so only the short form is accepted; df --kibibytes now exits 1 with "unknown flag: --kibibytes" matching GNU. Added df --kibibytes to the rejected-flag pentest list.

@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b3d483519

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go Outdated
if _, ok := includeSet[m.FSType]; !ok {
return false
}
} else if !all && m.Pseudo {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep pseudo filesystems hidden under -t

When -t is used for a pseudo type such as proc or sysfs, len(includeSet)>0 makes this else if skip the default pseudo-filesystem filter. I checked GNU coreutils df 9.4: df -t proc exits 1 with no file systems processed, while df -a -t proc lists proc mounts. Rshell will instead stat and print /proc successfully, which breaks scripts that use df's exit status and exposes pseudo mounts even though -a was not requested.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified empirically: gdf -t devfs / (devfs is pseudo on macOS) → exit 1 with "no file systems processed". My earlier round-2 fix (making -t override pseudo suppression) was over-corrected; the right rule is that -t and the pseudo filter are independent — only -a exposes pseudo filesystems. -t tmpfs continues to work because tmpfs was removed from the pseudoTypes table in round 5 (it has real RAM-backed storage and GNU classifies it that way), not because -t bypasses suppression. Removed the else if branch in makePreStatFilter so the pseudo check fires regardless of includeSet. Replaced TestPreStatFilter_TypeIncludeOverridesPseudoSuppression with TestPreStatFilter_TypeIncludeRespectsPseudoSuppression covering both -t proc (drops) and -a -t proc (lists). Smoke verified df -t devfs now exits 1.

julesmcrt and others added 2 commits May 1, 2026 09:27
Two Codex review-4207024894 items I missed earlier; verified empirically
against gdf 9.10:

1. P2 — Limit POSIX layout to fixed-block output. GNU df only uses the
   strict POSIX single-space row layout when -P is the *sole*
   format-affecting flag. Combinations like -PT, -Pi, -hP, -HP all
   revert to the default aligned column layout, and human modes (-h /
   -H) keep "Use%" instead of "Capacity" even with -P.
   Two narrow fixes:
     * buildHeader: capacity = "Capacity" only when posix && !human.
     * writeOutput: pass posixLayout = posix && !withType && !inodeMode
       && !human to printRows. Single-space rows now apply only to
       `df -P` (with optional -k/-a/-l/-t/-x).
   Smoke verified: df -hP / df -HP → Use% + aligned; df -PT / df -Pi
   → aligned; df -P alone → single-space + Capacity.

2. P3 — Reject the nonstandard --kibibytes flag. GNU only documents
   the short -k; there is no --kibibytes long form. Previously rshell
   accepted both, which let scripts depend on rshell-only behavior.
   Re-registered -k via fs.VarPF with an empty long name so only the
   short form is recognised. df --kibibytes now exits 1 with
   "unknown flag: --kibibytes", matching gdf.
   Added "df --kibibytes" to the rejected-flag pentest list.

Tests: TestBuildHeader strengthened to assert -hP/-HP keep Use% and
do NOT contain Capacity, and that -PT does keep Capacity. Pentest's
TestDfPentestRejectedFlags now includes --kibibytes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lock file is local Claude session state and shouldn't be tracked.
Removing the accidental check-in from the previous commit and updating
.gitignore to prevent recurrence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5438ea4db

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +42 to +44
runner, err := interp.New(
interp.StdIO(nil, &outBuf, &errBuf),
interp.AllowedPaths(nil),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enable df in the fuzz runner

In the df fuzz target, this runner is created without AllowedCommands; the interpreter default is “no commands are allowed” (interp/api.go documents this), unlike testutil.RunScriptCtx which injects AllowAllCommands. As a result, every fuzz input that starts with df is rejected as df: command not allowed before the new builtin runs, and the new CI fuzz job for builtins/df does not exercise df parsing or output at all.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified the runner default rejects every command. Without AllowAllCommands, every fuzz iteration was getting "command not allowed" before df parsing ever ran. Added interpoption.AllowAllCommands().(interp.RunnerOption) to dfRunFuzz's interp.New options (matches what testutil.RunScriptCtx does). Verified: go test -fuzz=FuzzDfFlagCombinator -fuzztime=5s now executes ~430k iterations in 6s, vs ~0 effective coverage before. Documented the requirement in a comment so future readers don't strip it back out.

Two Codex items, both verified empirically:

1. P2 — `df -t TYPE` no longer overrides pseudo suppression.
   gdf 9.4 confirms: `df -t devfs` (devfs is pseudo on macOS) exits 1
   with "no file systems processed". GNU treats -t and the pseudo
   filter as independent — only -a exposes pseudo filesystems. My
   round-2 fix (which made includeSet bypass the pseudo check) was
   over-corrected; -t tmpfs continues to work because tmpfs isn't in
   our pseudoTypes table (RAM-backed but real storage), not because
   -t skips the pseudo filter.
   Removed the `else if !all && m.Pseudo` branch from
   makePreStatFilter so the pseudo check fires regardless of
   includeSet. Replaced TestPreStatFilter_TypeIncludeOverridesPseudoSuppression
   with TestPreStatFilter_TypeIncludeRespectsPseudoSuppression
   covering both the no-`-a` (drops) and `-a -t proc` (lists) cases.

2. P2 — Fuzz runner missed AllowAllCommands. interp.New defaults to
   "no commands allowed" (per interp/api.go), so every fuzz input
   starting with "df" was rejected as "command not allowed" before
   df ever ran. The new CI fuzz job for builtins/df was effectively
   a no-op.
   Added `interpoption.AllowAllCommands().(interp.RunnerOption)` to
   dfRunFuzz's interp.New options (matches testutil.RunScriptCtx).
   Verified: `go test -fuzz=FuzzDfFlagCombinator -fuzztime=5s` now
   executes ~430k iterations vs ~0 effective coverage before.
   Documented the requirement inline so future readers don't strip
   it back out.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e8940e9e9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go
// a long form would let scripts depend on rshell-only behavior.
registerUnitFlag(fs, &mode, unitsHuman1024, "human-readable", "h", "print sizes in powers of 1024 (e.g. 1023M)")
registerUnitFlag(fs, &mode, unitsHuman1000, "si", "H", "print sizes in powers of 1000 (e.g. 1.1G)")
kFlag := fs.VarPF(&unitFlag{target: &mode, value: unitsK}, "", "k", "use 1024-byte blocks (POSIX default)")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove bogus long name from -k help

When users run df --help or help df, this shorthand-only registration is rendered by pflag as -k, -- use 1024-byte blocks, advertising -- as if it were a long option even though -- is the end-of-options separator and the comment above says GNU df has no long form for -k. This makes the newly added builtin's help output incorrect; register -k without an empty long name or customize the help text for this flag.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified the bug: df --help was rendering -k, -- use 1024-byte blocks (POSIX default) because pflag.PrintDefaults treats the empty long name as a literal --. Fixed by marking the -k flag Hidden = true (so PrintDefaults skips it) and appending a manual -k use 1024-byte blocks (POSIX default) line in printHelp. The flag still parses correctly (-k overrides earlier -h/-H per the unitFlag last-wins logic) and df --kibibytes is still rejected. Added assertions in TestDfHelp to pin the new format and prevent regression: stdout must contain -k , must NOT contain -k, --, and must NOT contain --kibibytes.

Codex P2: pflag.PrintDefaults rendered the shorthand-only -k flag as
"-k, --                           use 1024-byte blocks (POSIX default)"
— treating the empty long name as a literal "--" string. That
advertised "--" as if it were a usable long option, which is wrong.

Fix: mark the -k Flag.Hidden = true so PrintDefaults skips it, and
append a manual "  -k                               use 1024-byte
blocks (POSIX default)" line in printHelp. Flag parsing is unchanged
— -k still overrides earlier -h/-H per the unitFlag last-wins logic,
and --kibibytes is still rejected as unknown.

TestDfHelp now asserts the new format: stdout must contain `-k `,
must NOT contain `-k, --`, and must NOT contain `--kibibytes`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbc8201121

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go Outdated
// -i, -h, or -H reverts to the default aligned column layout
// even though the POSIX header names (e.g. "Capacity") may stay.
human := mode == unitsHuman1024 || mode == unitsHuman1000
posixLayout := posix && !withType && !inodeMode && !human
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve GNU-aligned layout for df -P

When -P is the only format flag, this selects the printRows POSIX branch, which emits fields via strings.Join(..., " "); that makes rshell df -P single-space-separated. GNU coreutils documents -P as like the default format except one-line filesystem rows and POSIX header labels (https://www.gnu.org/software/coreutils/df), and the local GNU df -P output still pads/aligned columns, so bash-comparison scenarios or scripts expecting GNU-compatible df -P formatting will diverge even though this shell is supposed to match bash unless intentionally different.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified gdf -P / | head -1 | od -c shows multi-space padding (e.g. 5 spaces between Filesystem and 1024-blocks), not the strict single-space POSIX-spec layout. The GNU manual's -P only documents "one-line filesystem rows + POSIX header labels" — column spacing stays GNU-default aligned.

Dropped the single-space POSIX branch in printRows entirely; -P now goes through the same aligned-column path as every other format. -P still controls the header names (1024-blocks, Capacity) but no longer the row spacing. The unused posix/posixLayout parameter and its surrounding logic were removed; printRows is simpler.

Smoke verified df -P now matches gdf -P:

Filesystem     1024-blocks      Used Available Capacity Mounted on
/dev/disk3s1s1   971350180 539042388 432307792      56% /

Updated TestDfPosix and TestGNUCompatHeaderPosix to assert header words appear in order rather than byte-equality (since spacing now adapts to the longest filesystem name). Replaced TestGNUCompatPosixSingleSpace with TestGNUCompatPosixNoTabs — the no-tabs invariant is the only spacing-related thing that's actually true of GNU's -P.

Codex P2: gdf -P uses GNU's aligned column layout, not the strict
POSIX-spec single-space format. The GNU manual documents -P as
"one-line filesystem rows + POSIX header labels" — spacing stays the
same as the default format, only the column *labels* change
("1024-blocks", "Capacity"). Verified empirically: gdf -P / | od -c
shows ~5 spaces between Filesystem and 1024-blocks.

Removed the single-space branch in printRows entirely. -P now goes
through the same aligned-column path as every other format. The
posixLayout flag and its surrounding logic in writeOutput were dropped;
printRows lost a parameter.

Tests:
- TestDfPosix and TestGNUCompatHeaderPosix now assert header words
  appear in order rather than byte-equality, since column widths now
  adapt to the longest filesystem name.
- TestGNUCompatPosixSingleSpace renamed to TestGNUCompatPosixNoTabs
  and rewritten — the no-tabs invariant is the only spacing claim
  that's actually true of GNU's -P output.
- strings.Join removed from df's symbol allowlist (no longer used
  after the single-space path went away).

Smoke verified: df -P now matches gdf -P byte-for-byte:
  Filesystem     1024-blocks      Used Available Capacity Mounted on

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@julesmcrt
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 890e69547a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread builtins/df/df.go Outdated
// answers. Ceiling is computed as floor-plus-remainder-bump (rather than
// `(num + denom - 1) / denom`) because num can itself sit near MaxUint64.
func percentUsed(used, available uint64) string {
denom := saturatingAdd(used, available)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid saturating the percentage denominator

When --total accumulates very large filesystems, totalU and totalA can each saturate near MaxUint64; percentUsed then clamps used + available to MaxUint64 before scaling, so equal used/free totals such as two 50%-used huge mounts are reported as 100% instead of about 50%. The overflow guard needs to scale used and available together or otherwise compute the denominator without first losing the overflow information.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — verified the regression: percentUsed(MaxUint64, MaxUint64) was returning "100%" instead of "50%" because saturatingAdd clamped the denominator to MaxUint64 and lost the relative magnitudes. Same misreport happens for any --total accumulation that saturates both totalU and totalA.

Two-step scaling now:

  1. If used + available would wrap (used > ^uint64(0) - available), halve both inputs first. The percentage is invariant under scaling both sides equally, so we lose at most 1 bit of precision — far below the 1% rounding tolerance.
  2. The existing inner loop still shifts used and denom together to keep used*100 from overflowing.

Test additions:

  • {MaxU, MaxU} → "50%" (Codex's example).
  • {MaxU, MaxU/2} → "67%" (used:available = 2:1).

Both pass; full TestPercentUsed and TestPercentUsed_NoDivByZero still green.

Codex P2: when --total accumulates very large filesystems, totalU and
totalA can each saturate to MaxUint64. The previous percentUsed used
saturatingAdd to compute denom, which clamped to MaxUint64 and lost
the relative magnitudes — equal totals like (MaxU, MaxU) reported
"100%" instead of the true "50%".

Two-step scaling:
  1. If used + available would wrap (used > ^uint64(0) - available),
     halve both before summing. The percentage is invariant under
     scaling both sides equally, so at most 1 bit of precision is
     lost — far below the 1% rounding tolerance.
  2. The existing inner loop still shifts used and denom together to
     keep used*100 from overflowing.

TestPercentUsed gains two cases pinning the new behaviour:
  {MaxU, MaxU}      → "50%"
  {MaxU, MaxU/2}    → "67%"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

verified/analysis Human-reviewed static analysis changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant