Skip to content

fix: parse uid/gid as octal numbers per POSIX tar spec#62

Open
guoyangzhen wants to merge 1 commit into
unjs:mainfrom
guoyangzhen:fix/uid-gid-octal
Open

fix: parse uid/gid as octal numbers per POSIX tar spec#62
guoyangzhen wants to merge 1 commit into
unjs:mainfrom
guoyangzhen:fix/uid-gid-octal

Conversation

@guoyangzhen
Copy link
Copy Markdown

@guoyangzhen guoyangzhen commented Mar 21, 2026

Fixes #40

Problem

The POSIX tar specification stores uid and gid fields as octal ASCII strings. The code was using Number.parseInt() which defaults to base 10, causing values like 01751 (decimal 1001) to be parsed as 1751.

Fix

Replaced Number.parseInt(_readString(...)) with _readNumber(...) for both uid and gid fields. _readNumber already correctly parses with radix 8, matching the behavior used for size and mtime fields.

- const uid = Number.parseInt(_readString(buffer, offset + 108, 8));
+ const uid = _readNumber(buffer, offset + 108, 8);

- const gid = Number.parseInt(_readString(buffer, offset + 116, 8));
+ const gid = _readNumber(buffer, offset + 116, 8);

Summary by CodeRabbit

  • Bug Fixes
    • Improved accuracy of uid/gid field parsing in TAR archives.

Fixes unjs#40

The tar specification stores uid and gid as octal ASCII strings.
The code was using Number.parseInt() which defaults to base 10,
causing values like 0o1751 (decimal 1001) to be parsed as 1751.

Use _readNumber() which correctly parses with radix 8, matching
the behavior already used for size and mtime fields.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 21, 2026

📝 Walkthrough

Walkthrough

The TAR file parser's handling of uid and gid header fields was corrected to use octal byte parsing instead of string-based integer conversion, aligning with the TAR file format specification that requires these fields to be interpreted as octal values.

Changes

Cohort / File(s) Summary
TAR Header Field Parsing
src/parse.ts
Fixed uid and gid field parsing to use _readNumber() for direct octal byte interpretation instead of Number.parseInt() string conversion, resolving incorrect numeric values.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 Whiskers twitching with delight,
Octal numbers parsed just right!
No more tangled string conversions made,
UID and GID in bytes, perfectly displayed!
Off by octal? Not anymore, we're clear,
The tar files hop with accuracy here!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: fixing uid/gid parsing to use octal numbers per POSIX tar spec, which directly addresses the changeset.
Linked Issues check ✅ Passed The code changes directly implement the fix requested in issue #40 by replacing Number.parseInt with _readNumber to parse uid/gid fields with radix 8.
Out of Scope Changes check ✅ Passed All changes are directly related to the uid/gid parsing fix specified in issue #40; no unrelated modifications are present.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/parse.ts (1)

57-60: Please add a regression test for octal uid/gid decoding.

This bug is easy to reintroduce; a fixture/assertion for values like 01751 (expected decimal 1001) would lock the behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/parse.ts` around lines 57 - 60, Add a regression test that constructs a
tar header buffer (or uses an existing fixture) with octal UID/GID strings like
"01751" and asserts that the parser decodes them to decimal 1001; specifically
exercise the code paths that call _readNumber so the uid and gid fields produced
by the parser (the uid and gid values derived from _readNumber) are compared
against expected numeric values. Ensure the test covers both uid and gid
decoding, includes leading/trailing NULs or spaces typical of tar headers, and
fails if _readNumber returns the octal string instead of the decimal numeric
value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/parse.ts`:
- Around line 57-60: Add a regression test that constructs a tar header buffer
(or uses an existing fixture) with octal UID/GID strings like "01751" and
asserts that the parser decodes them to decimal 1001; specifically exercise the
code paths that call _readNumber so the uid and gid fields produced by the
parser (the uid and gid values derived from _readNumber) are compared against
expected numeric values. Ensure the test covers both uid and gid decoding,
includes leading/trailing NULs or spaces typical of tar headers, and fails if
_readNumber returns the octal string instead of the decimal numeric value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8258f9b2-ab43-41e2-9a2a-37fc7f351a4d

📥 Commits

Reviewing files that changed from the base of the PR and between c70386b and 18ab4bb.

📒 Files selected for processing (1)
  • src/parse.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Possible bug in uid/gid value

1 participant