Skip to content

Fix #12007: RLog heap buffer#25411

Open
seifreed wants to merge 3 commits intoradareorg:masterfrom
seifreed:codex/issue-12007-rlog-heap-buffer
Open

Fix #12007: RLog heap buffer#25411
seifreed wants to merge 3 commits intoradareorg:masterfrom
seifreed:codex/issue-12007-rlog-heap-buffer

Conversation

@seifreed
Copy link
Contributor

Work in progress for #12007 (RLog improvements).

This PR addresses the first checklist item: avoid stack-fixed buffer usage in r_log_vmessage.

Changes:

  • Use r_str_newvf() to allocate the formatted log message on the heap, preventing truncation.
  • Add a unit test to ensure long log messages are not truncated.

Refs #12007

@trufae
Copy link
Collaborator

trufae commented Feb 12, 2026

i think its not worth to do this change because it will introduce more weight to the log apis, but i can do some profiling if you just rebase the branch on top of master and remove the unrelated commits to the pr

Work in progress for radareorg#12007.

- Avoid stack-fixed buffer in r_log_vmessage by allocating the formatted message on the heap (r_str_newvf), preventing truncation.
- Add a unit test to ensure long log messages are not truncated.
@seifreed seifreed force-pushed the codex/issue-12007-rlog-heap-buffer branch from fce9493 to ba5c563 Compare February 12, 2026 17:40
@seifreed
Copy link
Contributor Author

I rebased the branch on top of current master and removed unrelated commits (the PR now contains only the RLog change + unit test).

Use a small stack buffer for the common case and only allocate on the heap when the formatted message does not fit, preventing truncation without adding per-log heap overhead.
@seifreed
Copy link
Contributor Author

Rebased on top of current master and dropped unrelated commits (PR now only touches RLog + unit test).

To address the 'more weight' concern: the implementation now uses a small stack buffer for the common case and only falls back to heap allocation when the formatted message doesn't fit. This avoids truncation while keeping the fast path allocation-free; heap is only used for long messages.

@trufae
Copy link
Collaborator

trufae commented Feb 12, 2026

augmented code review with opus 4.6:

PR #25411 Review: RLog heap buffer fallback

PR: #25411
Author: seifreed
Refs: Issue #12007 (opened 2018, "Improvements to the RLog API")


What the PR does

Changes r_log_vmessage() to avoid silently truncating log messages longer than 512 bytes. Currently the function uses a fixed char out[512] stack buffer and vsnprintf(out, 512, fmt, ap), which truncates anything longer without warning.

The PR adds a "try stack, fallback to heap" pattern:

  1. va_copy(ap2, ap) then vsnprintf(out, 512, fmt, ap2) to try the stack buffer
  2. If msglen >= 512, calls r_str_newvf(fmt, ap) to heap-allocate
  3. Uses a msg pointer that targets either out or the heap string
  4. Frees the heap string in two return paths (callback early-return + normal path)
  5. Adds a unit test with a 600-byte message

Performance impact analysis

Fast path (messages < 512 bytes — the vast majority):

  • Added cost: one va_copy (struct copy), one va_end, one integer comparison, one pointer assignment
  • Verdict: negligible. This is dwarfed by what r_log_vmessage already does (string formatting, r_strbuf_new, r_strbuf_appendf ×3-5, fprintf, r_strbuf_drain, free)

Slow path (messages ≥ 512 bytes):

  • Format string is parsed 3 times: once for stack try, once inside r_str_newvf for size measurement (vsnprintf(NULL,0,...)), once for actual heap write
  • Plus a calloc + free
  • Verdict: moderate but irrelevant — messages this long are extremely rare

Overall: merging this would NOT measurably slow anything down. The logging path already does heap allocation (r_strbuf_new, r_strbuf_drain) on every call. The added fast-path cost is a struct copy and a comparison.


Code quality issues

  1. Triple format parsing in slow path. The first vsnprintf writes 512 bytes to the stack buffer that gets thrown away, then r_str_newvf internally does vsnprintf(NULL,0,...) + vsnprintf(heap,...). Could be reduced to 2 passes by measuring first with vsnprintf(NULL, 0, ...) and only then deciding stack vs heap.

  2. Two separate free(msg) sites. The if (msg != out) { free(msg); } appears in both the callback early-return and the normal exit. If someone later adds another return path, they'll likely forget the free. A single cleanup label or restructuring would be safer.

  3. NULL check on r_str_newvf return is correct — unlike R_NEW/R_NEW0, r_str_newvf uses calloc which can return NULL.

  4. va_copy/va_end usage is correctap2 is consumed by first vsnprintf, original ap is preserved for fallback.

  5. Test checks r_core_new() for NULL — reasonable in a test context even though it practically won't fail.


Do we really need this change?

Arguments FOR:

  • Fixes a real (if rare) silent truncation bug
  • Fast-path overhead is essentially zero
  • The "try stack, fallback to heap" pattern is well-understood and safe
  • Makes the log system correct for all message sizes

Arguments AGAINST:

  • In ~8 years since issue Improvements to the RLog API #12007 was opened, nobody has hit this in practice
  • Log messages > 512 bytes are vanishingly rare in the codebase — all R_LOG_* calls format short strings (addresses, filenames, one-line errors)
  • Simply increasing the buffer to 1024 or 2048 would solve the same problem with zero code complexity and zero overhead
  • Adds code surface area to a function that ideally should be trivial

Verdict: The change is technically correct but over-engineered for the problem it solves. If truncation is a concern, bumping char out[512] to char out[2048] achieves the same practical result with zero overhead, zero new code paths, and zero risk of memory leaks. A 2048-byte log message is already absurdly long — any message exceeding that is almost certainly a programmer error, not something that should be silently heap-allocated.


Suggested alternatives (simplest → most complex)

  1. Just bump the bufferchar out[2048] — zero overhead, zero complexity, solves the realistic cases
  2. Measure-first approach — do vsnprintf(NULL, 0, fmt, ap_copy) first to get the length, then decide stack vs heap. This avoids the wasted 512-byte write in the slow path and reduces format parsing from 3× to 2×.
  3. Accept the PR as-is — it works, the overhead is negligible, but it adds unnecessary complexity

Recommendation: Option 1 is the pragmatic choice. If you want to be "correct" for arbitrarily long messages, Option 2 is cleaner than the current PR.

Replace the stack+heap fallback with a fixed-size stack buffer (2048 bytes) to keep r_log_vmessage simple and low-overhead while covering realistic long messages.
@seifreed
Copy link
Contributor Author

Applied the simplification.

Changes:

  • Removed heap fallback logic from r_log_vmessage.
  • Increased fixed stack buffer from 512 to 2048.

Rationale:

  • Keep the logging path minimal and predictable (no extra allocation path).
  • Cover realistic long log messages while avoiding additional runtime/code complexity.

Validation:

  • make -j
  • test/unit/bin/test_util

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants