Skip to content

[iris] Return raw memray .bin traces and enable native tracking#4415

Merged
rjpower merged 1 commit intomainfrom
claude/sharp-meninsky
Apr 4, 2026
Merged

[iris] Return raw memray .bin traces and enable native tracking#4415
rjpower merged 1 commit intomainfrom
claude/sharp-meninsky

Conversation

@rjpower
Copy link
Copy Markdown
Collaborator

@rjpower rjpower commented Apr 4, 2026

Add RAW format to MemoryProfile that returns the .bin trace file directly,
skipping the server-side transform step. Users run any memray reporter
locally (flamegraph, tree, stats, speedscope). Dashboard MEM button now
downloads .bin by default. Also enables --native on all memray attach
calls so C/Rust allocations (e.g. HuggingFace tokenizers) are visible.

Add RAW format to MemoryProfile that returns the .bin trace file directly,
skipping the server-side transform step. Users run any memray reporter
locally (flamegraph, tree, stats, speedscope). Dashboard MEM button now
downloads .bin by default. Also enables --native on all memray attach
calls so C/Rust allocations (e.g. HuggingFace tokenizers) are visible.
@rjpower rjpower added the agent-generated Created by automation/agent label Apr 4, 2026
@rjpower rjpower requested a review from yonromai April 4, 2026 03:50
@rjpower rjpower enabled auto-merge (squash) April 4, 2026 03:50
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 4, 2026

Claude finished @rjpower's task in 1m 35s —— View job


PR Review: [iris] Return raw memray .bin traces and enable native tracking

  • Gather context and read changed files
  • Review code changes
  • Post review feedback

Clean, well-structured PR. The RAW format addition follows the existing pattern nicely, and the early-return approach in all three runtimes (docker, process, k8s) is the right way to skip the transform step. A few items to consider:

Minor Issues

  1. MemoryProfileSpec.reporter docstring is stale (profile.py:59): The inline comment says "flamegraph", "table", "stats" but should now include "raw".

  2. --native + --aggregate interaction (profile.py:121-123): When leaks=True, the command builds memray attach --native ... --aggregate. Memray's --aggregate mode does not record individual allocations and native tracking with aggregation can be problematic. Worth verifying this combination works as expected, or documenting that --native is ignored when --aggregate is active.

  3. RAW stub returns empty bytes (process.py:358-359): The fallback stub for RAW returns b"", which is an invalid memray trace file. Other stubs return human-readable error messages. Consider returning a short message like b"memray unavailable in local mode\n" for consistency — though I understand an empty file might be preferred since the .bin format is binary and users will try to open it with memray tooling anyway.

Looks Good

  • Early returns in all three runtime paths (docker, process, k8s) correctly skip the transform step for RAW.
  • Cleanup in finally blocks still runs correctly after the early return — read_bytes() completes before finally deletes the temp file.
  • The .bin extension in the dashboard download filename is a nice touch for UX.
  • native_traces=True in the in-process memray.Tracker call (profile.py:224) is consistent with the --native flag on the attach command.
  • Tests cover the new format, is_raw property, and --native flag presence.
  • Proto enum value RAW = 4 avoids collision with existing values.

Overall this is a clean change. The stale docstring is the only thing I'd fix before merging.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 615f1db126

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

cluster_pb2.MemoryProfile.FLAMEGRAPH: ("flamegraph", "html"),
cluster_pb2.MemoryProfile.TABLE: ("table", "txt"),
cluster_pb2.MemoryProfile.STATS: ("stats", "json"),
cluster_pb2.MemoryProfile.RAW: ("raw", "bin"),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Badge Regenerate protobuf stubs before using MemoryProfile.RAW

This new enum reference is currently unusable because the generated bindings were not updated with the proto change: cluster_pb2.MemoryProfile in lib/iris/src/iris/rpc/cluster_pb2.py still has only FLAMEGRAPH, TABLE, and STATS. As a result, importing iris.cluster.runtime.profile now raises AttributeError: RAW, which breaks any path that imports runtime profiling (including controller/test startup). Please regenerate and commit the protobuf outputs alongside cluster.proto.

Useful? React with 👍 / 👎.

function buildProfileType(profilerType: ProfilerType): Record<string, unknown> {
if (profilerType === 'cpu') return { cpu: { format: 'SPEEDSCOPE' } }
if (profilerType === 'memory') return { memory: { format: 'FLAMEGRAPH' } }
if (profilerType === 'memory') return { memory: { format: 'RAW' } }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Decode raw profile bytes before constructing download Blob

Changing memory profiling to RAW makes profileData a binary memray trace, but the download path still turns base64 into a JS string (atob) and then builds Blob([decoded]). That string is UTF-8 encoded by Blob, which corrupts non-ASCII bytes, so the downloaded .bin can be invalid for memray reporters. Convert base64 to a Uint8Array (or otherwise keep bytes binary-safe) before creating the Blob in this flow and the mirrored JobDetail handler.

Useful? React with 👍 / 👎.

@rjpower rjpower merged commit 2e3ab86 into main Apr 4, 2026
50 of 51 checks passed
@rjpower rjpower deleted the claude/sharp-meninsky branch April 4, 2026 16:00
@rjpower
Copy link
Copy Markdown
Collaborator Author

rjpower commented Apr 4, 2026

@claude can you check this PR comment and make a fix PR as needed:

Changing memory profiling to RAW makes profileData a binary memray trace, but the download path still turns base64 into a JS string (atob) and then builds Blob([decoded]). That string is UTF-8 encoded by Blob, which corrupts non-ASCII bytes, so the downloaded .bin can be invalid for memray reporters. Convert base64 to a Uint8Array (or otherwise keep bytes binary-safe) before creating the Blob in this flow and the mirrored JobDetail handler.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 4, 2026

Claude finished @rjpower's task in 2m 44s —— View jobclaude/pr-4415-20260404-1605Create PR ➔


Fix binary-safe base64 decoding for RAW memray downloads

  • Read affected files (useProfileAction.ts and JobDetail.vue)
  • Implement binary-safe base64→Uint8Array conversion
  • Run pre-commit checks — all passed
  • Commit and push fix

Summary: The atob()new Blob([string]) pattern corrupts binary data because Blob UTF-8 encodes the string, mangling any byte > 127. Fixed both download paths to convert base64 → Uint8Array via charCodeAt(), keeping raw bytes intact.

Changes:

  • useProfileAction.ts: Extracted a base64ToBytes() helper; binary downloads (cpu/memory) now use Uint8Array, while thread dumps (text) still use atob() for string display.
  • JobDetail.vue: Inline Uint8Array conversion replacing the atob string path.

Testing: Pre-commit checks pass. No backend Python changes, so no pytest needed.

rjpower added a commit that referenced this pull request Apr 4, 2026
Fix binary data corruption in memray .bin downloads. The download path
used `atob()` → `Blob([string])`, which UTF-8 encodes the string and
corrupts non-ASCII bytes. Now uses `Uint8Array` to preserve raw bytes.

Followup to #4415.
Helw150 pushed a commit that referenced this pull request Apr 8, 2026
Add RAW format to MemoryProfile that returns the .bin trace file
directly,
skipping the server-side transform step. Users run any memray reporter
locally (flamegraph, tree, stats, speedscope). Dashboard MEM button now
downloads .bin by default. Also enables --native on all memray attach
calls so C/Rust allocations (e.g. HuggingFace tokenizers) are visible.
Helw150 pushed a commit that referenced this pull request Apr 8, 2026
Fix binary data corruption in memray .bin downloads. The download path
used `atob()` → `Blob([string])`, which UTF-8 encodes the string and
corrupts non-ASCII bytes. Now uses `Uint8Array` to preserve raw bytes.

Followup to #4415.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-generated Created by automation/agent

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants