feat: zero-copy string access via ValueView and allocation-reuse APIs by bartlomieju · Pull Request #1927 · denoland/rusty_v8

bartlomieju · 2026-03-11T22:02:29Z

Summary

- Add `ValueView::as_str()`: true zero-copy `&str` for ASCII strings (no alloc, no copy)
- Add `ValueView::to_cow_lossy()`: zero-copy `Cow::Borrowed` for ASCII, transcoded `Cow::Owned` for Latin-1/UTF-16
- Add `String::write_utf8_into()`: write UTF-8 into an existing `String`, reusing its allocation
- Add public `latin1_to_utf8()`: SIMD-friendly Latin-1→UTF-8 transcoder (8-byte bulk processing)
- Rewrite `to_rust_cow_lossy()` to use `ValueView` internally, which eliminates the `utf8_length` pre-scan

Analysis

Problem

Currently, accessing V8 string contents from Rust almost always requires a memory allocation + copy. The main methods are:

| Method | Allocates? | Copies? | Returns |
| --- | --- | --- | --- |
| `ValueView::data()` | No | No | `&[u8]` or `&[u16]` (raw encoding) |
| `to_rust_string_lossy()` | Always | Always | `String` |
| `to_rust_cow_lossy()` | Sometimes | Always (writes to buffer) | `Cow<str>` |

`to_rust_string_lossy` always calls `alloc::alloc()` and copies. Before this PR, `to_rust_cow_lossy` made two passes over the string data (a `utf8_length` pre-scan followed by `write_*`), and it still copies even when the result borrows from a stack buffer.

In deno_core, the hot path (`runtime/ops.rs::to_str()`) uses an 8KB stack buffer with `to_rust_cow_lossy`. This avoids heap allocation for small strings, but it still requires a copy and, before this PR, two passes over the data.

How ValueView changes the game

ValueView (V8's v8::String::ValueView) flattens the string once and gives a direct pointer into V8's heap. For one-byte ASCII strings (the vast majority in practice — identifiers, property names, URLs, JSON keys), the bytes are already valid UTF-8, enabling true zero-copy access.

New APIs

ValueView::as_str() -> Option<&str> — Zero-copy for ASCII one-byte strings. Returns None for Latin-1 non-ASCII or two-byte strings.
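The idea behind `as_str()` can be illustrated without V8 at all. The sketch below is not the rusty_v8 implementation; `ascii_as_str` is a hypothetical helper that takes a plain byte slice standing in for one-byte string data and shows why the ASCII case needs no copy: the bytes are already valid UTF-8, so the returned `&str` points at the same memory.

```rust
/// Hypothetical stand-in for the ASCII fast path: borrow one-byte string
/// data as `&str` only when every byte is ASCII.
/// ASCII bytes are already valid UTF-8, so no copy or allocation is needed.
fn ascii_as_str(one_byte: &[u8]) -> Option<&str> {
    if one_byte.is_ascii() {
        // Cannot fail for pure ASCII input; `.ok()` keeps the sketch panic-free.
        std::str::from_utf8(one_byte).ok()
    } else {
        // Latin-1 bytes >= 0x80 would have to be transcoded first.
        None
    }
}
```

In this sketch the returned slice shares its pointer with the input, which is what "zero-copy" means here; any input containing a byte at or above `0x80` yields `None`.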

ValueView::to_cow_lossy() -> Cow<'_, str> — Zero-copy Borrowed for ASCII, single-pass transcode for everything else:

- ASCII one-byte → `Cow::Borrowed(&str)`: no alloc, no copy
- Latin-1 non-ASCII → `Cow::Owned` via `latin1_to_utf8` (one pass, SIMD-friendly)
- Two-byte (UTF-16) → `Cow::Owned` via `from_utf16_lossy` (one pass)
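The three cases above can be modelled in plain Rust. This is a sketch under stated assumptions, not the PR's code: `ViewData` and `to_cow_lossy_sketch` are hypothetical names, and the Latin-1 branch uses a simple per-byte `char::from` conversion rather than the bulk `latin1_to_utf8` routine.

```rust
use std::borrow::Cow;

/// Hypothetical model of the two encodings a string view can expose.
#[derive(Clone, Copy)]
enum ViewData<'a> {
    OneByte(&'a [u8]),
    TwoByte(&'a [u16]),
}

/// Borrow when the data is ASCII, otherwise transcode in a single pass.
fn to_cow_lossy_sketch<'a>(data: ViewData<'a>) -> Cow<'a, str> {
    match data {
        // ASCII is valid UTF-8 as-is: borrow, no allocation.
        ViewData::OneByte(bytes) if bytes.is_ascii() => {
            Cow::Borrowed(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"))
        }
        // Latin-1 byte values equal their Unicode code points.
        ViewData::OneByte(bytes) => {
            Cow::Owned(bytes.iter().map(|&b| char::from(b)).collect::<String>())
        }
        // Unpaired surrogates become U+FFFD, hence "lossy".
        ViewData::TwoByte(units) => Cow::Owned(String::from_utf16_lossy(units)),
    }
}
```

Callers that only read the string can treat both variants uniformly through `Deref<Target = str>`; only the non-ASCII paths pay for an allocation.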

String::write_utf8_into(&self, scope, buf: &mut String) — Clears and fills an existing String, reusing its heap allocation. Enables patterns like thread-local reusable buffers with zero malloc after warmup.
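To make the allocation-reuse idea concrete, here is a minimal std-only sketch. `write_utf16_into` is a hypothetical function, not the rusty_v8 method: it takes UTF-16 code units directly instead of a V8 string and scope, but follows the same clear-then-fill contract.

```rust
/// Hypothetical analogue of `write_utf8_into`: clear `buf`, then append the
/// UTF-8 form of `src`. `String::clear` keeps the capacity, so a buffer that
/// is already large enough is refilled without calling the allocator.
fn write_utf16_into(src: &[u16], buf: &mut String) {
    buf.clear();
    // Lower bound only: each UTF-16 unit needs at least one UTF-8 byte.
    buf.reserve(src.len());
    buf.extend(
        char::decode_utf16(src.iter().copied())
            .map(|unit| unit.unwrap_or(char::REPLACEMENT_CHARACTER)),
    );
}
```

Once the buffer has grown to fit the largest string seen so far, repeated calls with shorter inputs leave its heap pointer and capacity unchanged.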

latin1_to_utf8(len, inbuf, outbuf) -> usize — Public utility for Latin-1→UTF-8 transcoding. Processes 8 bytes at a time with a single bitmask check (& 0x8080_8080_8080_8080), bulk-copying ASCII chunks and expanding non-ASCII bytes to 2-byte UTF-8 sequences. Previously this logic was duplicated in deno_core.
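The 8-bytes-at-a-time technique can be sketched in safe Rust as follows. This is an illustration of the approach described above, not the function added by the PR: the name `latin1_to_utf8_sketch`, the helper `push_latin1`, and the slice/`Vec` signature are assumptions made for the example (the real function takes `len, inbuf, outbuf`).

```rust
/// Safe, slice-based sketch of a Latin-1 to UTF-8 transcoder.
/// Appends to `out` and returns the number of bytes written.
fn latin1_to_utf8_sketch(input: &[u8], out: &mut Vec<u8>) -> usize {
    const HIGH_BITS: u64 = 0x8080_8080_8080_8080;
    let start = out.len();
    out.reserve(input.len());

    let mut chunks = input.chunks_exact(8);
    for chunk in &mut chunks {
        let mut word_bytes = [0u8; 8];
        word_bytes.copy_from_slice(chunk);
        let word = u64::from_ne_bytes(word_bytes);
        if (word & HIGH_BITS) == 0 {
            // No byte has its top bit set: all ASCII, copy the chunk in bulk.
            out.extend_from_slice(chunk);
        } else {
            for &b in chunk {
                push_latin1(b, out);
            }
        }
    }
    // Fewer than 8 trailing bytes are handled one at a time.
    for &b in chunks.remainder() {
        push_latin1(b, out);
    }
    out.len() - start
}

/// ASCII passes through; 0x80..=0xFF expands to a 2-byte UTF-8 sequence.
fn push_latin1(b: u8, out: &mut Vec<u8>) {
    if b < 0x80 {
        out.push(b);
    } else {
        out.push(0xC0 | (b >> 6));
        out.push(0x80 | (b & 0x3F));
    }
}
```

The mask test is byte-order independent because it only asks whether any byte in the word has its high bit set, so `from_ne_bytes` is sufficient.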

`to_rust_cow_lossy` rewrite

The existing to_rust_cow_lossy has been rewritten to use ValueView internally. Instead of calling utf8_length() (FFI pre-scan) + write_utf8_uninit_v2() (FFI copy), it now:

1. Creates a `ValueView` (1 FFI call: flattens the string and returns a direct pointer)
2. Matches on encoding:
   - One-byte ASCII: memcpy into stack buffer (1 pass)
   - One-byte Latin-1: `latin1_to_utf8` into stack buffer (1 pass)
   - Two-byte: direct UTF-16→UTF-8 transcode into stack buffer via `char::decode_utf16` (1 pass, no intermediate allocation)

This reduces the work from 2 FFI calls + 2 passes to 1 FFI call + 1 pass for the common one-byte case.
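The two-byte branch is the least obvious of the three, so here is a rough sketch of transcoding UTF-16 straight into a caller-provided buffer with `char::decode_utf16`. The function name `utf16_to_utf8_into` and the choice to return `None` when the buffer is too small are assumptions for illustration; the PR's actual fallback behaviour is not shown here.

```rust
/// Transcode UTF-16 into `buf` in one pass, with no intermediate `String`.
/// Returns `None` if the UTF-8 form does not fit (a real caller would
/// presumably fall back to a heap allocation at that point).
fn utf16_to_utf8_into<'a>(src: &[u16], buf: &'a mut [u8]) -> Option<&'a str> {
    let mut written = 0;
    for decoded in char::decode_utf16(src.iter().copied()) {
        // Unpaired surrogates are replaced with U+FFFD (lossy conversion).
        let ch = decoded.unwrap_or(char::REPLACEMENT_CHARACTER);
        let len = ch.len_utf8();
        if written + len > buf.len() {
            return None;
        }
        ch.encode_utf8(&mut buf[written..written + len]);
        written += len;
    }
    // Every byte written came from `encode_utf8`, so this is valid UTF-8.
    std::str::from_utf8(&buf[..written]).ok()
}
```

With a stack array such as `[0u8; 64]` as `buf`, the returned `&str` borrows from the stack, which mirrors the "borrow into a stack buffer" case of `to_rust_cow_lossy`.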

Full optimization roadmap

This PR implements items A, B, C, and G. The remaining items (D–F) are follow-up work in deno_core.

| Item | Optimization | Layer | Effort | Impact | Status |
| --- | --- | --- | --- | --- | --- |
| A | `ValueView::as_str/to_cow_lossy` | rusty_v8 | Low | High | ✅ Done |
| B | `String::write_utf8_into(&mut String)` | rusty_v8 | Low | Medium | ✅ Done |
| C | `to_rust_cow_lossy` via ValueView internally | rusty_v8 | Medium | High | ✅ Done |
| D | Thread-local reusable `String` buffer | deno_core | Low | Medium | Follow-up |
| E | Buffer pool for owned strings | deno_core | High | Medium | Follow-up |
| F | Replace `to_str()` with ValueView-based path | deno_core | Low | High | Follow-up |
| G | `latin1_to_utf8` in rusty_v8 | rusty_v8 | Low | Medium | ✅ Done |

Details on follow-up items

D. Thread-local reusable String buffer — The simplest high-impact change for to_string() / to_rust_string_lossy() callers in deno_core. Keep a thread-local String with a warm allocation, use write_utf8_into to fill it, avoiding malloc in steady state.
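One possible shape for item D, sketched with the standard library only. `SCRATCH` and `with_scratch` are hypothetical names, and the `push_str` call stands in for where deno_core would call `write_utf8_into`; none of this is taken from deno_core itself.

```rust
use std::cell::RefCell;

thread_local! {
    // One warm buffer per thread; it grows to the largest string seen.
    static SCRATCH: RefCell<String> = RefCell::new(String::new());
}

/// Fill the thread-local buffer with `src` and lend it to `f` as `&str`.
/// Not reentrant: calling `with_scratch` from inside `f` would panic on the
/// second `borrow_mut`.
fn with_scratch<R>(src: &str, f: impl FnOnce(&str) -> R) -> R {
    SCRATCH.with(|cell| {
        let mut buf = cell.borrow_mut();
        buf.clear(); // keeps the existing capacity
        buf.push_str(src); // stand-in for `write_utf8_into`
        f(buf.as_str())
    })
}
```

The closure-based API keeps the borrowed `&str` from escaping, which is the same restriction that makes this pattern unsuitable when the caller needs an owned `String`.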

E. Buffer pool for owned strings — For cases where ownership is truly needed (the string escapes the current scope), a pool of pre-allocated Vec<u8> buffers. Requires a custom string type that returns to the pool on drop. More invasive but eliminates malloc/free from the hot path entirely.
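A single-threaded sketch of what such a pool could look like. Everything here (`Pool`, `PooledStr`, the `Rc`/`RefCell` plumbing) is hypothetical and only illustrates the "return to the pool on drop" mechanism; a production design would also need to consider size caps and threading.

```rust
use std::cell::RefCell;
use std::ops::Deref;
use std::rc::Rc;

/// Hypothetical pool of reusable byte buffers.
#[derive(Default)]
struct Pool {
    free: RefCell<Vec<Vec<u8>>>,
}

/// Owned string backed by a pooled buffer; hands the buffer back on drop.
struct PooledStr {
    bytes: Vec<u8>,
    pool: Rc<Pool>,
}

impl Pool {
    /// Take a buffer from the pool (or a fresh empty one) and fill it.
    fn take(pool: &Rc<Pool>, s: &str) -> PooledStr {
        let mut bytes = pool.free.borrow_mut().pop().unwrap_or_default();
        bytes.clear();
        bytes.extend_from_slice(s.as_bytes());
        PooledStr { bytes, pool: Rc::clone(pool) }
    }
}

impl Deref for PooledStr {
    type Target = str;
    fn deref(&self) -> &str {
        // Buffers are only ever filled from a `&str`, so they hold UTF-8.
        std::str::from_utf8(&self.bytes).expect("pool buffers hold UTF-8")
    }
}

impl Drop for PooledStr {
    fn drop(&mut self) {
        // Move the allocation back instead of freeing it.
        let bytes = std::mem::take(&mut self.bytes);
        self.pool.free.borrow_mut().push(bytes);
    }
}
```

After the first value is dropped, the next `Pool::take` reuses its allocation as long as the new contents fit in the existing capacity.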

F. to_str() via ValueView in deno_core — The current to_str() in runtime/ops.rs could use ValueView + to_cow_lossy() instead of to_rust_cow_lossy with an 8KB stack buffer. This eliminates the stack buffer for ASCII strings (the common case) and removes the utf8_length pre-scan for all strings. Caveat: ValueView borrows &mut Isolate, so the returned Cow can't outlive the view — works for #[string] s: &str but not #[string] s: String.

…APIs Add ValueView::as_str() for true zero-copy &str access to ASCII strings, ValueView::to_cow_lossy() for zero-copy-when-possible string conversion, String::write_utf8_into() for allocation reuse, and a public latin1_to_utf8 SIMD-friendly transcoder utility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Eliminates the utf8_length pre-scan by using ValueView for direct access to string contents. For one-byte strings this reduces from 2 FFI calls + 2 passes to 1 FFI call + 1 pass. Latin-1 transcoding uses latin1_to_utf8, and two-byte strings are transcoded directly into the stack buffer without an intermediate allocation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ow checker The nightly Rust compiler correctly rejects borrowing a by-value Local<'s, String> parameter since &*string creates a reference to the stack-local copy that is dropped at end of function. We recover the 's lifetime via pointer cast, which is safe because Local<'s, _> guarantees the V8 string is rooted in a HandleScope that lives for at least 's.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kajukitli

lgtm

the zero-copy ValueView::as_str() / to_cow_lossy() additions make sense, and rewriting to_rust_cow_lossy() to use ValueView internally is the right optimization. the public latin1_to_utf8() helper also seems reasonable since the logic was already duplicated downstream.

one minor concern: write_utf8_into() still calls utf8_length() up front, so it's not fully on the one-pass path. that's probably fine because it needs to reserve capacity and reuse the existing String, but worth keeping in mind if the goal is to squeeze every last FFI call out of the hot path.

Rewrite write_utf8_into to use ValueView internally, eliminating the separate utf8_length() FFI call. Now matches the same single-pass pattern used by to_rust_cow_lossy. The signature changes from &Isolate to &mut Isolate to satisfy ValueView's requirements; all existing callers pass &mut HandleScope which derefs to &mut Isolate, so this is source-compatible.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

bartlomieju and others added 3 commits March 11, 2026 23:01

bartlomieju requested review from devsnek, littledivy and nathanwhit March 11, 2026 22:45

kajukitli approved these changes Mar 11, 2026

bartlomieju wants to merge 4 commits into main from feat/string-view-optimizations
