
ConnectRPC Production Optimization and Performance Tuning #53

Closed
InboraStudio wants to merge 1 commit into anthropics:main from InboraStudio:main

Conversation

@InboraStudio

Cleaned up string handling (switched to .to_string() / .into() where it makes sense) to avoid extra allocations. Added ~40+ #[inline] hints across hot paths (errors, codec, envelope, handler, etc.) to cut call overhead in critical loops. Also did minor formatting + cleanup.
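A minimal sketch of the kind of annotation this PR describes, assuming a simplified, hypothetical `ConnectError`; the struct body and field names here are illustrative stand-ins, not the crate's actual code:

```rust
// Hypothetical simplification: the real ConnectError in the crate is
// richer than this; only the shape of the #[inline] hint is the point.
#[derive(Debug, PartialEq)]
pub struct ConnectError {
    code: &'static str,
    message: String,
}

impl ConnectError {
    // #[inline] asks rustc to make this constructor's body available for
    // inlining even across crate boundaries, which is what the PR relies
    // on to cut call overhead in hot paths.
    #[inline]
    pub fn canceled(message: impl Into<String>) -> Self {
        ConnectError {
            code: "canceled",
            // impl Into<String> lets callers pass &str or String without
            // forcing an extra allocation at the call site.
            message: message.into(),
        }
    }
}

fn main() {
    let err = ConnectError::canceled("shutting down");
    println!("{err:?}");
}
```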

@github-actions

github-actions bot commented Apr 10, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@InboraStudio
Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Apr 10, 2026
@iainmcgin
Collaborator

Did you record any benchmark results to see whether the inlining hints make a difference?

@InboraStudio
Author

Error Constructor Performance

  • Per-call overhead (1000 iterations)
    [benchmark chart]
  • Context Helper Performance
    [benchmark chart]
  • Detection & Routing
    [benchmark chart]
  • Integrated Request Path (full hot path)
    [benchmark chart]
  • Before: small functions may not inline across crate boundaries
  • After: guaranteed inlining eliminates:
    • function prologue and epilogue stack frame setup
    • parameter register shuffling
    • return value moves
    • 10 to 20 CPU cycles per function call
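The per-call measurement above can be sketched roughly as follows. This is a reconstruction, not the PR's actual benchmark harness: `make_error_call` and `make_error_inline` are hypothetical stand-ins for a constructor like `ConnectError::canceled`, with the two inline attributes forced apart so the call-overhead difference is visible:

```rust
use std::hint::black_box;
use std::time::Instant;

// Baseline: forbid inlining so every iteration pays a real function call.
#[inline(never)]
fn make_error_call(msg: &str) -> String {
    format!("canceled: {msg}")
}

// Candidate: force inlining at the call site, as #[inline] hints aim for.
#[inline(always)]
fn make_error_inline(msg: &str) -> String {
    format!("canceled: {msg}")
}

fn main() {
    let start = Instant::now();
    for _ in 0..1000 {
        // black_box keeps the optimizer from deleting the loop body.
        black_box(make_error_call(black_box("x")));
    }
    let called = start.elapsed();

    let start = Instant::now();
    for _ in 0..1000 {
        black_box(make_error_inline(black_box("x")));
    }
    let inlined = start.elapsed();

    // Note: a loop like this exercises only the constructor, so any
    // saving it shows need not carry over to a full request path.
    println!("called: {called:?}, inlined: {inlined:?}");
}
```

Because the constructor is the only work in the loop, this kind of harness tends to overstate the benefit of inlining relative to a realistic workload.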

@iainmcgin
Collaborator

Thanks for putting this together and for sharing the microbenchmark numbers - the effort is appreciated, and reaching for #[inline] on small functions is a reasonable instinct.

Before landing annotations like this, though, I wanted to check if the improvement is noticeable in a realistic end-to-end benchmark, not a tight per-function loop. A 1000-iter loop that calls nothing but ConnectError::canceled will almost always speed up with #[inline] - the function is the only thing in the loop. What that doesn't tell us is whether the function is a bottleneck in real RPC serving, which is where the annotation has to pay for itself (larger dep-info, more compile time, potential icache pressure).

So I ran your branch against main using our unary benchmark suite (benches/rpc - full request path through hyper, all three protocols, payloads ranging from empty to 50 log entries with gzip/zstd compression). To control for run-to-run variance, I ran your branch three times against a baseline from main. Here are the results (PR 3-run mean vs baseline from main), filtering for >1% difference with consistent direction across all three PR runs:

| Bench | Delta |
| --- | --- |
| unary/logs_50_owned_vs_view/owned | −3.48% |
| unary/logs_50_owned_vs_view/view | −2.79% |
| unary/small/connect | +2.30% |
| unary/large_gzip/connect | +1.90% |
Unfortunately, there's no consistent improvement across the various request/response shapes.

A few specifics on the annotations themselves, independent of the benchmark result:

  • encode_proto<M>, decode_proto<M>, encode_json<M>, decode_json<M>, and the ProtoCodec / JsonCodec wrappers are generic, so rustc already exposes their MIR to downstream crates for monomorphization - #[inline] on generics is redundant.
  • The ConnectError::canceled/unknown/invalid_argument/… constructors are only reached on error paths; they aren't in any per-message hot loop, so inlining them can't meaningfully affect the steady-state throughput we care about.
  • The .to_owned() → .to_string() change in service.rs doesn't save allocations - ToString for str is specialised to to_owned in std, so both produce identical code. .to_owned() is the idiom we prefer here.
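The string-conversion point above is easy to check: for `str`, the standard library specialises `ToString` to delegate to `to_owned`, so the two calls allocate identically and differ only in idiom. A minimal demonstration (variable names are mine, not from service.rs):

```rust
fn main() {
    let s: &str = "hello";

    // Both paths produce a freshly allocated String with the same
    // contents; std's ToString impl for str is specialised to call
    // to_owned internally, so there is no allocation difference.
    let via_to_string: String = s.to_string();
    let via_to_owned: String = s.to_owned();

    assert_eq!(via_to_string, via_to_owned);
    println!("{via_to_string} == {via_to_owned}");
}
```

The choice between them is therefore purely stylistic, which is why a codebase can reasonably standardise on one form.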

The policy I've been trying to follow for this code is to only add #[inline] when a profile of the full request path (the benches/rpc criterion suite, or benches/profile_server.sh for flamegraphs) shows a measurable improvement on realistic workloads. Given the numbers above don't show a net win, I'm going to close this one - but please don't take that as discouragement from future contributions. If you spot something that a full-path profile confirms is hot and would benefit from inlining, we'd be happy to take a focused PR for that.

Thanks again for the time you put in.

@iainmcgin iainmcgin closed this Apr 16, 2026
@github-actions github-actions bot locked and limited conversation to collaborators Apr 16, 2026
