Hi all, I am trying to upgrade from Graphiti 1.7.9 to 1.10.2, but I am seeing potential performance issues in the NewRelic graphs for one of our primary services. I tried this upgrade a couple of months ago and reverted it in production because of this behavior. The attached graphs are from our staging server, which has little traffic compared to production, but the behavior is the same. At about 3:58 in the graphs, the dynos running Graphiti 1.10.2 were rolled back to 1.7.9. As you can see, throughput went back up, and both object allocations and GC frequency dropped.
Graphs for 1.10.2 back to 1.7.9
I did an analysis with Claude Code and was able to generate a patch that seems to fix the issue. Here is the analysis:
CLAUDE CODE
Significant per-request allocation regression when concurrency = false (1.7.9 → 1.10.2)
Summary
Since the Concurrent::Promises rewrite introduced in 1.8.0 (#472), every request allocates promise machinery — Concurrent::Promises::Future, then_on / zip_futures_on / rescue_on chains, and per-sideload Thread-storage and Fiber-storage Hash snapshots — even when Graphiti.config.concurrency is false (the default).
For applications that don't enable concurrency, this is pure overhead: the allocations exist solely to drive a thread pool that's intentionally configured as synchronous (max_threads: 0, synchronous: true). On a Rails app under production load, this measurably increases GC frequency.
Reproduction
A small Rails JSON:API service hitting /v1/some_endpoint?include=some_object (one sideload, two records) with stock config (concurrency = false):
| Configuration | Total allocations / request | concurrent-ruby memory / 20 reqs |
| --- | --- | --- |
| graphiti 1.7.9 | 10,369 | 6.40 kB |
| graphiti 1.10.2 (stock) | 10,666 (+297, +2.9%) | 309.60 kB |
Per-request, per-sideload, the new code in Scope#future_with_context (lib/graphiti/scope.rb:137-161) unconditionally allocates:
- a Hash snapshot of every Thread.current.keys entry
- a Hash snapshot of every Fiber.current.storage entry (added in 1.8.2 / #497)
- a Concurrent::Promises.future_on(...) future + closure
- nested with_thread_locals / with_fiber_locals tracker arrays
- a Rails.application.executor.wrap block
Plus, per request, Scope#future_resolve and future_resolve_sideloads add a then_on / zip_futures_on / rescue_on chain. None of this work is useful when concurrency is off — GLOBAL_THREAD_POOL_EXECUTOR is already configured as synchronous: true, max_threads: 0.
The cost scales linearly with sideload count per request — a heavier endpoint with many ?include= relationships pays the multiplier.
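The linear scaling can be seen in isolation with a stdlib-only sketch. It assumes a hypothetical `snapshot_thread_locals` helper that copies every `Thread.current` key into a new Hash, loosely imitating the per-sideload snapshot described above (it is not Graphiti's code), and a hypothetical `allocations_during` helper built on `GC.stat`. The absolute numbers depend on the Ruby version and on how many thread-locals are set.

```ruby
# Hypothetical stand-in for the per-sideload snapshot; not Graphiti's implementation.
def snapshot_thread_locals
  Thread.current.keys.each_with_object({}) do |key, snapshot|
    snapshot[key] = Thread.current[key]
  end
end

# Returns the number of objects allocated while the block runs.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

Thread.current[:request_id] = "abc123" # illustrative thread-local
snapshot_thread_locals                 # warm up so first-call costs are excluded

one_sideload  = allocations_during { 1.times { snapshot_thread_locals } }
ten_sideloads = allocations_during { 10.times { snapshot_thread_locals } }

puts "1 snapshot:   #{one_sideload} objects"
puts "10 snapshots: #{ten_sideloads} objects"
```

Each snapshot allocates at least a keys Array and a Hash, so ten simulated sideloads allocate roughly ten times as much as one.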
In production NewRelic data on a Rails 7 service at moderate traffic, this manifests as a visible GC frequency increase between staging (1.10.2) and production (1.7.9).
Proposed solution
Branch on Graphiti.config.concurrency at the public entry points (Scope#resolve, Scope#resolve_sideloads, Sideload#load, PolymorphicBelongsTo#resolve) and dispatch to a synchronous path when concurrency is disabled. The synchronous path mirrors 1.7.9 semantics — no Future allocation, no thread/fiber storage snapshots — and reuses 1.10.2 helpers (broadcast_data, assign_serializer, before_resolve, after_resolve callbacks).
The future_* methods remain unchanged; users with concurrency = true get the existing 1.8+ behavior unmodified.
Sketch of the change in Scope:

    def resolve(&block)
      return sync_resolve(&block) unless Graphiti.config.concurrency
      future_resolve.value!
    end

    def resolve_sideloads(results)
      return sync_resolve_sideloads(results) unless Graphiti.config.concurrency
      future_resolve_sideloads(results).value!
    end

    private

    # Synchronous path: no Future, no thread/fiber storage snapshots.
    def sync_resolve
      return [] if @query.zero_results?
      resolved = broadcast_data { |payload|
        @object = @resource.before_resolve(@object, @query)
        payload[:results] = @resource.resolve(@object)
        payload[:results]
      }
      resolved.compact!
      assign_serializer(resolved)
      yield resolved if block_given?
      @opts[:after_resolve]&.call(resolved)
      sync_resolve_sideloads(resolved) unless @query.sideloads.empty?
      resolved
    end

    # Resolves each requested sideload in turn on the calling thread.
    def sync_resolve_sideloads(results)
      return if results == []
      @query.sideloads.each_pair do |name, q|
        sideload = @resource.class.sideload(name)
        next if sideload.nil? || sideload.shared_remote?
        Graphiti.config.before_sideload&.call(Graphiti.context)
        sideload.resolve(results, q, @resource)
      end
    end

Sideload#load short-circuits to build_resource_proxy(...).to_a, and Sideload#resolve mirrors future_resolve against the sync Scope#resolve. PolymorphicBelongsTo#resolve does the same recursive group-by without Concurrent::Promises.zip.
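To make the dispatch pattern concrete outside of Graphiti, here is a toy, stdlib-only sketch. `ToyScope` and `ToyDeferred` are hypothetical names invented for this illustration; `ToyDeferred` only stands in for a promise object and is not concurrent-ruby's API. The point is that the public entry point branches once, and the synchronous path never builds the deferred wrapper.

```ruby
# Stand-in for a promise-like wrapper; NOT concurrent-ruby's API.
class ToyDeferred
  def initialize(&work)
    @work = work
  end

  # Runs the wrapped work and returns its result.
  def value!
    @work.call
  end
end

class ToyScope
  attr_reader :deferred_built

  def initialize(records, concurrency:)
    @records = records
    @concurrency = concurrency
    @deferred_built = 0 # counts how many wrappers this scope allocated
  end

  # Public entry point: branch once on the flag, then delegate.
  def resolve
    return sync_resolve unless @concurrency
    future_resolve.value!
  end

  private

  # Synchronous path: plain method call, no wrapper object.
  def sync_resolve
    @records.compact
  end

  # Deferred path: same work, wrapped in an extra object and closure.
  def future_resolve
    @deferred_built += 1
    ToyDeferred.new { @records.compact }
  end
end

sync  = ToyScope.new([1, nil, 2], concurrency: false)
async = ToyScope.new([1, nil, 2], concurrency: true)

sync.resolve  # => [1, 2]
async.resolve # => [1, 2]
```

Both paths return the same records; only the scope created with `concurrency: true` ever builds a `ToyDeferred`.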
Verification
Local fork against the same Rails app:
| Configuration | Total allocations / request | concurrent-ruby memory / 20 reqs |
| --- | --- | --- |
| graphiti 1.7.9 | 10,369 | 6.40 kB |
| graphiti 1.10.2 stock | 10,666 (+297) | 309.60 kB |
| graphiti 1.10.2 + sync-dispatch fork | 10,365 (-4) | 6.40 kB |
Allocation profile is statistically indistinguishable from 1.7.9 with concurrency off; concurrent-ruby returns to the 6.40 kB baseline (the executor-delay constant, never resolved).
Test results on the fork:
- graphiti's own suite: 1372 examples, 0 failures (existing concurrency-off scope_specs needed updates from expect(sideload).to receive(:future_resolve) to expect(sideload).to receive(:resolve) to reflect the new dispatch).
- Consuming Rails 7 app: 1163 examples, 0 failures, 100% line coverage.
Notes / open questions
- The concurrency = true path is untouched; this is purely additive for the off path. Existing users with concurrency enabled see no behavior change.
- The 1.8.2 motivation for snapshotting Thread/Fiber storage (#497 "prevent context loss") still holds for the concurrent path. With concurrency off, no thread hop occurs, so there's nothing to preserve.
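
The thread-hop point can be checked with plain Ruby. This is a minimal sketch using only `Thread`; the `:graphiti_context` key is a hypothetical name chosen for illustration, not one Graphiti is known to use.

```ruby
Thread.current[:graphiti_context] = "request-42" # hypothetical key

# Same thread: the value is still visible, so there is nothing to restore.
same_thread = Thread.current[:graphiti_context]

# Thread hop: a new thread does not inherit the caller's thread-locals.
hopped = Thread.new { Thread.current[:graphiti_context] }.value

# Copying a snapshot into the new thread is what preserves the context.
snapshot = Thread.current.keys.to_h { |key| [key, Thread.current[key]] }
restored = Thread.new do
  snapshot.each { |key, value| Thread.current[key] = value }
  Thread.current[:graphiti_context]
end.value

puts same_thread.inspect # "request-42"
puts hopped.inspect      # nil
puts restored.inspect    # "request-42"
```

Only the second case loses the value, which is why a snapshot is needed on the concurrent path and is wasted work when everything runs on the calling thread.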
Environment
- Ruby 4.0.3
- Rails 7.2.3
- graphiti 1.10.2
- concurrent-ruby 1.3.6
- jsonapi-renderer 0.2.2
- dry-types 1.9.1
- graphiti_errors 1.1.2