Perf issues upgrading from 1.7.9 -> 1.10.2

Hi all, I am trying to upgrade Graphiti from 1.7.9 to 1.10.2, but I'm seeing what look like performance regressions in our NewRelic graphs for one of our primary services. I tried this upgrade a couple of months ago and reverted it in production because of the same behavior. The attached graphs are from our staging server, which gets little traffic compared to production, but the behavior is the same. At about 3:58 in the graphs, the dynos running Graphiti 1.10.2 were rolled back to 1.7.9. As you can see, throughput went back up, and both object allocations and GC frequency dropped.

## Graphs for 1.10.2 back to 1.7.9

<img width="704" height="344" alt="Image" src="https://github.com/user-attachments/assets/95265e43-62b7-4b20-8585-a241053f878b" />

<img width="714" height="340" alt="Image" src="https://github.com/user-attachments/assets/acbd6460-de92-45b7-9f17-38009c0f40b8" />

<img width="701" height="341" alt="Image" src="https://github.com/user-attachments/assets/11ce6f6f-c52c-4f61-a8fb-2515350ceacb" />

I analyzed this with Claude Code and was able to generate a patch that appears to fix the issue. Here is the analysis:

# CLAUDE CODE

# Significant per-request allocation regression when `concurrency = false` (1.7.9 → 1.10.2)

## Summary

Since the Concurrent::Promises rewrite introduced in 1.8.0 ([#472](https://github.com/graphiti-api/graphiti/pull/472)), every request allocates promise machinery — `Concurrent::Promises::Future`, `then_on` / `zip_futures_on` / `rescue_on` chains, and per-sideload Thread-storage and Fiber-storage Hash snapshots — even when `Graphiti.config.concurrency` is `false` (the default).

For applications that don't enable concurrency, this is pure overhead: the allocations exist solely to drive a thread pool that's intentionally configured as synchronous (`max_threads: 0, synchronous: true`). On a Rails app under production load, this measurably increases GC frequency.

## Reproduction

A small Rails JSON:API service hitting `/v1/some_endpoint?include=some_object` (one sideload, two records) with stock config (`concurrency = false`):

| Configuration | Total allocations / request | concurrent-ruby memory / 20 reqs |
|---|---:|---:|
| graphiti **1.7.9** | 10,369 | 6.40 kB |
| graphiti **1.10.2** (stock) | 10,666 (**+297, +2.9%**) | **309.60 kB** |

Per-request, per-sideload, the new code in `Scope#future_with_context` (lib/graphiti/scope.rb:137-161) unconditionally allocates:

- a `Hash` snapshot of every `Thread.current.keys` entry
- a `Hash` snapshot of every `Fiber.current.storage` entry (added in 1.8.2 / [#497](https://github.com/graphiti-api/graphiti/pull/497))
- a `Concurrent::Promises.future_on(...)` future + closure
- nested `with_thread_locals` / `with_fiber_locals` tracker arrays
- a `Rails.application.executor.wrap` block

Plus, per request, `Scope#future_resolve` and `future_resolve_sideloads` add a `then_on` / `zip_futures_on` / `rescue_on` chain. None of this work is useful when concurrency is off — `GLOBAL_THREAD_POOL_EXECUTOR` is already configured as `synchronous: true, max_threads: 0`.

The cost scales linearly with sideload count per request — a heavier endpoint with many `?include=` relationships pays the multiplier.
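To make the per-sideload snapshot cost concrete, here is a minimal plain-Ruby sketch. The helpers `snapshot_thread_locals` and `snapshot_fiber_storage`, the thread-local keys, and the sideload names are all hypothetical stand-ins for illustration; this is not Graphiti's actual code, only the general shape of copying thread and fiber state once per sideload.

```ruby
# Hypothetical stand-ins for the snapshot step; not Graphiti's implementation.
def snapshot_thread_locals
  # One new Hash per call, with one entry per key set on the current thread.
  Thread.current.keys.each_with_object({}) do |key, copy|
    copy[key] = Thread.current[key]
  end
end

def snapshot_fiber_storage
  # Fiber#storage exists on Ruby 3.2+ and may be nil when nothing was stored,
  # so normalize to a Hash with to_h.
  Fiber.current.respond_to?(:storage) ? Fiber.current.storage.to_h : {}
end

# Assumed request state, for illustration only.
Thread.current[:request_id] = "req-1"
Thread.current[:locale] = :en

# Assumed sideload names; each one gets its own pair of snapshot Hashes.
sideloads = %i[author comments tags]
snapshots = sideloads.to_h do |name|
  [name, { thread: snapshot_thread_locals, fiber: snapshot_fiber_storage }]
end

puts "#{snapshots.size} sideloads, #{snapshots.size} separate thread-local snapshots"
```

Every sideload produces fresh Hash objects even though the underlying state is identical, which is why the overhead grows with the number of `?include=` relationships.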

In NewRelic data for a Rails 7 service at moderate traffic, this shows up as a visibly higher GC frequency on 1.10.2 (staging) than on 1.7.9 (production).

## Proposed solution

Branch on `Graphiti.config.concurrency` at the public entry points (`Scope#resolve`, `Scope#resolve_sideloads`, `Sideload#load`, `PolymorphicBelongsTo#resolve`) and dispatch to a synchronous path when concurrency is disabled. The synchronous path mirrors 1.7.9 semantics — no Future allocation, no thread/fiber storage snapshots — and reuses 1.10.2 helpers (`broadcast_data`, `assign_serializer`, `before_resolve`, `after_resolve` callbacks).

The `future_*` methods remain unchanged; users with `concurrency = true` get the existing 1.8+ behavior unmodified.

Sketch of the change in `Scope`:

```ruby
def resolve(&block)
  return sync_resolve(&block) unless Graphiti.config.concurrency
  future_resolve.value!
end

def resolve_sideloads(results)
  return sync_resolve_sideloads(results) unless Graphiti.config.concurrency
  future_resolve_sideloads(results).value!
end

private

def sync_resolve
  return [] if @query.zero_results?

  resolved = broadcast_data { |payload|
    @object = @resource.before_resolve(@object, @query)
    payload[:results] = @resource.resolve(@object)
    payload[:results]
  }
  resolved.compact!
  assign_serializer(resolved)
  yield resolved if block_given?
  @opts[:after_resolve]&.call(resolved)
  sync_resolve_sideloads(resolved) unless @query.sideloads.empty?
  resolved
end

def sync_resolve_sideloads(results)
  return if results == []
  @query.sideloads.each_pair do |name, q|
    sideload = @resource.class.sideload(name)
    next if sideload.nil? || sideload.shared_remote?

    Graphiti.config.before_sideload&.call(Graphiti.context)
    sideload.resolve(results, q, @resource)
  end
end
```

`Sideload#load` short-circuits to `build_resource_proxy(...).to_a`, and `Sideload#resolve` mirrors `future_resolve` against the sync `Scope#resolve`. `PolymorphicBelongsTo#resolve` does the same recursive group-by without `Concurrent::Promises.zip`.
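The polymorphic case can be pictured as a sequential group-by. The sketch below is a toy illustration only: `Parent`, `RESOLVERS`, and `resolve_polymorphic_sync` are hypothetical names, and the lambdas stand in for whatever per-type sideload resolution the fork actually performs.

```ruby
# Hypothetical data model and resolvers; none of these names come from Graphiti.
Parent = Struct.new(:id, :owner_type, :owner_id, :owner)

RESOLVERS = {
  "User" => ->(ids) { ids.to_h { |id| [id, "user-#{id}"] } },
  "Team" => ->(ids) { ids.to_h { |id| [id, "team-#{id}"] } }
}.freeze

# Resolve each type group one after another on the calling thread,
# with no futures and no zip step to join them.
def resolve_polymorphic_sync(parents)
  parents.group_by(&:owner_type).each do |type, group|
    resolver = RESOLVERS[type]
    next if resolver.nil? # unknown type: leave owner unset

    found = resolver.call(group.map(&:owner_id).uniq)
    group.each { |parent| parent.owner = found[parent.owner_id] }
  end
  parents
end

parents = [
  Parent.new(1, "User", 10),
  Parent.new(2, "Team", 20),
  Parent.new(3, "User", 11)
]
resolve_polymorphic_sync(parents)
puts parents.map(&:owner).inspect
```

Because each group is resolved inline, the results are already assigned when the loop finishes, so there is nothing to wait on afterwards.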

## Verification

Local fork against the same Rails app:

| Configuration | Total allocations / request | concurrent-ruby memory / 20 reqs |
|---|---:|---:|
| graphiti 1.7.9 | 10,369 | 6.40 kB |
| graphiti 1.10.2 stock | 10,666 (+297) | 309.60 kB |
| graphiti 1.10.2 + sync-dispatch fork | **10,365 (-4)** | **6.40 kB** |

With concurrency off, the allocation profile is effectively indistinguishable from 1.7.9 (10,365 vs. 10,369 allocations per request), and `concurrent-ruby` memory returns to the 6.40 kB baseline (the executor-delay constant, never resolved).
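The analysis does not state how these figures were collected. For anyone who wants to reproduce the total-allocations column, one possible approach using only core Ruby is sketched below. `allocations_per_call` is a hypothetical helper, and `fake_request` is a stand-in for issuing one real request against the app.

```ruby
# Hypothetical helper: average number of objects allocated per block call.
def allocations_per_call(runs: 20, warmup: 5)
  warmup.times { yield } # let caches and lazily built constants settle first
  before = GC.stat(:total_allocated_objects)
  runs.times { yield }
  (GC.stat(:total_allocated_objects) - before) / runs.to_f
end

# Stand-in for "issue one request". In a Rails app this would be replaced
# by whatever drives a real request through the stack.
fake_request = -> { Array.new(50) { Object.new } }

average = allocations_per_call { fake_request.call }
puts "average allocations per call: #{average.round}"
```

`GC.stat(:total_allocated_objects)` is a process-wide cumulative counter, so the measurement is only meaningful when nothing else is allocating in the same process during the run. It also does not break memory down per gem, so the `concurrent-ruby` column would need a different tool.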

Test results on the fork:

- **graphiti's own suite**: 1372 examples, 0 failures (existing concurrency-off scope_specs needed updates from `expect(sideload).to receive(:future_resolve)` → `expect(sideload).to receive(:resolve)` to reflect the new dispatch).
- **Consuming Rails 7 app**: 1163 examples, 0 failures, 100% line coverage.

## Notes / open questions

1. The `concurrency = true` path is untouched — this is purely additive for the off path. Existing users with concurrency enabled see no behavior change.
2. The 1.8.2 motivation for snapshotting Thread/Fiber storage ([#497](https://github.com/graphiti-api/graphiti/pull/497) "prevent context loss") still holds for the concurrent path. With concurrency off, no thread hop occurs, so there's nothing to preserve.
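The second point can be checked in plain Ruby. In this small sketch the `:request_context` key is a hypothetical example. It shows that thread-local state is lost only when work moves to another thread, which is the case the snapshot exists to cover.

```ruby
# Hypothetical thread-local set by the request thread.
Thread.current[:request_context] = { user: "alice" }

# Synchronous path: same thread, so the value is simply still there.
same_thread = Thread.current[:request_context]

# Thread hop without a snapshot: the new thread starts with no locals.
other_thread = Thread.new { Thread.current[:request_context] }.value

# Thread hop with a snapshot: copy first, then restore inside the new thread.
snapshot = { request_context: Thread.current[:request_context] }
restored = Thread.new do
  snapshot.each { |key, value| Thread.current[key] = value }
  Thread.current[:request_context]
end.value

puts same_thread.inspect  # => {user: "alice"} (printed as {:user=>"alice"} on older Rubies)
puts other_thread.inspect # => nil
puts restored == same_thread # => true
```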

## Environment

- Ruby 4.0.3
- Rails 7.2.3
- graphiti 1.10.2
- concurrent-ruby 1.3.6
- jsonapi-renderer 0.2.2
- dry-types 1.9.1
- graphiti_errors 1.1.2
