79 changes: 78 additions & 1 deletion app/helpers/application_helper.rb
@@ -4,7 +4,7 @@
# OpenSSF Best Practices badge contributors
# SPDX-License-Identifier: MIT

module ApplicationHelper
module ApplicationHelper # rubocop:disable Metrics/ModuleLength
include Pagy::Frontend

# Frozen string constant for unknown project names (memory optimization)
Expand Down Expand Up @@ -83,9 +83,39 @@ def cache_frozen_unless(condition, name = {}, options = {}, &)
cache_frozen_if(!condition, name, options, &)
end

# Cache metrics (enabled via CACHE_PROFILE=1).
# Metrics are written to tmp/cache_metrics.json every 100 recorded cache
# lookups (hits + misses across all keys), not every 100 HTTP requests.
# Read with: script/cache_metrics_report.rb
CACHE_METRICS = {} # rubocop:disable Style/MutableConstant
CACHE_METRICS_MUTEX = Mutex.new
CACHE_METRICS_FILE = Rails.root.join('tmp/cache_metrics.json')
CACHE_METRICS_WRITE_INTERVAL = 100

def self.cache_metrics
CACHE_METRICS_MUTEX.synchronize do
CACHE_METRICS.values.sort_by { |m| -m[:hit_allocs] }
end
end

def self.cache_metrics_reset
CACHE_METRICS_MUTEX.synchronize { CACHE_METRICS.clear }
end

def self.cache_metrics_save
CACHE_METRICS_MUTEX.synchronize do
File.write(CACHE_METRICS_FILE, JSON.pretty_generate(CACHE_METRICS.values))
end
end

private

def cache_frozen_perform(name, options, &)
return cache_frozen_perform_profiled(name, options, &) if ENV['CACHE_PROFILE']

cache_frozen_perform_fast(name, options, &)
end

def cache_frozen_perform_fast(name, options, &)
cache_key = controller.combined_fragment_cache_key(
cache_fragment_name(name, **options.slice(:skip_digest))
)
@@ -96,5 +126,52 @@ def cache_frozen_perform(name, options, &)
end
safe_concat(fragment)
end

# rubocop:disable Metrics/AbcSize, Metrics/MethodLength
def cache_frozen_perform_profiled(name, options, &)
alloc_before = GC.stat(:total_allocated_objects)
cache_key = controller.combined_fragment_cache_key(
cache_fragment_name(name, **options.slice(:skip_digest))
)
fragment = controller.cache_store.read(cache_key, options)

if fragment
# HIT: measure only the overhead (key gen + read + concat)
safe_concat(fragment)
hit_allocs = GC.stat(:total_allocated_objects) - alloc_before
record_cache_metric(name, true, hit_allocs, 0)
else
# MISS: measure overhead separately from rendering
alloc_after_read = GC.stat(:total_allocated_objects)
fragment = output_buffer.capture(&).freeze
controller.cache_store.write(cache_key, fragment, options)
safe_concat(fragment)
total_allocs = GC.stat(:total_allocated_objects) - alloc_before
overhead_allocs = alloc_after_read - alloc_before # key gen + read
record_cache_metric(name, false, overhead_allocs, total_allocs)
end
end
# rubocop:enable Metrics/AbcSize, Metrics/MethodLength

# rubocop:disable Metrics/AbcSize, Metrics/MethodLength
def record_cache_metric(name, hit, overhead_allocs, miss_total_allocs)
key = name.is_a?(Array) ? name.map(&:to_s).join('/') : name.to_s
total = nil
CACHE_METRICS_MUTEX.synchronize do
m = CACHE_METRICS[key] ||= {
key: key, hits: 0, misses: 0, hit_allocs: 0, miss_allocs: 0
}
if hit
m[:hits] += 1
m[:hit_allocs] += overhead_allocs
else
m[:misses] += 1
m[:miss_allocs] += miss_total_allocs
end
total = CACHE_METRICS.values.sum { |v| v[:hits] + v[:misses] }
end
ApplicationHelper.cache_metrics_save if (total % CACHE_METRICS_WRITE_INTERVAL).zero?
end
# rubocop:enable Metrics/AbcSize, Metrics/MethodLength
# rubocop:enable Rails/OutputSafety
end
92 changes: 92 additions & 0 deletions docs/internal-cache-analysis.md
@@ -0,0 +1,92 @@
# Internal cache analysis

This application uses internal caching, in particular
a fragment cache. Caching items that are reused often
can *greatly* reduce the load on the system.

However, every cache check must create and use a cache key
(which itself allocates memory and takes time), and
every cached value must be retrieved often enough to be worth it
(otherwise it consumes memory until evicted, taking up space
better used elsewhere).
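
The trade-off above can be sketched as a rough break-even estimate. This is a simplified model, not something the application implements: the method names are hypothetical, the numbers in the usage note are made up, and it assumes that the overhead paid on a miss is about the same as the cost of a hit (it ignores the extra cache write and anything other than allocations).

```ruby
# Hypothetical model, not part of the application.
# hit_cost:  average allocations on a cache hit (key generation + read + concat)
# miss_cost: average allocations on a cache miss (overhead + render + write)
# Assumption: rendering without any cache costs about (miss_cost - hit_cost).

# Average allocations per lookup when the cache is in place.
def expected_cached_cost(hit_rate, hit_cost, miss_cost)
  (hit_rate * hit_cost) + ((1.0 - hit_rate) * miss_cost)
end

# Minimum hit rate (0.0..1.0) at which the cache starts to save allocations.
# Returns Float::INFINITY when a hit costs as much as rendering would.
def break_even_hit_rate(hit_cost, miss_cost)
  uncached_cost = miss_cost - hit_cost
  return Float::INFINITY unless uncached_cost.positive?

  hit_cost.to_f / uncached_cost
end
```

For example, with a made-up 50 allocations per hit and 300 per miss, `break_even_hit_rate(50, 300)` is 0.2: below a 20% hit rate the cache would cost more allocations than it saves under this model. A cache whose hit rate is 0% is pure overhead regardless of the numbers.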

We have a system for measuring cache hits to see which
cache calls are actually useful. To use it, enable
cache profiling in the application (which stores its data in tmp/)
and run a stress test like this:

~~~~sh
CACHE_PROFILE=1 rails s
script/memory_stress_test.rb --crawler --duration 2h^C
~~~~
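
While the stress test runs, the profiler periodically writes its counters to `tmp/cache_metrics.json` as a JSON array with one entry per cache key (`key`, `hits`, `misses`, `hit_allocs`, `miss_allocs`). The project's own reader is `script/cache_metrics_report.rb`; the following is only a minimal sketch of inspecting that file by hand, and the method names in it are hypothetical.

```ruby
require 'json'

# Hypothetical helpers for a quick look at the metrics file.
# This is not the project's report script.

# Parse the JSON array written by ApplicationHelper.cache_metrics_save.
def load_cache_metrics(path)
  JSON.parse(File.read(path), symbolize_names: true)
end

# Hit rate as a fraction in 0.0..1.0; 0.0 when the key was never looked up.
def hit_rate(entry)
  lookups = entry[:hits] + entry[:misses]
  lookups.zero? ? 0.0 : entry[:hits].to_f / lookups
end

# One line per cache key, least-reused keys first.
def summarize(entries)
  entries.sort_by { |e| hit_rate(e) }.map do |e|
    format('%-40s hits=%d misses=%d hit_rate=%.1f%%',
           e[:key], e[:hits], e[:misses], hit_rate(e) * 100)
  end
end
```

From the application root, something like `puts summarize(load_cache_metrics('tmp/cache_metrics.json'))` would print the per-key lines.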

It's important that the stress test be representative of real traffic.
When we originally set up our cache system, we presumed that web spiders
would crawl the site only occasionally, and that most requests would come
from users editing their pages (which would focus on specific projects in
a specific locale). Today, the site undergoes relentless spidering.
This massive change in our typical input profile means that the
correct caching strategy has to change too.

## Initial results

Here are the initial results as of 2026-01-29.

~~~~
CACHE REMOVAL CANDIDATES BY SOURCE (least effective first)
================================================================================

1. views/layouts/application.html.erb:40
Code: <% cache_frozen request.original_fullpath do -%>
Score: 40.7/100 | Hit rate: 0.0%
Hits: 0 (0.0 allocs/hit = overhead cost)
Misses: 11254 (533.3 allocs/miss = render cost)
Problems: hit rate 0.0%
Unique keys: 11254

2. views/projects/_table.html.erb:23
Code: <% cache_frozen [project, locale], expires_in: 12.hours do %>
Score: 46.6/100 | Hit rate: 52.1%
Hits: 13173 (49.3 allocs/hit = overhead cost)
Misses: 12102 (264.0 allocs/miss = render cost)
Problems: hit rate 52.1%, 49 allocs/hit (overhead)
Unique keys: 19243

3. views/layouts/_header.html.erb:36
Code: <% cache_frozen [I18n.locale, request.original_fullpath] do %>
Score: 49.0/100 | Hit rate: 0.0%
Hits: 0 (0.0 allocs/hit = overhead cost)
Misses: 11263 (951.5 allocs/miss = render cost)
Problems: hit rate 0.0%
Unique keys: 11263

4. views/criteria/show.html.erb:1
Code: <% cache_frozen [locale, @criteria_level, @details, @rationale, @autofill] do -%>
Score: 50.0/100 | Hit rate: 0.0%
Hits: 0 (0.0 allocs/hit = overhead cost)
Misses: 13 (4524.9 allocs/miss = render cost)
Problems: hit rate 0.0%, low samples (13)
Unique keys: 13

5. views/criteria/index.html.erb:1
Code: <% cache_frozen [locale, @details, @rationale, @autofill] do -%>
Score: 76.8/100 | Hit rate: 99.4%
Hits: 269784 (35.0 allocs/hit = overhead cost)
Misses: 1557 (376.9 allocs/miss = render cost)
Unique keys: 1557

6. views/layouts/_footer.html.erb:1
Code: <% cache_frozen locale do -%>
Score: 91.2/100 | Hit rate: 99.7%
Hits: 22491 (29.0 allocs/hit = overhead cost)
Misses: 63 (7686.5 allocs/miss = render cost)
Unique keys: 9

--------------------------------------------------------------------------------
Total source locations: 20, Analyzed: 43339 keys

Score interpretation:
- Low score + high allocs/hit: expensive overhead, consider removing
- Low score + low hit rate: cache not being reused, consider removing
- High allocs/miss: rendering is expensive, cache may still be valuable
~~~~
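
The hit rate and per-lookup averages in a report like this can be derived from the raw per-key counters. The sketch below is an assumption about how such figures might be computed, not the report script's actual code: the helper name is hypothetical, and the "Score" formula is not reproduced because it is not described here.

```ruby
# Hypothetical helper: combine per-key metric entries (as stored in
# tmp/cache_metrics.json, with symbol keys) into one summary.
def aggregate_metrics(entries)
  hits        = entries.sum { |e| e[:hits] }
  misses      = entries.sum { |e| e[:misses] }
  hit_allocs  = entries.sum { |e| e[:hit_allocs] }
  miss_allocs = entries.sum { |e| e[:miss_allocs] }
  lookups     = hits + misses

  {
    unique_keys: entries.size,
    hits: hits,
    misses: misses,
    # Percentage of lookups served from the cache.
    hit_rate_pct: lookups.zero? ? 0.0 : (100.0 * hits / lookups).round(1),
    # Average overhead of a hit (key generation + read + concat).
    allocs_per_hit: hits.zero? ? 0.0 : (hit_allocs.to_f / hits).round(1),
    # Average total cost of a miss (overhead + render + write).
    allocs_per_miss: misses.zero? ? 0.0 : (miss_allocs.to_f / misses).round(1)
  }
end
```

A group whose keys are each seen only once (for example, keys built from `request.original_fullpath`) will show as many unique keys as misses and a 0.0% hit rate, which matches the pattern in entries 1 and 3 above.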