
Conversation


@julienrbrt julienrbrt commented Nov 4, 2025

Overview

Use cache instead of in-memory store for reaper


github-actions bot commented Nov 4, 2025

The latest Buf updates on your PR. Results from workflow CI and Release / buf-check (pull_request).

Build: ✅ passed
Format: ⏩ skipped
Lint: ✅ passed
Breaking: ✅ passed
Updated (UTC): Nov 4, 2025, 3:15 PM


claude bot commented Nov 4, 2025

Claude finished @julienrbrt's task


Code Review Complete

I've completed a comprehensive review of this PR. Overall, this is a solid architectural improvement that centralizes transaction hash tracking and enables persistence. The code quality is good, with excellent test coverage for the new functionality. However, I've identified several areas for improvement and one critical issue that needs attention.


Summary

What Changed:

  • Migrated transaction "seen" hash tracking from an isolated in-memory datastore in the Reaper to the centralized cache.Manager
  • Added IsTxSeen, SetTxSeen, and CleanupOldTxs methods to cache.Manager (sketched just after this list)
  • Implemented disk persistence for transaction cache with hourly cleanup mechanism
  • Renamed test utility: NewDefaultInMemoryKVStore → NewTestInMemoryKVStore
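
For orientation, a minimal sketch of what the new Manager surface might look like; the exact signatures (in particular the CleanupOldTxs parameter) are assumptions inferred from the snippets reviewed below, not copied from the diff:

// Sketch only: transaction-tracking additions to cache.Manager (signatures assumed).
type Manager interface {
    // ... existing header/data cache methods ...

    // IsTxSeen reports whether the transaction hash has already been observed.
    IsTxSeen(hash string) bool
    // SetTxSeen marks the transaction hash as observed and records when it was seen.
    SetTxSeen(hash string)
    // CleanupOldTxs drops entries older than retention; a zero value falls back to
    // DefaultTxCacheRetention (24h).
    CleanupOldTxs(retention time.Duration)
}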

Impact:

  • ✅ Better architecture: Centralized caching layer
  • ✅ Persistence: Transaction seen status survives restarts
  • ⚠️ I/O overhead: New disk writes on shutdown (see discussion below)

Critical Issues

1. ⚠️ Compilation Error in Test File

Location: block/internal/reaping/reaper_test.go:9-10

Gemini's review flagged a duplicate import: the file imports github.com/ipfs/go-datastore both unaliased and under the ds alias, which would cause a compilation error. The aliased imports look like this:

import (
    ds "github.com/ipfs/go-datastore"
    dssync "github.com/ipfs/go-datastore/sync"
    // ... other imports
)

The usages on lines 228 and 253 rely on those aliases:

dataStore := dssync.MutexWrap(ds.NewMapDatastore())  // line 228
dataStore2 := dssync.MutexWrap(ds.NewMapDatastore()) // line 253

The aliased usage itself is correct; the problem is the redundant unaliased import. Please remove the duplicate and verify that the test file compiles.


Architecture & Design

✅ Strengths

  1. Clean Separation of Concerns: Moving tx hash tracking to cache.Manager is the right architectural decision. It consolidates all caching logic in one place.

  2. Generic Cache Reuse: Smart use of Cache[struct{}] for transaction hashes. This avoids code duplication and leverages existing infrastructure.

  3. Explicit Design Decisions: Excellent documentation explaining why txCache is NOT cleared in DeleteHeight() (lines 244-250 in manager.go). This kind of reasoning documentation prevents future bugs.

  4. Time-Based Cleanup: Using time-based expiration (24h default) instead of height-based cleanup makes sense for transaction deduplication.

🔍 Areas for Improvement

1. Architectural Question: Disk Persistence Trade-off

Location: block/internal/cache/manager.go:323-325, SaveToDisk() method

@tac0turtle raised a valid concern about disk I/O. Let me add context:

Current Behavior:

  • Transaction hashes are persisted to disk on shutdown (via SaveToDisk())
  • On restart, they're loaded and timestamps reset to time.Now() (line 362)
  • This prevents re-submission of transactions seen before restart

Trade-offs:

Approach: Persist to disk (current)
  Pros: Prevents duplicate tx submissions across restarts; consistent with other cache behavior
  Cons: Extra I/O on shutdown; minimal benefit if restarts are rare

Approach: In-memory only
  Pros: No I/O overhead; simpler
  Cons: Transactions seen before restart may be resubmitted (though the sequencer should handle this)

Recommendation: I lean toward @tac0turtle's suggestion of in-memory only, for the following reasons:

  1. The sequencer should already handle duplicate transactions (idempotency)
  2. The 24-hour window is long enough that restart-related duplicates are unlikely to cause issues
  3. Consistency argument: If this is truly ephemeral data, why persist it?

However, if you keep persistence:

  • The current implementation is correct
  • The I/O happens only on graceful shutdown (not hot path)
  • Consistency with other caches (header, data) is maintained

Suggested Action: Decide explicitly and document the rationale. If going in-memory only:

  • Remove txCache from SaveToDisk() and LoadFromDisk()
  • Update the comment at line 331-333 to explain why

2. Missing Cleanup on Shutdown

Location: block/internal/reaping/reaper.go:143-150, Stop() method

The reaper's Stop() method doesn't save the cache to disk. If persistence is desired, you should:

func (r *Reaper) Stop() error {
    if r.cancel != nil {
        r.cancel()
    }
    r.wg.Wait()

    // Save cache to disk on graceful shutdown
    if err := r.cache.SaveToDisk(); err != nil {
        r.logger.Warn().Err(err).Msg("failed to save cache on shutdown")
    }

    r.logger.Info().Msg("reaper stopped")
    return nil
}

Note: This is only relevant if keeping disk persistence.


3. Timestamp Inconsistency on Load

Location: block/internal/cache/manager.go:360-368, LoadFromDisk() method

When loading from disk, ALL transaction timestamps are reset to time.Now():

now := time.Now()
m.txCache.hashes.Range(func(key, value any) bool {
    if hash, ok := key.(string); ok {
        m.txTimestamps.Store(hash, now)
    }
    return true
})

Issue: This defeats the purpose of time-based cleanup. A transaction that was 23 hours old at shutdown will be treated as "new" on restart, extending its lifetime to 47 hours total.

Options:

  1. Persist timestamps (complex, requires gob registration of time.Time, more I/O)
  2. Use conservative timestamp on load (e.g., time.Now().Add(-23 * time.Hour) to ensure old txs are cleaned up soon)
  3. Accept the behavior as reasonable (document it)
  4. Go in-memory only (removes the problem entirely)

Recommendation: If keeping persistence, document this behavior explicitly. If it's a problem, persist timestamps or go in-memory only.
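
To illustrate option 2, a minimal sketch of a conservative reset inside LoadFromDisk(), reusing the field names from the snippet above; the one-hour grace window is an assumption:

// Sketch: age loaded entries so they expire within roughly one cleanup cycle
// rather than receiving a fresh 24-hour lifetime. The one-hour margin is assumed.
conservative := time.Now().Add(-(DefaultTxCacheRetention - time.Hour))
m.txCache.hashes.Range(func(key, value any) bool {
    if hash, ok := key.(string); ok {
        m.txTimestamps.Store(hash, conservative)
    }
    return true
})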


Code Quality & Go Idioms

✅ Good Practices

  1. Error Wrapping: Consistent use of fmt.Errorf with %w for error context
  2. Concurrency Safety: Proper use of sync.Map for concurrent access
  3. Zero Value Handling: CleanupOldTxs defaults to DefaultTxCacheRetention when passed 0 (lines 201-203)
  4. Logging: Appropriate log levels and structured logging with context

🔍 Minor Issues

1. Type Assertion Without Second Return Value

Location: block/internal/cache/generic_cache.go:72

func (c *Cache[T]) isSeen(hash string) bool {
    seen, ok := c.hashes.Load(hash)
    if !ok {
        return false
    }
    return seen.(bool)  // ⚠️ Panic risk if not bool
}

This type assertion could panic if the value isn't a bool. Use the two-value form:

return seen.(bool)  // current: panics if the stored value is ever not a bool

// better: two-value assertion with a defensive fallback
seenBool, ok := seen.(bool)
if !ok {
    return false
}
return seenBool

Though in practice this is safe (only bools are stored), defensive programming is better.


2. Direct Access to Internal Cache Field

Location: block/internal/cache/manager.go:221

m.txCache.hashes.Delete(hash)

This directly accesses txCache.hashes, breaking encapsulation. Consider adding a method to Cache[T]:

func (c *Cache[T]) deleteSeen(hash string) {
    c.hashes.Delete(hash)
}

Then use: m.txCache.deleteSeen(hash)


3. Inconsistent Use of Constants

Location: block/internal/reaping/reaper.go:99

cleanupTicker := time.NewTicker(1 * time.Hour)

This magic number should be a constant, similar to DefaultInterval:

const (
    DefaultInterval = 1 * time.Second
    MaxBackoffInterval = 30 * time.Second
    BackoffMultiplier = 2
    TxCleanupInterval = 1 * time.Hour  // Add this
)

Error Handling & Resilience

✅ Strengths

  1. Graceful Error Handling: SaveToDisk() and LoadFromDisk() errors are logged as warnings, not failures (lines 135, 140)
  2. Retry Logic: Reaper has exponential backoff on errors (lines 111-122)
  3. Transaction Rollback: If sequencer submission fails, transactions are NOT marked as seen (lines 186-187)

🔍 Potential Issues

No Cleanup on Load Failure

Location: block/internal/cache/manager.go:338-358, LoadFromDisk()

If loading one cache fails (e.g., corrupted data), the error is returned immediately, but partially loaded data remains in memory. Consider:

  1. Clearing all caches on load failure
  2. Continuing to load other caches even if one fails
  3. At minimum, document the behavior
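
For option 2, a hedged sketch of a best-effort load; the per-cache loader names are hypothetical placeholders, not the current methods:

// Sketch: attempt every cache and aggregate failures instead of aborting on the first error.
// loadHeaderCache, loadDataCache, and loadTxCache are hypothetical helpers.
func (m *implementation) LoadFromDisk() error {
    var errs []error
    if err := m.loadHeaderCache(); err != nil {
        errs = append(errs, fmt.Errorf("load header cache: %w", err))
    }
    if err := m.loadDataCache(); err != nil {
        errs = append(errs, fmt.Errorf("load data cache: %w", err))
    }
    if err := m.loadTxCache(); err != nil {
        errs = append(errs, fmt.Errorf("load tx cache: %w", err))
    }
    return errors.Join(errs...) // nil if everything loaded
}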

Testing

✅ Excellent Coverage

The test suite is comprehensive and well-structured:

  1. TestManager_TxOperations - Basic functionality
  2. TestManager_CleanupOldTxs - Time-based cleanup
  3. TestManager_CleanupOldTxs_SelectiveRemoval - Partial cleanup
  4. TestManager_TxCache_PersistAndLoad - Disk persistence
  5. TestManager_DeleteHeight_PreservesTxCache - Height-based deletion behavior
  6. TestReaper_CachePersistence - End-to-end persistence

The tests use proper mocking, table-driven patterns where appropriate, and cover edge cases.

🔍 Testing Gaps

  1. No test for concurrent access to IsTxSeen/SetTxSeen (though sync.Map handles this)
  2. No test for cleanup during high transaction volume
  3. No test for LoadFromDisk() with corrupted data
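
A stress test along the following lines would cover the first gap; newTestManager is a hypothetical stand-in for however the package builds a Manager in its tests:

// Hypothetical concurrency stress test for IsTxSeen/SetTxSeen.
func TestManager_TxSeen_Concurrent(t *testing.T) {
    m := newTestManager(t) // hypothetical test constructor

    var wg sync.WaitGroup
    for worker := 0; worker < 8; worker++ {
        wg.Add(1)
        go func(worker int) {
            defer wg.Done()
            for i := 0; i < 1000; i++ {
                hash := fmt.Sprintf("tx-%d-%d", worker, i)
                m.SetTxSeen(hash)
                if !m.IsTxSeen(hash) {
                    t.Errorf("hash %s not reported as seen", hash)
                }
            }
        }(worker)
    }
    wg.Wait()
}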

Performance Considerations

✅ Good Decisions

  1. sync.Map Usage: Appropriate for read-heavy workloads (checking if tx is seen)
  2. Lazy Cleanup: Hourly cleanup ticker instead of per-transaction checks
  3. Single Cleanup Loop: Efficient iteration over timestamps with early deletion

🔍 Potential Concerns

1. Double Iteration in Cleanup

Location: block/internal/cache/manager.go:208-226, CleanupOldTxs()

The cleanup iterates txTimestamps, then deletes from both txCache.hashes and txTimestamps. This is two sync.Map operations per transaction. Not a big issue, but could be optimized by batch deletion if performance becomes critical.
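
If it ever does become critical, one hedged option (same field names as above; retention is the assumed cleanup parameter) is to collect expired hashes during the Range and delete them afterwards:

// Sketch: gather expired hashes first, then remove them from both maps.
cutoff := time.Now().Add(-retention)
var expired []string
m.txTimestamps.Range(func(key, value any) bool {
    hash, okHash := key.(string)
    seenAt, okTime := value.(time.Time)
    if okHash && okTime && seenAt.Before(cutoff) {
        expired = append(expired, hash)
    }
    return true
})
for _, hash := range expired {
    m.txCache.hashes.Delete(hash)
    m.txTimestamps.Delete(hash)
}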

2. Memory Growth Between Cleanups

With hourly cleanup and high transaction volume, the cache could grow significantly:

  • At 1000 tx/sec: 3.6M entries in 1 hour
  • Each entry: ~50-100 bytes (hash string + timestamp + map overhead)
  • Memory: ~180-360 MB/hour

The 1-hour cleanup interval seems reasonable, but monitor in production.

3. Disk I/O on Shutdown

If disk persistence is kept, shutdown time increases with cache size:

  • Gob encoding 3.6M entries could take several seconds
  • Not a problem for graceful shutdown, but could delay restarts

Security

✅ No Major Concerns

  1. Transaction hashes are SHA-256, so no injection risks
  2. No sensitive data in logs
  3. Disk permissions are properly set (0o755 for directories)

🔍 Minor Consideration

Disk Space Exhaustion: If cleanup fails or is disabled, the cache could grow unbounded on disk. Consider:

  1. Maximum cache size limit
  2. Monitoring/alerting on cache directory size
  3. Automatic cleanup on disk usage threshold

Documentation

✅ Good Documentation

  1. Comments explain non-obvious decisions (e.g., why txCache isn't deleted by height)
  2. Exported functions have clear doc comments
  3. Constants are documented (e.g., DefaultTxCacheRetention)

🔍 Improvement Suggestions

  1. Add package-level comment to cache package explaining the architecture
  2. Document cleanup interval and how to tune it
  3. Explain persistence strategy in cache.Manager interface doc

Specific File Reviews

block/internal/cache/manager.go

Lines 190-194: SetTxSeen Implementation

func (m *implementation) SetTxSeen(hash string) {
    m.txCache.setSeen(hash, 0)
    m.txTimestamps.Store(hash, time.Now())
}

✅ Correct use of height=0 for transactions (not yet in a block)
⚠️ Consider: Should this be idempotent? Currently, calling SetTxSeen twice updates the timestamp, extending lifetime. This might be intentional, but document it.
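
If the intent is that repeated calls should not extend a transaction's lifetime, a minimal sketch (assuming the fields shown above) that keeps the first-seen timestamp:

func (m *implementation) SetTxSeen(hash string) {
    m.txCache.setSeen(hash, 0)
    // LoadOrStore keeps the earliest timestamp; repeated calls no longer refresh it.
    m.txTimestamps.LoadOrStore(hash, time.Now())
}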

Lines 244-250: DeleteHeight Comment

Excellent! This kind of explanatory comment prevents future bugs.

block/internal/reaping/reaper.go

Lines 95-140: reaperLoop Implementation

✅ Clean implementation with dual tickers (reaping + cleanup)
✅ Exponential backoff on errors
✅ Cleanup logging

Lines 168-172: Transaction Filtering

for _, tx := range txs {
    txHash := hashTx(tx)
    if !r.cache.IsTxSeen(txHash) {
        newTxs = append(newTxs, tx)
    }
}

✅ Simple and correct. Consider pre-allocating newTxs if len(txs) is large:

newTxs := make([][]byte, 0, len(txs))

block/internal/cache/generic_cache.go

Lines 75-79: setSeen with Height Tracking

func (c *Cache[T]) setSeen(hash string, height uint64) {
    c.hashes.Store(hash, true)
    c.hashByHeight.Store(height, hash)  // ⚠️ Potential issue
}

⚠️ Issue: hashByHeight maps height -> hash, but multiple hashes can have the same height. This overwrites previous entries. For transactions (height=0), this means only the LAST transaction marked as seen is tracked by height.

Impact: In deleteAllForHeight(), only one hash per height is deleted from hashes. For transactions at height=0, this could leave orphaned entries.

Recommendation: Change hashByHeight to map[height][]hash or accept that height-based deletion doesn't apply to transactions (which is the current documented behavior).
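
If the multi-hash route is chosen, one possible shape is a small mutex-guarded index; this is a sketch only, and the type and method names are assumptions rather than the current Cache[T] API:

// Sketch: a per-height index that can hold several hashes (names hypothetical).
type heightIndex struct {
    mu       sync.Mutex
    byHeight map[uint64][]string
}

func newHeightIndex() *heightIndex {
    return &heightIndex{byHeight: make(map[uint64][]string)}
}

func (h *heightIndex) add(height uint64, hash string) {
    h.mu.Lock()
    defer h.mu.Unlock()
    h.byHeight[height] = append(h.byHeight[height], hash)
}

// takeAll removes and returns every hash recorded at height, so deleteAllForHeight
// can clear each one from c.hashes instead of only the last one stored.
func (h *heightIndex) takeAll(height uint64) []string {
    h.mu.Lock()
    defer h.mu.Unlock()
    hashes := h.byHeight[height]
    delete(h.byHeight, height)
    return hashes
}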


Recommendations

High Priority

  1. Fix compilation error in test file (verify imports work correctly)
  2. 🔧 Decide on persistence strategy: In-memory only vs. disk persistence, and document the rationale
  3. 📝 Add cleanup on shutdown if keeping persistence

Medium Priority

  1. 🔍 Review timestamp reset behavior on load (lines 360-368)
  2. 🛡️ Add defensive type assertions in isSeen() and similar methods
  3. 📦 Encapsulate cache field access (avoid m.txCache.hashes.Delete())
  4. 📊 Add constant for cleanup interval (currently hardcoded as 1 hour)

Low Priority

  1. 📝 Add package-level documentation to cache package
  2. 🧪 Add concurrency stress tests
  3. 🎯 Pre-allocate slice in transaction filtering loop (minor optimization)

Conclusion

This PR represents a solid architectural improvement with good code quality. The main decision point is whether to persist transaction hashes to disk or keep them in-memory only. I recommend making this decision explicit and documenting the rationale.

The test coverage is excellent, and the implementation is mostly sound. The critical issue is the potential compilation error in the test file, which should be verified before merging.

Overall Assessment: Approve with minor revisions

Once the persistence strategy is decided and documented, and the test compilation is verified, this is ready to merge.


@gemini-code-assist

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the transaction reaping mechanism by migrating the storage of "seen" transaction hashes from an isolated in-memory datastore within the Reaper component to a centralized and persistent cache.Manager. This change streamlines the transaction tracking process, enhances data consistency across components, and improves the overall architecture by leveraging a unified caching layer for various data types. The refactoring also includes updates to the cache manager to support transaction-specific operations and ensures proper persistence of this state.

Highlights

  • Centralized Transaction Cache: The Reaper component now utilizes a shared cache.Manager to track seen transaction hashes, replacing its previous internal in-memory datastore.
  • Cache Manager Enhancements: The cache.Manager interface and its implementation have been extended with new methods (IsTxSeen, SetTxSeen) and a dedicated txCache to manage transaction seen status, including persistence to disk.
  • Refactored Reaper Logic: The Reaper's SubmitTxs method has been updated to interact with the new cache.Manager for efficient checking and marking of seen transactions, simplifying its internal logic.
  • Test Utility Renaming and Simplification: The NewDefaultInMemoryKVStore function has been renamed to NewTestInMemoryKVStore and simplified for testing purposes, with all relevant test files and documentation updated to use this new naming convention.


github-actions bot commented Nov 4, 2025

PR Preview Action v1.6.2
Preview removed because the pull request was closed.
2025-11-04 15:37 UTC


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors the reaper component to use a centralized cache.Manager for tracking seen transaction hashes, instead of its own in-memory datastore. This is a solid architectural improvement that centralizes caching logic and enables persistence of seen transactions across node restarts. The changes are well-implemented, and the addition of a persistence test for the cache is a great touch.

I've found a minor issue in a test file related to a duplicate package import, which will cause a compilation error. I've left comments with suggestions on how to fix it. Overall, this is a great refactoring that improves the codebase.

The inline comment below is anchored on the import block of block/internal/reaping/reaper_test.go:

"testing"
"time"

"github.com/ipfs/go-datastore"

critical

This import duplicates the aliased import ds for the same package on the next line. This will cause a compilation error. Please remove this line and use the ds alias consistently throughout the file.


codecov bot commented Nov 4, 2025

Codecov Report

❌ Patch coverage is 71.42857% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.61%. Comparing base (271f74b) to head (a7e42e4).
⚠️ Report is 2 commits behind head on main.

Files with missing lines:
  block/internal/cache/manager.go: 80.00% patch coverage, 4 missing and 4 partials ⚠️
  block/internal/reaping/reaper.go: 27.27% patch coverage, 7 missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2811      +/-   ##
==========================================
+ Coverage   62.37%   62.61%   +0.23%     
==========================================
  Files          82       82              
  Lines        7304     7334      +30     
==========================================
+ Hits         4556     4592      +36     
+ Misses       2203     2197       -6     
  Partials      545      545              
Flag: combined
Coverage Δ: 62.61% <71.42%> (+0.23%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.


@tac0turtle tac0turtle left a comment


I'm more leaning towards making this short-lived; writing to disk here seems like extra IO we don't really need to care about.

@julienrbrt
Member Author

I'm more leaning towards making this short-lived; writing to disk here seems like extra IO we don't really need to care about.

Can do. Writing to disk only happens at stopping, though.

tac0turtle
tac0turtle previously approved these changes Nov 4, 2025
@julienrbrt julienrbrt added this pull request to the merge queue Nov 4, 2025
Merged via the queue into main with commit 3d98502 Nov 4, 2025
30 of 32 checks passed
@julienrbrt julienrbrt deleted the julien/use-cache-seenstore branch November 4, 2025 15:36
@github-project-automation github-project-automation bot moved this to Done in Evolve Nov 4, 2025