refactor(backfill): Backfill Plugin Major Refactor And Improvements #2006

AlfredoG87 · 2025-12-31T22:32:07Z

Summary

Refactors the BackfillPlugin from a monolithic ~650-line class into a modular architecture of focused, testable components.

Key change: Introduces a dual-scheduler design that separates historical and live-tail backfill processing—ensuring recent gaps are never blocked by long-running historical catch-up.

Architecture Overview

BackfillPlugin now acts as an orchestrator coordinating three stages:

1. Detect — `GapDetector`

Scans storage for missing block ranges
Classifies gaps as HISTORICAL (older) or LIVE_TAIL (recent/near-head)
Simplifies Greedy Backfill by defining an upper bound based on true/false -> peerMax/latestStoredBlock

2. Schedule — `BackfillTaskScheduler`

Routes gaps into dedicated bounded queues by type
Historical and live-tail backfills run independently

3. Execute — `BackfillRunner`

Selects optimal peer via PriorityHealthBasedStrategy
Fetches blocks in configurable batches
Awaits persistence confirmation before fetching more (natural backpressure)

New Components

Class	Responsibility
`BackfillRunner`	Core execution logic
`BackfillTaskScheduler`	Bounded queue task scheduling
`BackfillPersistenceAwaiter`	Persistence-based backpressure
`GapDetector`	Gap detection and classification
`PriorityHealthBasedStrategy`	Health-aware node selection
`NodeSelectionStrategy`	Node selection abstraction

Configuration

Property	Default	Description
`backfill.historicalQueueCapacity`	20	Max pending gaps for historical scheduler
`backfill.liveTailQueueCapacity`	10	Max pending gaps for live-tail scheduler
`backfill.healthPenaltyPerFailure`	1000.0	Health score penalty per node failure
`backfill.maxBackoffMs`	300000	Maximum backoff duration (5 min)

Other Changes

Exposed WebClient HTTP/2 tuning parameters with high-throughput defaults
Extended BlockNodeSource proto with optional NodeId and Name fields (non-breaking)
Updated design and configuration documentation
Reduced annoyingly large logs in the MessagingFacility Impl

Review guide (recommended order)

If you want the fastest mental model, I recommend reviewing in this order:

GapDetector – gap identification + HISTORICAL vs LIVE_TAIL classification
BackfillTaskScheduler – dual bounded queues + worker lifecycle
BackfillRunner – node selection → fetch → dispatch → await persistence loop
BackfillPersistenceAwaiter – how backpressure is enforced via notifications
Config + docs – new properties, defaults, and documentation updates

PR Stats

Category	Lines	% of PR
Tests	~2,459	70%
Main Source	~1,282	36%
Docs	~261	7%

Tests (70%): 102 test methods covering critical backfill infrastructure.
Main Source (36%): Significant architectural refactor - BackfillPlugin split into 6 focused components (BackfillRunner, GapDetector, TaskScheduler, etc.)
Docs (7%): Updated design documentation

Related Issues

Fixes #1977
Fixes #1550
Fixes #1502
Fixes #1778

ata-nas

First pass on this one, will need at least another one. Have not yet looked at tests as well.

I leave general comments for cleanup and questions.

I feel, however, that we are really missing out on not using the ranged sets and instead we are using a list of ranges and manually manipulating them. I do see usages of merging ranges, this is something the ranged sets will do automatically in a more performant fashion. I guess I just do not see why we need to use a list and complicating some of the things we do, instead of just using the ranged sets. Since we are doing such a big rework, now is the time to make decisions. Something worth thinking about.

...de/backfill/src/main/java/org/hiero/block/node/backfill/client/proto/block_node_source.proto

block-node/backfill/src/main/java/org/hiero/block/node/backfill/client/BlockNodeClient.java

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillPersistenceAwaiter.java

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillRunner.java

block-node/backfill/src/main/java/org/hiero/block/node/backfill/GapDetector.java

...k-node/backfill/src/main/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategy.java

AlfredoG87 · 2026-01-06T04:30:30Z

@ata-nas Thank you for reviewing my PR, really appreciate and value all your input and time effort made into reviewing it 🙏

I've addressed all your comments mostly in a positive outcome.

And for your general notes suggestion:

I feel, however, that we are really missing out on not using the ranged sets and instead we are using a list of ranges and manually manipulating them. I do see usages of merging ranges, this is something the ranged sets will do automatically in a more performant fashion. I guess I just do not see why we need to use a list and complicating some of the things we do, instead of just using the ranged sets. Since we re doing such a big rework, now is the time to make decisions. Something worth thinking about.

I've ponder this up and decided that the benefits do not outweigh the effort and added complexity. The use case we need is simpler as opposed to the BlockRangeSet impls and prefer to keep it simple as a native List.

Introduce typed gaps to classify detected block gaps as HISTORICAL or LIVE_TAIL for routing to appropriate schedulers. GapDetector now returns TypedGap instances with proper boundary detection. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Replace single scheduler with two independent schedulers (historical and live-tail) so live blocks never wait for historical backfill. Each scheduler has bounded queue with discard-on-full semantics. Remove unused BackfillScheduler wrapper and BackfillTask status tracking. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Update BackfillPlugin to orchestrate two independent schedulers with dedicated executors. Add high-water mark deduplication for live-tail gaps. Add configuration for queue capacities and health penalty settings. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Merge gRPC client functionality into BackfillFetcher. Use configurable health penalty and backoff settings. Remove redundant BackfillGrpcClient. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

@timeout

Add @timeout annotations to all test classes (30s for integration, 5s for unit tests) to fail fast if tests hang instead of blocking indefinitely. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Update configuration docs with new queue capacity and health settings. Update design docs to reflect dual scheduler architecture. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

…gic into BackfillPersistenceAwaiter class. Improved logs overall for the plugin. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Simplified TypedGap and GapType into nested classes of GapDetector for readability and code reduction Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java

jsync-swirlds

About half way through, mostly logging issues so far.

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java

block-node/backfill/src/main/java/module-info.java

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillPersistenceAwaiter.java

...de/backfill/src/main/java/org/hiero/block/node/backfill/client/proto/block_node_source.proto

ata-nas

My second review of the PR.
This is only reviewing tests! (Pt.1, still have a couple of test classes to go through).
This took a very long time to review thoroughly.

As a general comment, I think we will really benefit of not using any mocks, but develop simple test fixtures. Moreover, for things such as metrics, we already have an approach to create them for tests w/o any mocks.

As mentioned in my previous review, this major rework is the best time to make the best decisions.

ata-nas · 2026-01-13T09:22:39Z

block-node/backfill/src/test/java/org/hiero/block/node/backfill/BackfillFetcherTest.java

+    private static BlockUnparsed createTestBlock(long blockNumber) {
+        Block block = new Block(Arrays.asList(SimpleTestBlockItemBuilder.createSimpleBlockWithNumber(blockNumber)));
+        return BlockUtils.toBlockUnparsed(block);
+    }


The SimpleTestBlockItemBuilder should have methods to create unparsed blocks directly, if I am not mistaken.

Good Catch!! 💯

✅ Done!

ata-nas · 2026-01-13T09:24:31Z

block-node/backfill/src/test/java/org/hiero/block/node/backfill/BackfillFetcherTest.java

+        return new BackfillPlugin.MetricsHolder(
+                mock(Counter.class), // backfillGapsDetected
+                mock(Counter.class), // backfillFetchedBlocks
+                mock(Counter.class), // backfillBlocksBackfilled
+                mock(Counter.class), // backfillFetchErrors
+                mock(Counter.class), // backfillRetries
+                mock(LongGauge.class), // backfillStatus
+                mock(LongGauge.class), // backfillPendingBlocksGauge
+                mock(LongGauge.class)); // backfillInFlightGauge


We should not use mocks here at all. We already have the test helper which creates metrics which we can actually call to assert on. Please refer to: TestUtils#createMetrics and it's sample usages.

I like using Mocks for metrics for their simplicity and versatility for assertions, but if we already have a testing framework I agree we should keep it standard.

I have made the refactor.

✅ Done!

ata-nas · 2026-01-13T09:25:30Z

block-node/backfill/src/test/java/org/hiero/block/node/backfill/BackfillFetcherTest.java

+            final BackfillFetcher client = newClient(lowPriority, highPriority);
+
+            // Selects earliest start even if higher priority
+            var selection = client.selectNextChunk(


Non-blocking nit: it would be much better, in my opinion, to never use var, considered everywhere. Optional, however.

Yes, I have made the changes too.

✅ Done!

ata-nas · 2026-01-13T09:27:00Z

block-node/backfill/src/test/java/org/hiero/block/node/backfill/BackfillFetcherTest.java

+
+        @Test
+        @Timeout(value = 10, unit = TimeUnit.SECONDS)
+        @DisplayName("returns blocks on success, retries on failure, returns empty on mismatch")


Nit: judging by the display name, these are 3 separate tests.

Separated tests, there is a trade-off here since I wanted to use a single test for several different assertions to make the review easier due to the amount of lines of code was already quite a lot, so tried to compact and save some lines on testing, but lets split them so they are isolated and better practice overall 👍

✅ Done!

Yeah, the reason here is that if something fails then the rest will not run and we will not know if that logic is still incorrect. If we split the tests, then we give the chance for everything to run.

agreed, hence why I did the proposed changes in all the occurrences.

ata-nas · 2026-01-13T09:27:15Z

block-node/backfill/src/test/java/org/hiero/block/node/backfill/BackfillFetcherTest.java

+    class SelectNextChunkTests {
+
+        @Test
+        @DisplayName("selects earliest start, breaks ties by priority, returns empty when no match")


Nit: judging by the display name, these are 3 separate tests.

Same as above, splited into multiple tests

✅ Done!

ata-nas · 2026-01-13T12:56:26Z

...de/backfill/src/test/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategyTest.java

+        @Test
+        @DisplayName("should adjust start block to available range start")
+        void shouldAdjustStartToAvailableRange() {
+            BackfillSourceConfig node = createNode("localhost", 8080, 1);
+            // Available from block 50, but we need from 0
+            Map<BackfillSourceConfig, List<LongRange>> availability = Map.of(node, List.of(new LongRange(50, 100)));
+
+            Optional<NodeSelection> result = strategy.select(0, 100, availability);
+
+            assertTrue(result.isPresent());
+            assertEquals(50, result.get().startBlock());


Q: In cases like this, is the behavior that we desire this? Should we return a partial range that is available or should we return empty, as in "what you requested is not available to be provided"?

Empty is the correct behavior. The caller asked "give me a node that can serve blocks starting at X". If no node can serve starting at X, returning empty accurately answers "what you requested is not available". Returning a partial range starting at Y would be misleading - the caller would need to handle the gap between X and Y separately anyway. The strategy's job is to match the request exactly or report no match.

Maybe we can revisit this as a further optimization in the future, since I do see your point of 2 is better than 0.

ata-nas · 2026-01-13T12:58:07Z

...de/backfill/src/test/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategyTest.java

+    class PrioritySelectionTests {
+
+        @Test
+        @DisplayName("should select node with lowest priority number (highest priority)")


Personally, this display name is a bit confusing to me. We could just say "select node with highest priority".

Renamed display name from "lowestPriorityNumber" to "should select node with highest priority" (clearer intent)
✅ Done!

ata-nas · 2026-01-13T13:00:14Z

...de/backfill/src/test/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategyTest.java

+        @Test
+        @DisplayName("should return empty when best priority node is in backoff (no fallback to lower priority)")
+        void shouldReturnEmptyWhenBestPriorityInBackoff() {
+            // Note: The algorithm filters by best priority FIRST, then removes backoff nodes.
+            // It does NOT fall back to lower priority nodes if best priority is in backoff.
+            BackfillSourceConfig highPriority = createNode("high", 8080, 1);
+            BackfillSourceConfig lowPriority = createNode("low", 8081, 10);
+            healthProvider.setInBackoff(highPriority, true);
+            Map<BackfillSourceConfig, List<LongRange>> availability = new HashMap<>();
+            availability.put(highPriority, List.of(new LongRange(0, 100)));
+            availability.put(lowPriority, List.of(new LongRange(0, 100)));
+
+            Optional<NodeSelection> result = strategy.select(0, 100, availability);
+
+            // Returns empty because only priority-1 node exists, and it's in backoff
+            assertTrue(result.isEmpty());
+        }


Q: Interesting! What is the user story behind this? Not returning a node config if it exists, but is not the highest priority available, because that one is in back-off?

Returning empty triggers the retry mechanism in backfillGap of BackfillRunner class. The while loop calls fetchNextChunk which, on empty selection, replans availability by re-querying all nodes. If the backoff expired, the node becomes available and we retry. This keeps retrying until the preferred node recovers or we truly exhaust all options - rather than immediately falling back to lower priority nodes. This prevents from shifting the load to a lower priority node for a hiccup or transient error that quickly recovers on a high priority node, it also allows for returning to a high priority node faster since will Retry every max backoff periods if higher priority node is available.

Priority reflects prefference, a BN with higher priority could be cheaper or faster, so we want to make sure we always exhaust all the lower priority nodes, and that we come back to the highest priority nodes as soon as possible once they become available again.

ata-nas · 2026-01-13T13:02:00Z

...de/backfill/src/test/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategyTest.java

+        @DisplayName("should select healthier node when priorities are equal")
+        void shouldSelectHealthierNodeWhenSamePriority() {
+            BackfillSourceConfig unhealthy = createNode("unhealthy", 8080, 1);
+            BackfillSourceConfig healthy = createNode("healthy", 8081, 1);
+            healthProvider.setHealthScore(unhealthy, 10.0);
+            healthProvider.setHealthScore(healthy, 1.0);
+            Map<BackfillSourceConfig, List<LongRange>> availability = new HashMap<>();
+            availability.put(unhealthy, List.of(new LongRange(0, 100)));
+            availability.put(healthy, List.of(new LongRange(0, 100)));
+
+            Optional<NodeSelection> result = strategy.select(0, 100, availability);
+
+            assertTrue(result.isPresent());
+            assertEquals(healthy, result.get().nodeConfig());
+        }
+    }


Q: In this test, are we absolutely sure that we get the healthier node because of it's health, or simply because it is in the right place?

Add comments explaining that selection is by health score, not insertion order
And yes I doubled checked that is the case.

✅ Done!

ata-nas · 2026-01-13T13:14:53Z

block-node/backfill/src/test/java/org/hiero/block/node/backfill/GapDetectorTest.java

+    @Test
+    @DisplayName("respects end cap and Long.MAX_VALUE")
+    void respectsEndCap() {
+        // Caps gap at endCap
+        var gaps = detector.findTypedGaps(List.of(new LongRange(0, 4)), 0, 10, 7);
+        assertEquals(1, gaps.size());
+        assertEquals(new LongRange(5, 7), gaps.getFirst().range());
+
+        // No trailing gap for Long.MAX_VALUE
+        gaps = detector.findTypedGaps(List.of(new LongRange(0, 10)), 0, 5, Long.MAX_VALUE);
+        assertTrue(gaps.isEmpty());
+
+        // Skips gaps where start > endCap
+        gaps = detector.findTypedGaps(List.of(new LongRange(0, 5), new LongRange(100, 200)), 0, 10, 3);
+        assertTrue(gaps.isEmpty());
+    }


Here, please note that as of now, the LongRange class does NOT support Long.MAX_VALUE as an end to the range! The reason being is the correct working of the size method thereof.

Good catch on noting this! The code already handles this case - in GapDetector.findGaps() line 58, there's an explicit check endBlock != Long.MAX_VALUE that prevents creating a LongRange with MAX_VALUE as the end. The test at line 96 passes Long.MAX_VALUE as the endCap parameter (not as a LongRange end), and the guard ensures we never actually construct an invalid range.

…formatter with this file. - Applied logging performance convention, changed all logging to use built-in string interpolation within the logger. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

…der, also improved the logging as per jsync suggestion to use the comments on the PBJ generated classes as javadocs Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

…ualityCheck for now as the current PBJ is improve logging on final failed retries to a peer Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

- Use unparsed block builder directly - Replace mock metrics with real TestUtils.createMetrics() - Replace var with explicit types - Split multi-scenario tests into focused single-behavior tests Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

jsync-swirlds · 2026-01-13T23:37:11Z

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java

+                final String failedToFetchBlocksMsg =
+                        "Failed to fetch blocks [%s->%s] from node [%s] (attempt %d/%d): %s-%s"
+                                .formatted(
+                                        blockRange.start(),
+                                        blockRange.end(),
+                                        nodeConfig.address(),
+                                        attempt,
+                                        maxRetries,
+                                        e.getMessage(),
+                                        e.getCause());
+                LOGGER.log(INFO, failedToFetchBlocksMsg, e);


Because you're trying to not log stack trace, you may want to prefer the following:

Suggested change

final String failedToFetchBlocksMsg =

"Failed to fetch blocks [%s->%s] from node [%s] (attempt %d/%d): %s-%s"

.formatted(

blockRange.start(),

blockRange.end(),

nodeConfig.address(),

attempt,

maxRetries,

e.getMessage(),

e.getCause());

LOGGER.log(INFO, failedToFetchBlocksMsg, e);

final String format = "Failed to fetch blocks [{0}->{1}] from node [{2}] (attempt {3}/{4}) due to {5}{6}.";

final String cause = e.getCause() != null ? ", " + e.getCause() : "";

LOGGER.log(INFO, message, blockRange.start(), blockRange.end(), nodeConfig.address(), attempt, maxRetries, e, cause);

This is a bit clearer, avoids formatting if not necessary, and avoids logging any stack traces.

For even more succinct logging, you might consider making this log statement DEBUG level.

Applied recommended changes! Thank you!

✅ Done!

jsync-swirlds · 2026-01-13T23:39:10Z

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java

+                if (attempt == maxRetries) {
+                    markFailure(nodeConfig);
+                    // Only log exception stack trace on final failure to prevent log spam
+                    LOGGER.log(TRACE, "Final failure stack trace:", e);


This should probably be INFO level, as this represents actual failure to fetch data...

Suggested change

LOGGER.log(TRACE, "Final failure stack trace:", e);

LOGGER.log(TRACE, "Final failure stack trace.\n", e);

Also replaced the : with \n as it's often easier to read the logs when stack trace starts on the next line.

jsync-swirlds · 2026-01-13T23:43:27Z

block-node/backfill/src/main/proto/block_node_source.proto

+   * Valid range: 100ms to 300,000ms
+   */
+  uint32 ping_timeout = 11;
+}


Missing newline at EOF.

Good catch! Thanks!

✅ Done!

jsync-swirlds · 2026-01-13T23:44:27Z

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java

+                    node.address(),
+                    node.port(),
+                    node.priority(),
+                    tuning != null ? "custom" : "defaults");


Very minor: you could also include the tuning details as follows:

Suggested change

tuning != null ? "custom" : "defaults");

tuning == null ? "defaults" : tuning);

The logger message formatting will call tuning.toString() if needed; and PBJ-generated classes all include a pretty decent toString() method.

I like the PBJ generated toString, we lose some flexibility but we gain simplicity and readability of the codebase, seems like an acceptable trade-off. we get the needed information which is what matters in the end.

✅ Done!

…tenceAwaiter Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

…t names, use assertSame for reference checks Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

- Simplify node logging using PBJ toString() - Use log format replacement instead of .formatted() - Add missing newline at EOF (block_node_source.proto) Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

codecov · 2026-01-14T20:56:50Z

Codecov Report

❌ Patch coverage is 87.56906% with 90 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
.../org/hiero/block/node/backfill/BackfillPlugin.java	75.53%	33 Missing and 13 partials ⚠️
...ro/block/node/backfill/client/BlockNodeClient.java	77.96%	10 Missing and 3 partials ⚠️
...org/hiero/block/node/backfill/BackfillFetcher.java	95.10%	3 Missing and 4 partials ⚠️
...ock/node/backfill/PriorityHealthBasedStrategy.java	88.13%	4 Missing and 3 partials ⚠️
...lock/node/backfill/BackfillPersistenceAwaiter.java	89.47%	5 Missing and 1 partial ⚠️
...ill/client/BlockStreamSubscribeUnparsedClient.java	42.85%	2 Missing and 2 partials ⚠️
.../org/hiero/block/node/backfill/BackfillRunner.java	98.30%	1 Missing and 1 partial ⚠️
...ero/block/node/backfill/BackfillTaskScheduler.java	95.12%	1 Missing and 1 partial ⚠️
...ava/org/hiero/block/node/backfill/GapDetector.java	93.54%	1 Missing and 1 partial ⚠️
...ock/node/messaging/BlockMessagingFacilityImpl.java	83.33%	1 Missing ⚠️

@@             Coverage Diff              @@
##               main    #2006      +/-   ##
============================================
+ Coverage     78.99%   79.82%   +0.82%     
- Complexity     1241     1334      +93     
============================================
  Files           130      136       +6     
  Lines          5952     6250     +298     
  Branches        646      688      +42     
============================================
+ Hits           4702     4989     +287     
- Misses          955      962       +7     
- Partials        295      299       +4

Files with missing lines	Coverage Δ	Complexity Δ
...ero/block/node/backfill/BackfillConfiguration.java	`100.00% <ø> (ø)`	`1.00 <0.00> (ø)`
...ero/block/node/backfill/NodeSelectionStrategy.java	`100.00% <100.00%> (ø)`	`0.00 <0.00> (?)`
...ero/block/node/spi/historicalblocks/LongRange.java	`100.00% <100.00%> (ø)`	`30.00 <5.00> (+5.00)`
...ock/node/messaging/BlockMessagingFacilityImpl.java	`85.93% <83.33%> (+0.16%)`	`43.00 <2.00> (ø)`
.../org/hiero/block/node/backfill/BackfillRunner.java	`98.30% <98.30%> (ø)`	`31.00 <31.00> (?)`
...ero/block/node/backfill/BackfillTaskScheduler.java	`95.12% <95.12%> (ø)`	`16.00 <16.00> (?)`
...ava/org/hiero/block/node/backfill/GapDetector.java	`93.54% <93.54%> (ø)`	`11.00 <11.00> (?)`
...ill/client/BlockStreamSubscribeUnparsedClient.java	`59.49% <42.85%> (-3.18%)`	`4.00 <1.00> (ø)`
...lock/node/backfill/BackfillPersistenceAwaiter.java	`89.47% <89.47%> (ø)`	`16.00 <16.00> (?)`
...org/hiero/block/node/backfill/BackfillFetcher.java	`95.10% <95.10%> (ø)`	`34.00 <34.00> (?)`
... and 3 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

jsync-swirlds · 2026-01-15T01:14:32Z

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java

+// We use the https://github.com/palantir/palantir-java-format formatter.
+// @ConfigProperty(defaultValue = "10") – force a line break after the annotation.
+// It is a non-configurable format without complete documentation
+// we might consider switching to another formatter in the future.


This line isn't true; we're pretty locked to the Palantir formatter for all Hiero projects.

jsync-swirlds · 2026-01-15T01:17:31Z

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java

+                        nodeConfig.address(),
+                        attempt,
+                        maxRetries,
+                        e.getMessage(),


I still advocate using e in preference to e.getMessage() in log messages. Too many standard exceptions have no message (or an incomplete message) and rely on the type name (which is included in the toString() output) to make the "message" useable.

Non-blocking, however.

AlfredoG87 self-assigned this Jan 1, 2026

AlfredoG87 modified the milestones: 0.27.0, 0.26.0 Jan 1, 2026

AlfredoG87 added the Block Node Issues/PR related to the Block Node. label Jan 1, 2026

AlfredoG87 changed the title ~~refactor(backfill): Backfill Plugin Major Refactoring~~ refactor(backfill): Backfill Plugin Major Refactor Jan 2, 2026

AlfredoG87 changed the title ~~refactor(backfill): Backfill Plugin Major Refactor~~ refactor(backfill): Backfill Plugin Major Refactor And Improvements Jan 2, 2026

AlfredoG87 marked this pull request as ready for review January 2, 2026 22:44

AlfredoG87 requested review from a team as code owners January 2, 2026 22:44

AlfredoG87 requested review from Nana-EC, ata-nas, jsync-swirlds and mustafauzunn January 2, 2026 22:44

AlfredoG87 added the Improvement Code changes driven by non business requirements label Jan 3, 2026

ata-nas reviewed Jan 5, 2026

View reviewed changes

AlfredoG87 requested a review from ata-nas January 6, 2026 04:46

AlfredoG87 force-pushed the backfill-improvements2 branch from 0d1abe4 to b5048d2 Compare January 6, 2026 05:50

AlfredoG87 added 11 commits January 5, 2026 23:50

refactor(backfill): consolidate fetcher and remove BackfillGrpcClient

c00a6b2

Merge gRPC client functionality into BackfillFetcher. Use configurable health penalty and backoff settings. Remove redundant BackfillGrpcClient. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

test(backfill): add timeout annotations to prevent test hangs

178c8d4

Add @timeout annotations to all test classes (30s for integration, 5s for unit tests) to fail fast if tests hang instead of blocking indefinitely. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

docs(backfill): update documentation for dual scheduler architecture

db1d66f

Update configuration docs with new queue capacity and health settings. Update design docs to reflect dual scheduler architecture. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

stylefix

48bcb8a

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Restoring Persistence Backpressure to blocks by Encapsulating that lo…

474db3e

…gic into BackfillPersistenceAwaiter class. Improved logs overall for the plugin. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Added and improved UT coverage for new components

c44f0c3

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

further UT improvements and deduplication

8fc0974

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

further UT improvements and deduplication

d1947f4

Simplified TypedGap and GapType into nested classes of GapDetector for readability and code reduction Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

small nits

40c838c

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

AlfredoG87 requested a review from jsync-swirlds January 9, 2026 04:40

mustafauzunn reviewed Jan 9, 2026

View reviewed changes

block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java Show resolved Hide resolved

jsync-swirlds reviewed Jan 10, 2026

View reviewed changes

ata-nas reviewed Jan 13, 2026

View reviewed changes

AlfredoG87 added 8 commits January 13, 2026 13:29

- disabled spotless on BackfillConfiguration due to known issue with …

a73247d

…formatter with this file. - Applied logging performance convention, changed all logging to use built-in string interpolation within the logger. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

PR Review suggestions

0f274e4

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

as per jjohanes review, we should have proto files in its default fol…

e107cae

…der, also improved the logging as per jsync suggestion to use the comments on the PBJ generated classes as javadocs Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

removing unnecessary require on module-info.java

27bd0ae

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

restoring requires org.antlr.antlr4.runtime as is really needed for q…

febb305

…ualityCheck for now as the current PBJ is improve logging on final failed retries to a peer Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

use real metrics and explicit types in BackfillFetcherTest

3dd263c

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

add client reference assertion in stale connection test

b737359

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

jsync-swirlds reviewed Jan 13, 2026

View reviewed changes

AlfredoG87 added 6 commits January 13, 2026 21:35

Test Improvements and removal of test-only method from BackfillPersis…

ac4ada8

…tenceAwaiter Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

fix display names and improve test clarity per PR review

8299df1

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

removing redundant test

bd5364a

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Address PR review feedback: remove redundant config test, clarify tes…

b69b335

…t names, use assertSame for reference checks Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

codacy suggestion

c4739f9

Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

Address jsync PR review feedback (latest)

6989497

- Simplify node logging using PBJ toString() - Use log format replacement instead of .formatted() - Add missing newline at EOF (block_node_source.proto) Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>

AlfredoG87 requested review from ata-nas, jjohannes, jsync-swirlds and mustafauzunn January 14, 2026 06:48

jsync-swirlds reviewed Jan 15, 2026

View reviewed changes

	LOGGER.log(TRACE, "Final failure stack trace:", e);
	LOGGER.log(TRACE, "Final failure stack trace.\n", e);

	tuning != null ? "custom" : "defaults");
	tuning == null ? "defaults" : tuning);

refactor(backfill): Backfill Plugin Major Refactor And Improvements #2006

Are you sure you want to change the base?

refactor(backfill): Backfill Plugin Major Refactor And Improvements #2006

Uh oh!

Conversation

AlfredoG87 commented Dec 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Architecture Overview

1. Detect — GapDetector

2. Schedule — BackfillTaskScheduler

3. Execute — BackfillRunner

New Components

Configuration

Other Changes

Review guide (recommended order)

PR Stats

Related Issues

Uh oh!

ata-nas left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

AlfredoG87 commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

jsync-swirlds left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ata-nas left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AlfredoG87 Jan 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

AlfredoG87 commented Dec 31, 2025 •

edited

Loading

1. Detect — `GapDetector`

2. Schedule — `BackfillTaskScheduler`

3. Execute — `BackfillRunner`

ata-nas left a comment •

edited

Loading

AlfredoG87 commented Jan 6, 2026 •

edited

Loading

AlfredoG87 Jan 14, 2026 •

edited

Loading