57 changes: 57 additions & 0 deletions tools-and-tests/tools/docs/days-commands.md
@@ -9,6 +9,7 @@
- `download-day` - Download all record files for a specific day (v1 implementation)
- `download-days` - Download many days (v1)
- `download-days-v2` - Download many days (v2, newer implementation)
- `download-live2` - Live block download with inline validation and automatic day rollover
- `print-listing` - Print the listing for a given day from listing files
- `ls-day-listing` - Print all files in the listing for a day
- `split-files-listing` - Split a giant JSON listing (files.json) into per-day binary listing files
@@ -121,6 +122,62 @@

---

### `download-live2`

Live block download with inline validation, signature statistics, and automatic day rollover. Downloads blocks in real-time, validates them, and writes to per-day `.tar.zstd` archives.

Usage:

```
days download-live2 [-l <listingDir>] [-o <outputDir>] [--start-date <YYYY-MM-DD>] [--state-json <path>] [--stats-csv <path>] [--address-book <path>] [--max-concurrency <n>]
```

Options:
- `-l`, `--listing-dir <listingDir>` — Directory where listing files are stored (default: `listingsByDay`).
- `-o`, `--output-dir <outputDir>` — Directory where compressed day archives are written (default: `compressedDays`).
- `--start-date <YYYY-MM-DD>` — Start date (default: auto-detect from mirror node).

- `--state-json <path>` — Path to state JSON file for resume (default: `outputDir/validateCmdStatus.json`).
- `--stats-csv <path>` — Path to signature statistics CSV file (default: `outputDir/signature_statistics.csv`).
- `--address-book <path>` — Path to address book file for signature validation.
- `--max-concurrency <n>` — Maximum concurrent downloads (default: 64).

Features:
- **Auto-detect start date**: Queries the mirror node to determine the current day if `--start-date` is not specified.

- **HTTP transport**: Uses HTTP (not gRPC) for GCS downloads to avoid deadlock issues with virtual threads.
- **Inline validation**: Validates each block's running hash as it's downloaded.
- **Signature statistics**: Tracks per-day signature counts and writes to CSV (compatible with `validate-with-stats`).
- **Day rollover**: Automatically finalizes day archives at midnight and starts new ones.
- **Resume support**: Saves state periodically to allow resuming after interruption.
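
The day rollover feature can be pictured with a small sketch. It is not the tool's implementation: the class `DayRolloverSketch` and its methods are hypothetical names, and taking day boundaries in UTC is an assumption, since the text above only says "midnight". The sketch tracks the day of the most recent block and reports when a block belongs to a later day, which is the point where the real command would finalize one archive and start the next.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: tracks the current day and remembers which days were finalized.
final class DayRolloverSketch {
    private LocalDate currentDay; // null until the first block arrives
    private final List<LocalDate> finalizedDays = new ArrayList<>();

    /** Feeds one block timestamp; returns true if that block triggered a rollover. */
    boolean onBlock(Instant blockTime) {
        // Assumption: day boundaries are taken in UTC.
        LocalDate blockDay = blockTime.atZone(ZoneOffset.UTC).toLocalDate();
        if (currentDay == null) {
            currentDay = blockDay; // first block simply opens the first day
            return false;
        }
        if (blockDay.isAfter(currentDay)) {
            finalizedDays.add(currentDay); // a real tool would close the day's archive here
            currentDay = blockDay;
            return true;
        }
        return false;
    }

    LocalDate currentDay() {
        return currentDay;
    }

    List<LocalDate> finalizedDays() {
        return List.copyOf(finalizedDays);
    }

    public static void main(String[] args) {
        DayRolloverSketch tracker = new DayRolloverSketch();
        tracker.onBlock(Instant.parse("2026-01-09T23:59:58Z"));
        boolean rolled = tracker.onBlock(Instant.parse("2026-01-10T00:00:01Z"));
        System.out.println("rolled over: " + rolled + ", finalized: " + tracker.finalizedDays());
    }
}
```

Driving the rollover from block timestamps rather than the wall clock is a design choice of this sketch; it keeps the decision deterministic when the downloader lags behind real time.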

Example:

```bash
# Start from a specific date
java -jar tools-all.jar days download-live2 \
-l /path/to/listingsByDay \
-o /path/to/compressedDays \
--start-date 2026-01-09 \
--max-concurrency 64

# Auto-detect today and run continuously
java -jar tools-all.jar days download-live2 \
-l /path/to/listingsByDay \
-o /path/to/compressedDays
```

#### State File Behavior

The command writes its state to `validateCmdStatus.json` in the output directory. This file is shared with the `validate` and `validate-with-stats` commands, and the two directions of sharing behave differently:

| Scenario | Behavior |
|--------------------------------------------|---------------------------------------------------------------------------------------------|
| `download-live2` writes → `validate` reads | ✅ Works (validate ignores extra `blockNumber` field) |
| `validate` writes → `download-live2` reads | ⚠️ `blockNumber` will be 0, so `download-live2` falls back to `--start-date` or auto-detect |

**If you run `validate` or `validate-with-stats` after `download-live2`**, the state file will be overwritten without the `blockNumber` field. When `download-live2` starts again, it will need `--start-date` to resume from the correct position, or it will auto-detect the current day from the mirror node.
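
The fallback order described above can be sketched as a small decision function. This is an illustration under assumptions, not the command's code: `ResumeDecisionSketch`, `ResumePoint`, and `resolve` are hypothetical names, and letting a non-zero `blockNumber` take precedence over `--start-date` is an assumption that the text above does not confirm.

```java
import java.time.LocalDate;

// Hypothetical sketch of how a start position could be chosen on startup.
final class ResumeDecisionSketch {
    enum Source { STATE_FILE, START_DATE, AUTO_DETECT }

    // day is null when resuming from a block number stored in the state file.
    record ResumePoint(Source source, long blockNumber, LocalDate day) {}

    /**
     * @param stateBlockNumber blockNumber read from the state file (0 if absent)
     * @param startDateOrNull  value of --start-date, or null if not given
     * @param detectedDay      current day as reported by the mirror node
     */
    static ResumePoint resolve(long stateBlockNumber, LocalDate startDateOrNull, LocalDate detectedDay) {
        if (stateBlockNumber > 0) {
            // Assumption: usable state wins over any command-line date.
            return new ResumePoint(Source.STATE_FILE, stateBlockNumber, null);
        }
        if (startDateOrNull != null) {
            return new ResumePoint(Source.START_DATE, 0L, startDateOrNull);
        }
        return new ResumePoint(Source.AUTO_DETECT, 0L, detectedDay);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2026, 1, 12);
        // State file was overwritten by validate, so blockNumber is 0.
        System.out.println(resolve(0L, LocalDate.of(2026, 1, 9), today));
        System.out.println(resolve(0L, null, today));
    }
}
```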

---

### `print-listing`

Prints a curated listing for a single day from listing files created by the download process.
@@ -6,6 +6,7 @@
import org.hiero.block.tools.days.subcommands.DownloadDay;
import org.hiero.block.tools.days.subcommands.DownloadDays;
import org.hiero.block.tools.days.subcommands.DownloadDaysV2;
import org.hiero.block.tools.days.subcommands.DownloadDaysV3;
import org.hiero.block.tools.days.subcommands.DownloadLive;
import org.hiero.block.tools.days.subcommands.DownloadLive2;
import org.hiero.block.tools.days.subcommands.FixMissingSignatures;
@@ -38,6 +39,7 @@
Compress.class,
DownloadDay.class,
DownloadDaysV2.class,
DownloadDaysV3.class,
DownloadDays.class,
DownloadLive.class,
DownloadLive2.class,
@@ -46,7 +46,7 @@
import org.hiero.block.tools.utils.Gzip;
import org.hiero.block.tools.utils.Md5Checker;
import org.hiero.block.tools.utils.PrettyPrint;
-import org.hiero.block.tools.utils.gcp.ConcurrentDownloadManagerVirtualThreads;
+import org.hiero.block.tools.utils.gcp.ConcurrentDownloadManagerVirtualThreadsV3;

/**
* Manages the download, validation, and archival of blockchain blocks in live (streaming) mode.
@@ -81,7 +81,7 @@ public class LiveDownloader {
private final Path addressBookPath;
private final Path runningHashStatusPath;
private final AddressBookRegistry addressBookRegistry;
-private final ConcurrentDownloadManagerVirtualThreads downloadManager;
+private final ConcurrentDownloadManagerVirtualThreadsV3 downloadManager;
// Running previous record-file hash used to validate the block hash chain across files.
private byte[] previousRecordFileHash;
// Single-threaded executor used for background compression of per-day tar files.
@@ -145,12 +145,10 @@ public LiveDownloader(
this.addressBookRegistry = new AddressBookRegistry();
}

-Storage storage = StorageOptions.grpc()
-.setAttemptDirectPath(false)
-.setProjectId(GCP_PROJECT_ID)
-.build()
-.getService();
-this.downloadManager = ConcurrentDownloadManagerVirtualThreads.newBuilder(storage)
+// Use HTTP transport + platform threads for stability (avoids gRPC/Netty deadlocks)
+Storage storage =
+StorageOptions.http().setProjectId(GCP_PROJECT_ID).build().getService();
+this.downloadManager = ConcurrentDownloadManagerVirtualThreadsV3.newBuilder(storage)
.setMaxConcurrency(maxConcurrency)
.build();
this.compressionExecutor = Executors.newSingleThreadExecutor(r -> {
@@ -105,8 +105,8 @@ public NodeAddressBook getAddressBookForBlock(Instant blockTime) {
: addressBooks.get(i - 1).addressBook();
}
}
-// if no address book is found, return the genesis address book
-return addressBooks.getFirst().addressBook();
+// if no address book is found after the block time, return the most recent address book
+return addressBooks.getLast().addressBook();
}

/**
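
The effect of the fallback change in the hunk above can be illustrated with a standalone sketch. It does not reproduce `AddressBookRegistry`: the `DatedBook` record, the `bookFor` method, and the use of plain strings as address books are hypothetical stand-ins, and the loop is only assumed to resemble the partially shown original. The point it demonstrates is that a block later than every known address book should resolve to the most recent book rather than the genesis one.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical stand-in for a time-ordered address book registry.
final class AddressBookLookupSketch {
    // An address book (here just a label) and the time from which it applies.
    record DatedBook(Instant validFrom, String book) {}

    /** Returns the book in effect at blockTime; books must be sorted by validFrom. */
    static String bookFor(List<DatedBook> books, Instant blockTime) {
        for (int i = 0; i < books.size(); i++) {
            if (books.get(i).validFrom().isAfter(blockTime)) {
                // First book that starts after the block: use the one before it,
                // or the genesis book if the block predates everything.
                return i == 0 ? books.get(0).book() : books.get(i - 1).book();
            }
        }
        // No book starts after the block time, so the most recent book applies.
        return books.get(books.size() - 1).book();
    }

    public static void main(String[] args) {
        List<DatedBook> books = List.of(
                new DatedBook(Instant.parse("2019-09-13T00:00:00Z"), "genesis"),
                new DatedBook(Instant.parse("2023-01-01T00:00:00Z"), "second"),
                new DatedBook(Instant.parse("2025-06-01T00:00:00Z"), "third"));
        System.out.println(bookFor(books, Instant.parse("2026-01-09T12:00:00Z")));
    }
}
```

With the previous fallback, a live block newer than the last registry entry would have been checked against the genesis address book, which matters most in live mode where every block is newer than the last recorded change.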
@@ -0,0 +1,110 @@
// SPDX-License-Identifier: Apache-2.0
package org.hiero.block.tools.days.subcommands;

import static org.hiero.block.tools.days.download.DownloadConstants.GCP_PROJECT_ID;
import static org.hiero.block.tools.days.download.DownloadDayImplV2.downloadDay;
import static org.hiero.block.tools.mirrornode.DayBlockInfo.loadDayBlockInfoMap;

import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.io.File;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.Map;
import org.hiero.block.tools.metadata.MetadataFiles;
import org.hiero.block.tools.mirrornode.BlockTimeReader;
import org.hiero.block.tools.mirrornode.DayBlockInfo;
import org.hiero.block.tools.utils.gcp.ConcurrentDownloadManager;
import org.hiero.block.tools.utils.gcp.ConcurrentDownloadManagerVirtualThreadsV3;
import picocli.CommandLine.Command;
import picocli.CommandLine.Option;
import picocli.CommandLine.Parameters;

@SuppressWarnings({"FieldCanBeLocal", "FieldMayBeFinal", "CallToPrintStackTrace"})
@Command(
name = "download-days-v3",
description = "Download many days (v3, with Guava preload fix)")
public class DownloadDaysV3 implements Runnable {

@Option(
names = {"-l", "--listing-dir"},
description = "Directory where listing files are stored")
private Path listingDir = MetadataFiles.LISTINGS_DIR;

@Option(
names = {"-d", "--downloaded-days-dir"},
description = "Directory where downloaded days are stored")
private File downloadedDaysDir = new File("compressedDays");

@Option(
names = {"-t", "--threads"},
description = "How many days to download in parallel")
private int threads = 100; // Reduced from 1000 to avoid virtual thread pinning with gRPC

@Parameters(index = "0", description = "From year to download")
private int fromYear = 2019;

@Parameters(index = "1", description = "From month to download")
private int fromMonth = 9;

@Parameters(index = "2", description = "From day to download")
private int fromDay = 13;

@Parameters(index = "3", description = "To year to download")
private int toYear = LocalDate.now().getYear();

@Parameters(index = "4", description = "To month to download")
private int toMonth = LocalDate.now().getMonthValue();

@Parameters(index = "5", description = "To day to download")
private int toDay = LocalDate.now().getDayOfMonth();

@Override
public void run() {
try (BlockTimeReader blockTimeReader = new BlockTimeReader();
// Use HTTP transport + platform threads for most stable operation
Storage storage = StorageOptions.http()
.setProjectId(GCP_PROJECT_ID)
.build()
.getService();
ConcurrentDownloadManager downloadManager = ConcurrentDownloadManagerVirtualThreadsV3.newBuilder(
storage)
.setInitialConcurrency(64)
.setMaxConcurrency(threads)
.build()) {
// Load day block info map
final Map<LocalDate, DayBlockInfo> daysInfo = loadDayBlockInfoMap();
final var days = LocalDate.of(fromYear, fromMonth, fromDay)
.datesUntil(LocalDate.of(toYear, toMonth, toDay).plusDays(1))
.toList();
final long totalDays = days.size();
final long overallStartMillis = System.currentTimeMillis();
byte[] previousRecordHash = null;
for (int i = 0; i < days.size(); i++) {
final LocalDate localDate = days.get(i);
DayBlockInfo dayBlockInfo = daysInfo.get(localDate);
try {
previousRecordHash = downloadDay(
downloadManager,
dayBlockInfo,
blockTimeReader,
listingDir,
downloadedDaysDir.toPath(),
localDate.getYear(),
localDate.getMonthValue(),
localDate.getDayOfMonth(),
previousRecordHash,
totalDays,
i, // progressStart as day index (0-based)
overallStartMillis);
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException(e);

}
}
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException(e);
}
}
}
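
The date range built in `run()` relies on `LocalDate.datesUntil`, whose end argument is exclusive; adding one day to the end date makes the "to" day part of the download. A minimal sketch of that idiom follows, with `InclusiveDayRangeSketch` and `daysInclusive` as hypothetical names that do not appear in the PR.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical helper isolating the inclusive day-range idiom used in run().
final class InclusiveDayRangeSketch {
    /** Returns every day from 'from' to 'to', both ends included. */
    static List<LocalDate> daysInclusive(LocalDate from, LocalDate to) {
        // datesUntil excludes its argument, so shift the end by one day.
        return from.datesUntil(to.plusDays(1)).toList();
    }

    public static void main(String[] args) {
        List<LocalDate> days = daysInclusive(LocalDate.of(2019, 9, 13), LocalDate.of(2019, 9, 15));
        System.out.println(days.size() + " days: " + days);
    }
}
```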