@TDemeco (Contributor) commented on Jan 26, 2026

We received a more than valid concern that the process a BSP has to go through to stop storing a certain file is too complicated to do by hand, since it requires crafting several proofs and waiting a certain number of blocks in the middle of it all. So, in this PR we:

  • Add a new RPC bspStopStoringFile that allows BSP runners to easily initiate and complete the process to stop storing a certain file key, whether because they want to or because they have been required to.
    • This RPC works by queuing a RequestBspStopStoringRequest in the Blockchain Service's persistent state. The request is then processed when the forest root write lock becomes available, ensuring atomic proof generation and submission. We were not able to emit Blockchain Service commands from the RPC because of a circular dependency, which we also solve in this PR (more info below).
    • The task itself leverages the existing forest root write lock mechanism to ensure that concurrent stop storing requests don't invalidate each other's proofs. The task consists of two handlers:
      1. The first handler gets triggered by receiving the ProcessBspRequestStopStoring Blockchain Service event which is emitted when the forest root write lock is available and a request is pending. This handler is the one in charge of getting the file's metadata and generating the required inclusion proof to then be able to call the bsp_request_stop_storing extrinsic.
      2. The second handler reacts to the ProcessBspConfirmStopStoring event, which is emitted when the on-chain event BspRequestedToStopStoring is detected and the minimum wait period has passed. Before submitting the confirm, the handler checks on-chain that the pending stop storing request still exists (to handle reorgs or manual confirmations). It then generates a new inclusion proof (since the BSP's forest could have changed in the meantime) and sends the bsp_confirm_stop_storing extrinsic.
    • Both phases use a queue-based processing system with persistent storage (RocksDB-backed deques) to ensure requests survive node restarts. The queues are processed sequentially with the forest root write lock to prevent concurrent proof invalidation.
    • If an extrinsic fails due to a proof-related error (e.g., ForestProofVerificationFailed, FailedToApplyDelta from fisherman mutations), the request is automatically requeued for retry with a maximum of 3 attempts.
    • The confirm stop storing queue uses a peek_front method to check whether the first item's tick has been reached without modifying the queue, since items are chronologically ordered (sketched after this list).
    • An RAII guard (ForestLockGuard) is used to automatically release the forest root write lock when the handler completes, regardless of how it returns (success, error, or panic); see the sketch after this list.
  • Add two new runtime APIs for the File System pallet:
    • query_min_wait_for_stop_storing: Query the minimum number of ticks that a BSP has to wait between requesting to stop storing a file and being able to confirm that it has stopped storing it.
    • has_pending_stop_storing_request: Check if a pending stop storing request exists for a BSP and file key. This is used to verify the request still exists before submitting the confirm extrinsic.
    • These also have their complementary Blockchain Service commands QueryMinWaitForStopStoring and HasPendingStopStoringRequest.
  • Add a new RPC method getAllStoredFileKeys as a helper for a BSP runner that wants to obtain the list of file keys it is currently storing. This, together with the bspStopStoringFile RPC, will be the way for a BSP to sign off for now, until we design a simpler mechanism for a BSP to stop storing all of its files (a usage sketch follows this list).
  • The big refactor: made it so that the RPC layer has access to the Blockchain Service and can therefore emit Blockchain Service commands. To do this we:
    • Made the RpcHandlers used by the Blockchain Service to send extrinsics an Arc<RwLock<Option<Arc<RpcHandlers>>>> instead of an Arc<RpcHandlers>. This means we no longer have to set them when creating the Blockchain Service; we can set them afterwards using the SetRpcHandlers command (see the sketch after this list).
    • Tasks that want to send extrinsics through the Blockchain Service now have to acquire a read lock on the RPC handlers, which should be fine, as the only write lock we acquire is via the aforementioned command, and that is only called once during initialization.
    • The RPC configuration now has a pub blockchain: Option<ActorHandle<BlockchainService<FSH, Runtime>>> field. This field is optional because, for example, the fisherman actor spawns the RPC service but does not have a Blockchain Service running.
    • Fixed a race condition, introduced by this change, for Leader nodes with a pending transactions DB: startup resubscription of pending transactions is now deferred until SetRpcHandlers is called, preventing "RPC handlers not yet available" errors during initialization.
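
A few illustrative sketches of the mechanisms above follow; none of them are the actual implementation. First, the confirm queue's peek-based check, here using an in-memory VecDeque as a stand-in for the RocksDB-backed deque (the types and function name are assumptions):

    use std::collections::VecDeque;

    /// Illustrative stand-in for the actual file key type.
    type FileKey = [u8; 32];

    /// Items are pushed in chronological order, so peeking at the front is
    /// enough to know whether anything is ready to be confirmed.
    fn next_ready_confirm(
        queue: &mut VecDeque<(u64, FileKey)>, // (earliest confirmable tick, file key)
        current_tick: u64,
    ) -> Option<(u64, FileKey)> {
        match queue.front() {
            // The front item's tick has been reached: pop it for processing.
            Some(&(tick, _)) if tick <= current_tick => queue.pop_front(),
            // Otherwise leave the queue untouched and try again on a later tick.
            _ => None,
        }
    }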
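
Next, the RAII guard. This is a minimal sketch of the pattern only: the release mechanism (a oneshot channel here) and the handler signature are assumptions, but the point stands regardless of what handle the real ForestLockGuard holds:

    use tokio::sync::oneshot;

    /// Releases the forest root write lock by signalling on Drop, so the lock
    /// is handed back on success, on an early `?` return, and during panic
    /// unwinding alike.
    struct ForestLockGuard {
        release: Option<oneshot::Sender<()>>,
    }

    impl ForestLockGuard {
        fn new(release: oneshot::Sender<()>) -> Self {
            Self { release: Some(release) }
        }
    }

    impl Drop for ForestLockGuard {
        fn drop(&mut self) {
            // If the receiver is already gone there is nothing left to release.
            if let Some(tx) = self.release.take() {
                let _ = tx.send(());
            }
        }
    }

    // Sketch of a handler that holds the guard for its whole body.
    async fn process_stop_storing_request(_guard: ForestLockGuard) -> anyhow::Result<()> {
        // ... fetch the file metadata, generate the inclusion proof, submit the extrinsic ...
        Ok(())
        // `_guard` is dropped here (or on any early return), releasing the lock.
    }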
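
Then, what signing off with the two new RPCs could look like from a BSP runner's side, using a plain jsonrpsee client. The storagehubclient_ method prefix, the parameter and return types, and the node URL are all assumptions; check the actual RPC definitions for the exact signatures:

    use jsonrpsee::core::client::ClientT;
    use jsonrpsee::http_client::HttpClientBuilder;
    use jsonrpsee::rpc_params;

    #[tokio::main]
    async fn main() -> anyhow::Result<()> {
        let client = HttpClientBuilder::default().build("http://localhost:9944")?;

        // List every file key this BSP is currently storing.
        let file_keys: Vec<String> = client
            .request("storagehubclient_getAllStoredFileKeys", rpc_params![])
            .await?;

        // Ask the node to run the full request -> wait -> confirm flow for each key.
        for file_key in &file_keys {
            let _: () = client
                .request("storagehubclient_bspStopStoringFile", rpc_params![file_key])
                .await?;
        }

        Ok(())
    }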
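
Finally, a rough sketch of the RpcHandlers refactor, showing why reads stay cheap for tasks while the single write happens once at startup. The struct, field, and method names are stand-ins for the real Blockchain Service, and tokio's RwLock is assumed:

    use std::sync::Arc;
    use tokio::sync::RwLock;

    struct RpcHandlers; // stand-in for the actual RPC handlers type

    struct BlockchainService {
        // Starts out as None; filled in later via the SetRpcHandlers command.
        rpc_handlers: Arc<RwLock<Option<Arc<RpcHandlers>>>>,
    }

    impl BlockchainService {
        /// The only write-lock acquisition, performed once during initialization.
        async fn handle_set_rpc_handlers(&self, handlers: Arc<RpcHandlers>) {
            *self.rpc_handlers.write().await = Some(handlers);
        }

        /// Tasks submitting extrinsics only ever take a read lock.
        async fn submit_extrinsic(&self) -> Result<(), &'static str> {
            let guard = self.rpc_handlers.read().await;
            let _handlers = guard.as_ref().ok_or("RPC handlers not yet available")?;
            // ... build and submit the extrinsic through `_handlers` ...
            Ok(())
        }
    }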

⚠️ Breaking Changes ⚠️

  • Short description

    There are two new runtime APIs under the File System pallet: query_min_wait_for_stop_storing and has_pending_stop_storing_request. Runtime managers of runtimes that use StorageHub will have to implement them.
    There has also been a change in the parameters received by init_sh_builder, so any node managers that use it will have to be updated.
    Finally, there are new columns in the node's permanent RocksDB storage.

  • Who is affected

    • Runtime managers of runtimes that use StorageHub, since the new runtime APIs have to be implemented
    • Node managers that use the StorageHub node and instantiate it with init_sh_builder and finish_sh_builder_and_run_tasks
  • Suggested code changes

    Implement the runtime APIs:

    impl pallet_file_system_runtime_api::FileSystemApi<Block, AccountId, BackupStorageProviderId, MainStorageProviderId, H256, BlockNumber, ChunkId, BucketId, StorageRequestMetadata, BucketId, StorageDataUnit, H256> for Runtime {
        ...

        fn query_min_wait_for_stop_storing() -> BlockNumber {
            FileSystem::query_min_wait_for_stop_storing()
        }

        fn has_pending_stop_storing_request(bsp_id: BackupStorageProviderId<Runtime>, file_key: H256) -> bool {
            FileSystem::has_pending_stop_storing_request(bsp_id, file_key)
        }

        ...
    }
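
    For context, on the node side these runtime APIs are reached through the usual runtime API accessor; a rough sketch, assuming client implements ProvideRuntimeApi, the trait above is in scope, and bsp_id / file_key are already at hand:

    // Sketch only; error handling and variable setup omitted.
    let at_hash = client.info().best_hash;
    let api = client.runtime_api();
    let min_wait = api.query_min_wait_for_stop_storing(at_hash)?;
    let still_pending = api.has_pending_stop_storing_request(at_hash, bsp_id, file_key)?;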

    Add the new parameters to init_sh_builder:

    // Get the base path for the node (the RocksDB root path).
    let base_path = config.base_path.path().to_path_buf().clone();

    // Set whether the node starts in maintenance mode or not.
    let maintenance_mode = false;

    // Build StorageHub.
    let (sh_builder, maybe_storage_hub_client_rpc_config) = match init_sh_builder::<R, S, Runtime>(
        &role_options,
        &indexer_options,
        &task_manager,
        file_transfer_request_protocol,
        network.clone(),
        keystore_container.keystore(),
        client.clone(),
        prometheus_registry.as_ref(),
        base_path.clone(),
        maintenance_mode,
    )
    .await?
    {
        Some((shb, rpc)) => (Some(shb), Some(rpc)),
        None => (None, None),
    };

@TDemeco added the B5-clientnoteworthy, B7-runtimenoteworthy, breaking, and D4-nicetohaveaudit⚠️ labels on Jan 26, 2026
@TDemeco requested review from ffarall and snowmead on January 27, 2026