
[Feature Request]: Implement block storage partitioning for scalable volume distribution #4100

Open
@aWN4Y25pa2EK

Description


Implementation ideas

Overview

Currently, all *.ods and *.q4 datasets are stored under a single blocks/ path without filesystem partitioning. This monolithic storage approach creates scalability challenges and potential performance bottlenecks.

Risks and Challenges

  • Storage Capacity Constraints

    • Because every dataset lives under the same root path, total storage is bounded by a single filesystem; as the number of blocks grows, this becomes a major challenge for storage distribution and volume scalability.
  • Performance Bottlenecks

    • A single directory containing all blocks degrades file lookup performance
    • Potential I/O contention when multiple processes access the same directory
    • Limited ability to optimize for specific storage hardware characteristics

Proposed Enhancement

Implement a partitioning strategy that would:

  • Implement a two-level hierarchical partitioning strategy based on block hash prefixes
  • Create 256 primary partitions using the first two hexadecimal characters
  • Enable flexible volume distribution across storage resources
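
As a rough illustration of the first two bullets, here is a minimal Python sketch (the function name and layout are hypothetical, not an existing implementation) that pre-creates the 256 primary partition directories:

```python
import os

def create_partitions(root: str) -> None:
    """Pre-create the 256 primary partition directories (00/ .. FF/),
    one per two-hex-character block hash prefix."""
    for i in range(256):
        os.makedirs(os.path.join(root, f"{i:02X}"), exist_ok=True)

create_partitions("blocks")
```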

Partitioning Example

Existing structure (no partitioning; all files are stored directly under the root path blocks/):

├── 00E1584FF07A13371E6A293EAC970EF42F753C474E0737D93EF1430944227441.ods
├── 10E2584FF07A13371E6A293EAC970EF42F753C474E0737D93EF1430944227441.ods

Partitioning on the first two hexadecimal characters of the hash (00–FF) would create a structure of 256 partition directories:

blocks/
├── 00/
│   ├── E1584FF07A13371E6A293EAC970EF42F753C474E0737D93EF1430944227441.ods
├── 10/
│   ├── E2584FF07A13371E6A293EAC970EF42F753C474E0737D93EF1430944227441.ods
├── ...
├── FF/
│   ├── ...
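
The mapping from a flat filename to its partitioned location can be sketched as follows (an illustrative Python helper; `partitioned_path` is a hypothetical name, not an existing API):

```python
import os

def partitioned_path(root: str, filename: str) -> str:
    """Map a flat block filename to its two-level partitioned location:
    the first two hex characters of the hash select the partition
    directory, and the remainder is kept as the file name inside it."""
    prefix, rest = filename[:2].upper(), filename[2:]
    return os.path.join(root, prefix, rest)
```

Under this mapping, the flat file blocks/00E1584FF07A13371E6A293EAC970EF42F753C474E0737D93EF1430944227441.ods would live at blocks/00/E1584FF07A13371E6A293EAC970EF42F753C474E0737D93EF1430944227441.ods.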

Example of volume distribution with partitioning enabled:

Volume 0:  00/–0F/ (blocks starting with 0)
Volume 1:  10/–1F/ (blocks starting with 1)
Volume 2:  20/–2F/ (blocks starting with 2)
Volume 3:  30/–3F/ (blocks starting with 3)
Volume 4:  40/–4F/ (blocks starting with 4)
Volume 5:  50/–5F/ (blocks starting with 5)
Volume 6:  60/–6F/ (blocks starting with 6)
Volume 7:  70/–7F/ (blocks starting with 7)
Volume 8:  80/–8F/ (blocks starting with 8)
Volume 9:  90/–9F/ (blocks starting with 9)
Volume 10: A0/–AF/ (blocks starting with A)
Volume 11: B0/–BF/ (blocks starting with B)
Volume 12: C0/–CF/ (blocks starting with C)
Volume 13: D0/–DF/ (blocks starting with D)
Volume 14: E0/–EF/ (blocks starting with E)
Volume 15: F0/–FF/ (blocks starting with F)
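
The table above amounts to a simple prefix-to-mount lookup. The sketch below assumes hypothetical mount points /mnt/vol0 .. /mnt/vol15 purely for illustration:

```python
# Hypothetical mount points, one per first-hex-character prefix.
VOLUME_MOUNTS = {f"{i:X}": f"/mnt/vol{i}" for i in range(16)}

def volume_for(filename: str) -> str:
    """Select the backing volume for a block by the first hex
    character of its hash."""
    return VOLUME_MOUNTS[filename[0].upper()]
```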

How would this fix existing limitations?

When deploying a Data Availability (DA) node on cloud infrastructure, service providers face a critical limitation: cloud platforms typically impose a hard storage limit per volume. Since DA nodes currently store all *.ods datasets in a single root path, this creates an absolute ceiling that cannot be bypassed.

The proposed partitioning strategy provides a robust solution to these storage constraints.

Volume Distribution:

  • Creates 256 distinct indexes (00-FF) based on block hash prefixes
  • Distributes these indexes across up to 16 separate volumes (0-F)
  • Each volume handles blocks with specific prefix ranges

Example of storage capacity benefits, assuming a limit of 10 TB per block storage volume:

  • Number of volumes: 16 (one for each hex prefix)
  • Total theoretical capacity: 160TB per DA node
  • Scalability factor: 16x increase from baseline
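
Adopting this layout on an existing node would also require a one-time migration of the flat blocks/ directory. A hedged sketch (the function name and in-place move strategy are assumptions, not a proposed final design):

```python
import os
import shutil

def migrate_flat_blocks(root: str) -> None:
    """One-time migration: move each flat root/<hash>.<ext> file into
    its two-character partition subdirectory, root/<prefix>/<rest>."""
    for name in os.listdir(root):
        src = os.path.join(root, name)
        if not os.path.isfile(src):
            continue  # skip partition directories that already exist
        prefix = name[:2].upper()
        dst_dir = os.path.join(root, prefix)
        os.makedirs(dst_dir, exist_ok=True)
        shutil.move(src, os.path.join(dst_dir, name[2:]))
```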

Metadata

Labels: enhancement (New feature or request), external (Issues created by non node team members)
