Commit 1596a80

vilenarios and claude committed
docs: fix CDB64 documentation based on CTO feedback PE-8929
- Fix db source description: "requires locally parsing ANS-104 bundles to index discovered items" (more accurate than "requires local indexing")
- Fix incorrect "without network requests" claim - default shipped index stores partitions on Arweave, so network requests ARE made (with caching)
- Add partitioning explanation: records partitioned by first byte of binary data item ID (hex prefix 00-ff)
- Clarify that --partitioned flag creates manifest.json automatically
- Clarify upload tool workflow: uploads partitions, resolves offsets, updates manifest with arweave-bundle-item locations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 79880fd commit 1596a80

1 file changed, +13 -6 lines changed

content/build/run-a-gateway/manage/cdb64.mdx

Lines changed: 13 additions & 6 deletions
@@ -18,9 +18,9 @@ When your gateway receives a request for a data item (content inside an ANS-104

 The gateway checks multiple sources when resolving a data item ID to its root transaction. The order is controlled by `ROOT_TX_LOOKUP_ORDER`:

-1. **db** - Your local SQLite database (fastest, but requires local indexing)
+1. **db** - Your local SQLite database (fastest, but requires locally parsing ANS-104 bundles to index discovered items)
 2. **gateways** - HEAD requests to other AR.IO gateways
-3. **cdb** - CDB64 file-based index (O(1) lookup, no network required)
+3. **cdb** - CDB64 file-based index (O(1) lookup from local files or cached remote data)
 4. **graphql** - GraphQL queries to trusted gateways

 The default configuration tries each source in order until a match is found:
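
Reviewer note: the ordered fallback this hunk describes can be pictured with a minimal Python sketch. This is not the gateway's code. The function name, the resolver callables, the stub return values, and the comma-separated form of `ROOT_TX_LOOKUP_ORDER` are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical resolver type: maps a data item ID to its root transaction ID,
# or returns None when the source has no record of it.
Resolver = Callable[[str], Optional[str]]


def resolve_root_tx(
    data_item_id: str,
    lookup_order: Sequence[str],
    resolvers: dict[str, Resolver],
) -> Optional[str]:
    """Try each named source in order and return the first match."""
    for name in lookup_order:
        resolver = resolvers.get(name)
        if resolver is None:
            continue  # unknown source name: skipped (an assumption of this sketch)
        root_tx_id = resolver(data_item_id)
        if root_tx_id is not None:
            return root_tx_id
    return None


# Stub sources standing in for the real db, gateways, cdb, and graphql lookups.
resolvers = {
    "db": lambda _id: None,
    "gateways": lambda _id: None,
    "cdb": lambda _id: "ROOT_TX_FROM_CDB",
    "graphql": lambda _id: "ROOT_TX_FROM_GRAPHQL",
}

# Assumed format: a comma-separated list of source names.
order = "db,gateways,cdb,graphql".split(",")
```

With these stubs, `resolve_root_tx("some-id", order, resolvers)` stops at the `cdb` source and never reaches `graphql`, which is the point of ordering cheaper sources first.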
@@ -38,7 +38,7 @@ The shipped CDB64 index covers:
 - Data items with content types
 - Block heights 0 through 1,820,000

-This means most historical ArDrive, Akord, and similar application data will resolve instantly without network requests.
+This means most historical ArDrive, Akord, and similar application data can be resolved via the CDB64 index. The default shipped index stores partition data on Arweave, so network requests are made to fetch CDB data (with intelligent byte-range caching). For zero network latency, you can download the CDB files locally.

 ## Configuration Options

@@ -115,7 +115,7 @@ When enabled, you can add new index files to the directory without restarting yo

 ## Partitioned Indexes

-Large CDB64 indexes can be split across up to 256 partition files for better manageability. A partitioned index consists of:
+Large CDB64 indexes can be split across up to 256 partition files for better manageability. Records are partitioned by the first byte of the binary data item ID, represented as a hex prefix (00-ff). A partitioned index consists of:

 - `manifest.json` - Describes all partitions and their locations
 - `00.cdb` through `ff.cdb` - Partition files (only populated prefixes exist)
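
Reviewer note: the partitioning rule added in this hunk (first byte of the binary data item ID, written as a two-digit hex prefix) can be illustrated with a short Python sketch. The function name `partition_file_for` is hypothetical. The sketch assumes the ID arrives in its usual base64url text form and decodes to 32 bytes; the gateway's own tooling may do this differently.

```python
import base64


def partition_file_for(data_item_id: str) -> str:
    """Return the partition file name (e.g. 'a3.cdb') for a base64url-encoded ID."""
    # IDs are written as unpadded base64url; restore padding before decoding.
    padded = data_item_id + "=" * (-len(data_item_id) % 4)
    raw = base64.urlsafe_b64decode(padded)
    if len(raw) != 32:
        raise ValueError(f"expected a 32-byte ID, got {len(raw)} bytes")
    # The first byte of the binary ID selects the partition, as lowercase hex.
    return f"{raw[0]:02x}.cdb"


# Build sample IDs from known bytes so the expected partition is unambiguous.
id_low = base64.urlsafe_b64encode(bytes(32)).rstrip(b"=").decode()
id_high = base64.urlsafe_b64encode(bytes([0xFF]) + bytes(31)).rstrip(b"=").decode()
```

Here `partition_file_for(id_low)` yields `00.cdb` and `partition_file_for(id_high)` yields `ff.cdb`, matching the file range listed above.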
@@ -151,7 +151,7 @@ If you need to create CDB64 indexes for specific data sets, the gateway includes
 # Generate from CSV file
 ./tools/generate-cdb64-root-tx-index --input data.csv --output index.cdb

-# Generate partitioned index
+# Generate partitioned index (creates manifest.json automatically)
 ./tools/generate-cdb64-root-tx-index --input data.csv --partitioned --output-dir ./index/

 # Export from local SQLite database
@@ -161,6 +161,8 @@ If you need to create CDB64 indexes for specific data sets, the gateway includes
 ./tools/verify-cdb64 --index index.cdb --gateway https://arweave.net
 ```

+The `--partitioned` flag automatically shards records by ID prefix and generates the `manifest.json` with local file locations.
+
 For high-throughput generation, a Rust-backed tool is also available:

 ```bash
@@ -178,7 +180,12 @@ You can upload partitioned CDB64 indexes to Arweave for permanent, decentralized
 --concurrency 5
 ```

-This enables sharing indexes with other gateway operators or creating redundant storage.
+This tool:
+1. Uploads each partition file to Arweave via Turbo
+2. Resolves the bundle IDs and byte offsets for each partition
+3. Updates the manifest with `arweave-bundle-item` locations
+
+The resulting manifest can be shared with other gateway operators or uploaded to Arweave for decentralized index distribution.

 ## Performance Considerations
