Commit 5226d39

Merge branch 'dev' into feat_benchmarks
2 parents 8633369 + eef34e8 commit 5226d39

12 files changed: 595 additions, 76 deletions

Cargo.toml

Lines changed: 3 additions & 3 deletions
@@ -50,16 +50,16 @@ arrow-schema = { version = "56.2.0", features = ["serde"] }
 arrow-select = "56.2.0"
 crc32c = "0.6"
 crossbeam-skiplist = "0.1"
-fusio = { version = "0.5.0-a0", default-features = false, features = [
+fusio = { version = "0.5.0", default-features = false, features = [
     "aws",
     "dyn",
     "executor",
     "fs",
 ] }
-fusio-manifest = { version = "0.5.0-a0", package = "fusio-manifest", default-features = false, features = [
+fusio-manifest = { version = "0.5.0", package = "fusio-manifest", default-features = false, features = [
     "std",
 ] }
-fusio-parquet = { version = "0.5.0-a0", package = "fusio-parquet" }
+fusio-parquet = { version = "0.5.0", package = "fusio-parquet" }
 futures = "0.3"
 lockable = "0.2"
 once_cell = "1"

docs/overview.md

Lines changed: 4 additions & 0 deletions
@@ -650,6 +650,10 @@ Tonbo runs as a **manifest-orchestrated** system over **object storage**. The **
 - **Read path:** Resolve manifest snapshot -> parallel, columnar scans with pushdown -> optional local caching.
 - **Compaction path (background):** Select L/L+1 SSTs -> merge -> upload new SSTs -> **CAS manifest update** -> GC old objects.
 
+**Compaction defaults**
+- **Minor compaction:** enabled by default; flushes once ~4 sealed immutables accumulate, emitting L0 SSTs. Opt-out via `DbBuilder::disable_minor_compaction` is intended for tests/bulk-load tooling only.
+- **Major compaction:** not scheduled automatically yet; invoke the admin trigger or an opt-in loop when available. The planner/executor path is wired, but background scheduling is still landing.
+
 ### Why this fits Edge Compute & Shared Storage
 
 - **Immutable by design:** write-new-only + **conditional commits** match object storage semantics—no in-place mutation.
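
The minor-compaction default added in the `docs/overview.md` hunk above can be pictured as a simple threshold check. The following is a minimal sketch, not Tonbo's implementation: the `MinorCompaction` type and `should_flush` method are hypothetical names, and the threshold of 4 is taken from the "~4 sealed immutables" wording, which is itself approximate.

```rust
/// Hypothetical stand-in for the minor-compaction setting described in the
/// overview; this is NOT Tonbo's API, only an illustration of the rule.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MinorCompaction {
    /// Flush once at least this many sealed immutable memtables exist.
    Enabled { sealed_threshold: usize },
    /// Opt-out, which the docs reserve for tests and bulk-load tooling.
    Disabled,
}

impl Default for MinorCompaction {
    /// Enabled by default with a threshold of 4 (assumed from "~4").
    fn default() -> Self {
        MinorCompaction::Enabled { sealed_threshold: 4 }
    }
}

impl MinorCompaction {
    /// Returns true when the sealed immutables should be flushed to L0 SSTs.
    fn should_flush(self, sealed_immutables: usize) -> bool {
        match self {
            MinorCompaction::Enabled { sealed_threshold } => {
                sealed_immutables >= sealed_threshold
            }
            MinorCompaction::Disabled => false,
        }
    }
}
```

With the default setting, three sealed immutables do not trigger a flush and the fourth does; the disabled variant never flushes, which is why the docs limit it to tests and bulk loads.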
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# Build artifacts
+/target/
+/build/
+
+# Wrangler
+.wrangler/
+
+# Local development secrets (never commit real credentials!)
+.dev.vars
+
+# Rust lock file (optional, can be committed for reproducibility)
+# Cargo.lock
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+[package]
+name = "tonbo-cloudflare-worker"
+version = "0.1.0"
+edition = "2024"
+publish = false
+
+# Standalone workspace (not part of parent tonbo workspace)
+[workspace]
+
+[lib]
+crate-type = ["cdylib"]
+
+[dependencies]
+# Tonbo with web (WASM) features
+tonbo = { path = "../..", default-features = false, features = ["web"] }
+
+# Fusio for WebExecutor and AmazonS3 types
+fusio = { version = "0.5.0", default-features = false, features = [
+    "aws",
+    "executor-web",
+] }
+
+# Arrow for RecordBatch creation
+arrow-array = "56.2.0"
+arrow-schema = "56.2.0"
+
+# Cloudflare Workers runtime
+worker = "0.7"
+console_error_panic_hook = "0.1"
+
+# getrandom needs wasm_js feature for WASM
+getrandom = { version = "0.3", features = ["wasm_js"] }
+
+# For async streams
+futures = "0.3"
+
+# Size optimizations - Cloudflare Workers have a 10MB limit
+[profile.release]
+opt-level = "z"     # Optimize for size
+lto = "fat"         # Full link-time optimization
+strip = "symbols"   # Strip debug symbols
+codegen-units = 1   # Better optimization (slower compile)
+panic = "abort"     # Smaller panic handling
Lines changed: 239 additions & 0 deletions
@@ -0,0 +1,239 @@
+# Tonbo on Cloudflare Workers
+
+This example demonstrates running Tonbo as an embedded database on [Cloudflare Workers](https://workers.cloudflare.com/), using either Cloudflare R2 or any S3-compatible storage backend.
+
+## Prerequisites
+
+- [Rust](https://rustup.rs/) (1.90+ recommended)
+- [Node.js](https://nodejs.org/) (for wrangler CLI)
+- [wrangler](https://developers.cloudflare.com/workers/wrangler/install-and-update/) CLI: `npm install -g wrangler`
+- For local testing: [Docker](https://docker.com/) (to run LocalStack)
+
+## Quick Start
+
+### 1. Local Testing with LocalStack
+
+Start LocalStack (S3-compatible local storage):
+
+```bash
+docker run -d --name localstack -p 4566:4566 -e SERVICES=s3 localstack/localstack:3.0.2
+```
+
+Create a test bucket:
+
+```bash
+AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test \
+  aws --endpoint-url=http://localhost:4566 s3api create-bucket \
+  --bucket tonbo-test --region us-east-1
+```
+
+Set CORS (required for browser/WASM access):
+
+```bash
+AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test \
+  aws --endpoint-url=http://localhost:4566 s3api put-bucket-cors \
+  --bucket tonbo-test \
+  --cors-configuration '{"CORSRules":[{"AllowedOrigins":["*"],"AllowedMethods":["GET","PUT","HEAD","DELETE"],"AllowedHeaders":["*"],"ExposeHeaders":["ETag"],"MaxAgeSeconds":300}]}'
+```
+
+Create `.dev.vars` with test credentials:
+
+```bash
+cat > .dev.vars << 'EOF'
+TONBO_S3_ACCESS_KEY=test
+TONBO_S3_SECRET_KEY=test
+EOF
+```
+
+Run the worker locally:
+
+```bash
+npx wrangler dev
+```
+
+Test it:
+
+```bash
+# Write and read back data
+curl -X POST http://localhost:8787/write
+# Output:
+# Wrote 2 rows (alice=100, bob=200) to Tonbo DB.
+# Read back: alice = 100
+# Note: Cloudflare Workers have subrequest limits...
+```
+
+### 2. Deploy to Cloudflare with R2
+
+#### Create R2 bucket via CLI
+
+```bash
+# Login to Cloudflare (opens browser for authentication)
+npx wrangler login
+
+# Create a new R2 bucket
+npx wrangler r2 bucket create your-bucket-name
+
+# Note your account ID from the output, or get it with:
+npx wrangler whoami
+```
+
+#### Create R2 API Token (Dashboard required)
+
+R2 API tokens must be created in the Cloudflare dashboard:
+
+1. Go to [Cloudflare Dashboard](https://dash.cloudflare.com/) > R2 > Overview
+2. Click "Manage R2 API Tokens" in the right sidebar
+3. Click "Create API token"
+4. Configure the token:
+   - **Token name**: e.g., "tonbo-worker"
+   - **Permissions**: Object Read & Write
+   - **Specify bucket(s)**: Select your bucket (e.g., "your-bucket-name")
+5. Click "Create API Token"
+6. **IMPORTANT**: Copy both the Access Key ID and Secret Access Key immediately (the secret is only shown once!)
+
+#### Configure wrangler.toml
+
+Update `wrangler.toml` with your R2 endpoint:
+
+```toml
+[vars]
+TONBO_S3_ENDPOINT = "https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com"
+TONBO_S3_BUCKET = "your-bucket-name"
+TONBO_S3_REGION = "auto"
+```
+
+Your Account ID can be found in:
+- R2 overview page URL: `dash.cloudflare.com/ACCOUNT_ID/r2/...`
+- Or run: `npx wrangler whoami`
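
The `TONBO_S3_ENDPOINT` value above follows a fixed pattern around the account ID. As a minimal sketch, the helper below builds that URL and rejects obviously malformed IDs; `r2_endpoint` is a hypothetical helper rather than part of the example worker, and the alphanumeric check is an assumed sanity rule, not a documented Cloudflare constraint.

```rust
/// Builds the S3-compatible R2 endpoint from an account ID, following the
/// `https://YOUR_ACCOUNT_ID.r2.cloudflarestorage.com` pattern in the README.
/// Hypothetical helper: the validation rule (non-empty, ASCII alphanumeric)
/// is an assumption made for illustration.
fn r2_endpoint(account_id: &str) -> Option<String> {
    let id = account_id.trim();
    if id.is_empty() || !id.chars().all(|c| c.is_ascii_alphanumeric()) {
        return None;
    }
    Some(format!("https://{}.r2.cloudflarestorage.com", id))
}
```

Returning `None` for a malformed ID lets a placeholder such as an empty string fail at configuration time instead of producing a confusing request error later.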
+
+#### Set secrets
+
+```bash
+npx wrangler secret put TONBO_S3_ACCESS_KEY
+# Paste your R2 Access Key ID when prompted
+
+npx wrangler secret put TONBO_S3_SECRET_KEY
+# Paste your R2 Secret Access Key when prompted
+```
+
+#### Deploy
+
+```bash
+npx wrangler deploy
+```
+
+#### Test the deployment
+
+```bash
+# Write data and read it back
+curl -X POST https://your-worker.your-subdomain.workers.dev/write
+
+# Expected output:
+# Wrote 2 rows (alice=100, bob=200) to Tonbo DB.
+# Read back: alice = 100
+# Note: Cloudflare Workers have subrequest limits...
+```
+
+## How It Works
+
+This example uses Tonbo with:
+
+- **WebExecutor**: Fusio's executor for browser/WASM environments
+- **AmazonS3**: S3-compatible filesystem backend (works with R2, LocalStack, MinIO, etc.)
+- **Arrow RecordBatch**: For efficient columnar data storage
+
+The worker demonstrates:
+1. **POST /write** - Opens database, inserts 2 rows, reads one back (in same request)
+2. **GET /read** - Opens database, queries specific keys
+3. **GET /debug** - Lists files in S3/R2 bucket
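
The three endpoints listed above amount to a small method-and-path dispatch table. The sketch below shows that dispatch in plain Rust with no dependencies; it is not the code in `src/lib.rs` (which uses the `worker` crate), and the `Route` enum and `route` function are hypothetical names introduced for illustration.

```rust
/// Hypothetical route table mirroring the three endpoints in the README.
#[derive(Debug, PartialEq, Eq)]
enum Route {
    /// POST /write: open the database, insert rows, read one back.
    Write,
    /// GET /read: open the database and query specific keys.
    Read,
    /// GET /debug: list files in the S3/R2 bucket.
    DebugListing,
    /// Known path, wrong HTTP method.
    MethodNotAllowed,
    /// Unknown path.
    NotFound,
}

/// Maps an HTTP method and path to a route.
fn route(method: &str, path: &str) -> Route {
    match (method, path) {
        ("POST", "/write") => Route::Write,
        ("GET", "/read") => Route::Read,
        ("GET", "/debug") => Route::DebugListing,
        (_, "/write") | (_, "/read") | (_, "/debug") => Route::MethodNotAllowed,
        _ => Route::NotFound,
    }
}
```

Keeping the mapping in a pure function makes it testable on the host target, which helps because the `executor-web` feature only builds for wasm32.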
+
+## Important Limitations
+
+### Cloudflare Workers Subrequest Limit
+
+Cloudflare Workers limit HTTP subrequests per invocation. Tonbo operations (opening database, reading manifests, WAL replay, scanning data) each require HTTP requests to S3/R2.
+
+| Plan | Subrequests | Tonbo Compatibility |
+|------|-------------|---------------------|
+| Free | 50 | Limited - write+read in same request |
+| Paid | 1,000 | Good - separate operations should work |
+
+**On the free tier (50 subrequests):**
+- Write + single read works in one request ✓
+- Multiple reads may exceed the limit ✗
+- Separate read requests may fail if state accumulated ✗
+
+**On the paid tier (1,000 subrequests):**
+- Most Tonbo operations should work fine
+- Separate read requests after writes should work
+- Consider compaction to reduce files over time for best performance
+
+**Workarounds for free tier:**
+- Do write and immediate verification read in the same request
+- Use Cloudflare Durable Objects for more complex operations
+- Upgrade to paid plan for production use
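
The free-tier and paid-tier behaviour above comes down to comparing an estimated subrequest count against the plan limit. The following is a rough sketch of that arithmetic: the limits (50 and 1,000) come from the table, but the `Estimate` struct and every per-operation cost used with it are hypothetical placeholders, not measurements of Tonbo.

```rust
/// Plans and their per-invocation subrequest limits, as given in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Plan {
    Free,
    Paid,
}

impl Plan {
    fn subrequest_limit(self) -> u32 {
        match self {
            Plan::Free => 50,
            Plan::Paid => 1_000,
        }
    }
}

/// Assumed subrequest costs per operation. These numbers are NOT measured;
/// supply your own after observing a real worker.
struct Estimate {
    open_db: u32,
    per_write: u32,
    per_read: u32,
}

/// Returns true if one invocation doing `writes` writes and `reads` reads
/// is expected to stay within the plan's subrequest limit.
fn fits(plan: Plan, est: &Estimate, writes: u32, reads: u32) -> bool {
    let total = est
        .open_db
        .saturating_add(est.per_write.saturating_mul(writes))
        .saturating_add(est.per_read.saturating_mul(reads));
    total <= plan.subrequest_limit()
}
```

With placeholder costs of 20 to open the database, 10 per write and 15 per read, one write plus one read totals 45 and fits the free tier, while one write plus three reads totals 75 and does not; both fit the paid tier.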
+
+## Size Optimization
+
+Cloudflare Workers cap the deployed bundle size (10MB after compression on the paid plan; the free plan's limit is lower). This example includes optimizations in `Cargo.toml`:
+
+```toml
+[profile.release]
+opt-level = "z"     # Optimize for size
+lto = "fat"         # Full link-time optimization
+strip = "symbols"   # Strip debug symbols
+codegen-units = 1   # Better optimization
+panic = "abort"     # Smaller panic handling
+```
+
+Current build size: ~6MB (well under the limit).
+
+## Troubleshooting
+
+### "SignatureDoesNotMatch" errors
+
+This error means the storage service computed a different request signature than the one the worker sent: the endpoint was reached, but the signed request did not validate. Ensure:
+- Credentials (`TONBO_S3_ACCESS_KEY`, `TONBO_S3_SECRET_KEY`) are set properly
+- The endpoint URL and region match the bucket
+- CORS is configured on the bucket (see setup steps above)
+
+### Build errors with fusio
+
+This example requires a patched version of fusio with WASM fixes. See [fusio PR #257](https://github.com/tonbo-io/fusio/pull/257). The `[patch.crates-io]` section in `Cargo.toml` handles this.
+
+### WAL persistence limitation
+
+Currently, small files (like WAL segments < 5MB) are only persisted to S3 on `close()`, not on `flush()`. This means:
+- Write + read in the **same request** works (data in memory)
+- Write in one request, read in another may not see the data if `close()` wasn't called
+
+This is a known limitation in fusio's S3Writer. A fix is needed to make `flush()` persist small files, but it requires careful handling of the append code path. See the fusio repository for updates.
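
The flush-versus-close behaviour described above can be illustrated with a toy model. This is not fusio's `S3Writer`: `ToyWriter` and `PART_THRESHOLD` are hypothetical, and the sketch assumes that buffers at or above 5MB are uploaded on `flush()` while smaller ones wait for `close()`, which is one reading of the limitation as stated.

```rust
/// Assumed size below which buffered data is not uploaded on flush (5MB).
const PART_THRESHOLD: usize = 5 * 1024 * 1024;

/// Toy model of a writer that only tracks byte counts, not real data.
#[derive(Default)]
struct ToyWriter {
    /// Bytes held in memory, not yet visible to other requests.
    buffered: usize,
    /// Bytes that have reached object storage.
    persisted: usize,
}

impl ToyWriter {
    /// Buffers `len` more bytes in memory.
    fn write(&mut self, len: usize) {
        self.buffered += len;
    }

    /// Uploads the buffer only when it has reached the threshold.
    fn flush(&mut self) {
        if self.buffered >= PART_THRESHOLD {
            self.persisted += self.buffered;
            self.buffered = 0;
        }
    }

    /// Uploads whatever remains and returns the total persisted bytes.
    fn close(mut self) -> usize {
        self.persisted += self.buffered;
        self.buffered = 0;
        self.persisted
    }
}
```

In this model a 1KB WAL segment stays at zero persisted bytes after `flush()` and only becomes durable on `close()`, which matches the symptom of a later request not seeing the write.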
+
+### "error: executor-web is only supported on wasm32 targets"
+
+This happens when running `cargo check` without the wasm32 target. Use:
+
+```bash
+cargo check --target wasm32-unknown-unknown
+```
+
+Or just use `npx wrangler build` which handles this automatically.
+
+## Project Structure
+
+```
+examples/cloudflare-worker/
+├── Cargo.toml      # Dependencies and build config
+├── wrangler.toml   # Cloudflare Workers configuration
+├── src/
+│   └── lib.rs      # Worker implementation
+├── .dev.vars       # Local development secrets (git-ignored)
+└── .gitignore
+```
+
+## Learn More
+
+- [Tonbo Documentation](https://docs.rs/tonbo)
+- [Cloudflare Workers Documentation](https://developers.cloudflare.com/workers/)
+- [Cloudflare R2 Documentation](https://developers.cloudflare.com/r2/)
+- [Fusio Documentation](https://docs.rs/fusio)
