Couchbase Lite Operations Hot Path Analysis

Executive Summary

Problem: Writing to Couchbase Lite is expensive. Each checkpoint save can trigger multiple CBL operations.

Finding: The checkpoint save is the primary CBL operation in the hot path, happening every N documents (configurable).

Hot Code Path: `_process_changes_batch()` → `checkpoint.save()`

Location

Entry: rest/changes_http.py:843-1477 - _process_changes_batch()
Checkpoint Call: rest/changes_http.py:1284, 1331, 1467, 1477
Checkpoint Class: main.py:2180-2450 - Checkpoint class

CBL Operations Per Checkpoint Save (Fallback Path)

When a checkpoint save to Sync Gateway (SG) fails (network error, SG unavailable, etc.), the system falls back to CBL storage.

Call Chain

checkpoint.save() → tries SG (PUT {keyspace}/_local/checkpoint-{uuid})
On Exception → _save_fallback(seq)
_save_fallback() → CBLStore().save_checkpoint(uuid, seq, client_id) (storage/cbl_store.py:816-839)

Operations per `save_checkpoint()` call:

1 × GET (mutable)   [_coll_get_mutable_doc @ L820]
    └─ CBLCollection_GetMutableDocument()
    
1 × SAVE            [_coll_save_doc @ L827]
    └─ CBLCollection_SaveDocumentWithConcurrencyControl()

Total: 2 CBL operations per fallback checkpoint save
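
The call chain above can be sketched as a try-remote-then-fall-back pattern. This is a minimal illustration, not the project's code: `FakeCBLStore`, `save_to_sg`, and the `ops` counter are hypothetical stand-ins, used only to show why a fallback save costs one GET plus one SAVE while a successful SG save costs no CBL operations.

```python
import asyncio


class FakeCBLStore:
    """Hypothetical in-memory stand-in for the CBL-backed store."""

    def __init__(self):
        self.docs = {}
        self.ops = 0  # counts simulated CBL operations

    def save_checkpoint(self, uuid, seq, client_id):
        doc_id = f"checkpoint-{uuid}"
        doc = dict(self.docs.get(doc_id, {}))  # 1 x GET (mutable copy)
        self.ops += 1
        doc.update({"seq": seq, "client_id": client_id})
        self.docs[doc_id] = doc  # 1 x SAVE
        self.ops += 1


async def save_to_sg(seq, sg_available):
    """Hypothetical remote save; raises when SG is unreachable."""
    if not sg_available:
        raise ConnectionError("SG unavailable")


async def save_checkpoint(store, uuid, seq, client_id, sg_available):
    try:
        await save_to_sg(seq, sg_available)  # preferred path: no CBL ops
    except Exception:
        store.save_checkpoint(uuid, seq, client_id)  # fallback: 2 CBL ops


store = FakeCBLStore()
asyncio.run(save_checkpoint(store, "abc", "42", "client-1", sg_available=True))
ops_after_success = store.ops
asyncio.run(save_checkpoint(store, "abc", "43", "client-1", sg_available=False))
ops_after_fallback = store.ops
```

In this sketch the counter stays at zero while SG is reachable and rises by two on each fallback save.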

Checkpoint Save Frequency (Configuration-Dependent)

Sequential Mode with `every_n_docs`

rest/changes_http.py:1198-1287

if every_n_docs > 0 and sequential:
    for i in range(0, len(filtered), every_n_docs):
        sub_batch = filtered[i : i + every_n_docs]
        # ... process sub_batch ...
        await checkpoint.save(since, ...)  # AFTER each sub-batch

Saves per batch: len(filtered) / every_n_docs (rounded up)

Sequential Mode with Stride (Default)

rest/changes_http.py:1290-1340

else:
    # Sequential checkpoint stride: save every N docs rather than
    # after every single doc
    checkpoint_stride = proc_cfg.get("checkpoint_stride", 100)

Saves per batch: len(filtered) / checkpoint_stride (rounded up)
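
Both sequential formulas reduce to a ceiling division. The helper below is a small sketch of that arithmetic; the name `saves_per_batch` is hypothetical, and it assumes exactly one save per full or partial sub-batch, ignoring any extra end-of-batch save the real code may perform.

```python
def saves_per_batch(n_docs: int, interval: int) -> int:
    """Checkpoint saves for one batch: ceil(n_docs / interval)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (n_docs + interval - 1) // interval


default_stride_saves = saves_per_batch(100, 100)
every_10_saves = saves_per_batch(100, 10)
partial_batch_saves = saves_per_batch(105, 10)
```

Here `interval` stands for either `every_n_docs` or `checkpoint_stride`, depending on which mode is active.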

Parallel Mode

rest/changes_http.py:1345-1400

# Wait for all tasks to complete, then checkpoint once
tasks = [...]
await asyncio.gather(*tasks)
await checkpoint.save(since, ...)  # ONCE per entire batch

Saves per batch: 1 (single checkpoint after entire parallel batch completes)
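
The gather-then-checkpoint shape can be shown with a runnable sketch. Everything here is illustrative: `process_doc` is a placeholder for the real per-document work, the semaphore is one assumed way to honor a `max_concurrent` limit, and the counter stands in for the single `checkpoint.save()` call.

```python
import asyncio


async def process_doc(doc_id, processed):
    await asyncio.sleep(0)  # placeholder for real per-document work
    processed.append(doc_id)


async def process_batch_parallel(doc_ids, max_concurrent=10):
    processed = []
    checkpoint_saves = 0
    sem = asyncio.Semaphore(max_concurrent)  # caps in-flight tasks

    async def bounded(doc_id):
        async with sem:
            await process_doc(doc_id, processed)

    # Wait for every document, then checkpoint exactly once.
    await asyncio.gather(*(bounded(d) for d in doc_ids))
    checkpoint_saves += 1  # stands in for: await checkpoint.save(since, ...)
    return processed, checkpoint_saves


doc_ids = [f"doc-{i}" for i in range(100)]
processed, saves = asyncio.run(process_batch_parallel(doc_ids))
```

Because the checkpoint sits after `asyncio.gather`, the number of saves does not grow with the batch size.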

Illustrative Impact Example

Scenario: Default Sequential Mode

Batch size: 100 documents
checkpoint_stride: 100 (default)
SG availability: 10% (saves fail 90% of the time, triggering fallback)

Per batch:

1 checkpoint save
~90% trigger fallback path
0.9 × 2 CBL ops = 1.8 CBL operations per batch

Scenario: Sequential with `every_n_docs=10`

Batch size: 100 documents
every_n_docs: 10
SG availability: 10%

Per batch:

10 checkpoint saves (100 / 10)
~90% trigger fallback per save
10 saves × 0.9 fallback rate × 2 CBL ops = 18 CBL operations per batch

⚠️ 10x more expensive than default stride mode!
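
The two scenarios follow from one expected-value formula: saves per batch × fallback rate × CBL ops per fallback save. The function below is a sketch of that calculation; `expected_cbl_ops` is a hypothetical name, and the model assumes each save fails independently at the given rate.

```python
def expected_cbl_ops(batch_size, interval, fallback_rate, ops_per_fallback=2):
    """Expected CBL operations per batch under a fixed fallback rate."""
    saves = (batch_size + interval - 1) // interval  # ceil(batch_size / interval)
    return saves * fallback_rate * ops_per_fallback


# Scenario 1: default stride of 100, SG failing 90% of the time.
default_stride_ops = expected_cbl_ops(100, 100, 0.9)

# Scenario 2: every_n_docs=10 with the same failure rate.
every_10_ops = expected_cbl_ops(100, 10, 0.9)

ratio = every_10_ops / default_stride_ops
```

The ratio between the two results depends only on the number of saves, which is why the fallback rate cancels out of the comparison.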

CBL Operation Details

Operation 1: GET (Mutable)

storage/cbl_store.py:199-209

def _coll_get_mutable_doc(db, collection_name: str, doc_id: str):
    coll = _get_collection(db, CBL_SCOPE, collection_name)
    doc_ref = lib.CBLCollection_GetMutableDocument(
        coll, stringParam(doc_id), _cbl_gError
    )
    # ...

Cost: Medium

Collection lookup (O(1))
Document fetch from CBL storage
Tracked as: "operation": "SELECT" in logs

Operation 2: SAVE

storage/cbl_store.py:212-223

def _coll_save_doc(db, collection_name: str, doc) -> None:
    coll = _get_collection(db, CBL_SCOPE, collection_name)
    doc._prepareToSave()
    ok = lib.CBLCollection_SaveDocumentWithConcurrencyControl(
        coll,
        doc._ref,
        0,  # kCBLConcurrencyControlLastWriteWins
        _cbl_gError,
    )

Cost: High

Serialization (_prepareToSave())
Disk I/O (CBL database file write)
Concurrency control check
Tracked as: "operation": "INSERT" or "operation": "UPDATE"

Load Path (On Startup)

main.py:2250-2333 - Checkpoint.load()

When the SG checkpoint fails to load (for example, on a network error), load() calls the fallback at main.py:2408-2432:

def _load_fallback(self) -> str:
    if USE_CBL:
        data = self._get_fallback_store().load_checkpoint(self._uuid)

That in turn calls load_checkpoint() at storage/cbl_store.py:780-814:

def load_checkpoint(self, uuid: str) -> dict | None:
    doc = _coll_get_doc(self.db, COLL_CHECKPOINTS, doc_id)  # 1 GET

Cost: 1 CBL operation (GET) during startup

Configuration Recommendations to Reduce CBL Ops

Priority 1: Increase `checkpoint_stride`

Default: 100 → Recommended: 500-1000

{
  "processing": {
    "checkpoint_stride": 500
  }
}

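Impact: Reduces checkpoint saves by 5-10× for batches larger than the current stride

1,000 doc batch: 2 saves (stride=500) instead of 10 (stride=100)
If in fallback mode: 4 CBL ops instead of 20
A 100 doc batch already needs only 1 save at the default stride, so it sees no change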

Priority 2: Ensure SG Connectivity

The fallback only triggers on SG errors, so every failed SG save avoided also avoids 2 CBL operations. Keep SG availability as close to 100% as possible.

Current health check: None. Consider adding SG heartbeat monitoring.
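
One possible shape for such a heartbeat is sketched below. It is an assumption, not existing code: `SGHeartbeat`, the `probe` callable, and the `unhealthy_after` threshold are all hypothetical, and a real probe would issue an HTTP request to SG instead of replaying canned results.

```python
from typing import Callable


class SGHeartbeat:
    """Tracks consecutive probe failures; `probe` returns True when SG answers."""

    def __init__(self, probe: Callable[[], bool], unhealthy_after: int = 3):
        self._probe = probe
        self._unhealthy_after = unhealthy_after
        self.consecutive_failures = 0

    def check(self) -> bool:
        """Run one probe and return True while SG is considered healthy."""
        try:
            ok = self._probe()
        except Exception:
            ok = False  # treat probe errors as failures
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures < self._unhealthy_after


# Simulated probe results standing in for real HTTP checks against SG.
results = iter([True, False, False, False, True])
monitor = SGHeartbeat(lambda: next(results), unhealthy_after=3)
health = [monitor.check() for _ in range(5)]
```

Requiring several consecutive failures before reporting SG as unhealthy keeps a single dropped request from raising an alert.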

Priority 3: Use Parallel Mode

Sequential mode can checkpoint several times per batch, whereas parallel mode checkpoints once after the whole batch completes.

{
  "processing": {
    "sequential": false,
    "max_concurrent": 10
  }
}

Impact: Single checkpoint per batch (regardless of size)

Priority 4: Disable Fallback (If SG Always Available)

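If you can guarantee that SG is always reachable, the setting below removes CBL checkpoint writes. Note that the key shown is checkpoint.enabled, which, per the impact stated after the snippet, turns off checkpoint storage entirely rather than only the CBL fallback: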

{
  "checkpoint": {
    "enabled": false
  }
}

Impact: Zero CBL operations (no checkpoint storage)

Metrics to Monitor

All checkpoint operations are tracked:

Metric	Path	Meaning
`checkpoint_saves_total`	rest/changes_http.py:1286	How many checkpoint saves attempted
`checkpoint_load_errors_total`	main.py:2318, 2331	SG load failures triggering fallback
`checkpoint_save_errors_total`	main.py:2398	SG save failures triggering fallback
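
These counters can be combined into an observed fallback rate, which is the quantity the impact scenarios treat as given. The helper below is a sketch with a hypothetical name, and it assumes both counters cover the same set of save attempts.

```python
def fallback_rate(saves_total: int, save_errors_total: int) -> float:
    """Fraction of attempted checkpoint saves that fell back to CBL."""
    if saves_total <= 0:
        return 0.0  # no saves attempted yet, so no fallbacks either
    return save_errors_total / saves_total


# Example counter values (illustrative, not measured).
rate = fallback_rate(saves_total=200, save_errors_total=180)
estimated_cbl_ops = 180 * 2  # each fallback save costs 1 GET + 1 SAVE
```

A rate that stays near zero means the fallback path is rarely exercised and CBL write cost is negligible.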

Log Events (storage/cbl_store.py):

"CBL checkpoint saved" with operation="INSERT" or operation="UPDATE"
"CBL checkpoint loaded" with operation="SELECT"
Duration in duration_ms
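
To turn these log events into per-operation totals, something like the following could work. The log format is an assumption: the sketch expects one JSON object per line with `operation` and `duration_ms` fields, which may not match the actual output. `summarize_cbl_ops` is a hypothetical helper.

```python
import json
from collections import defaultdict

CBL_OPERATIONS = {"SELECT", "INSERT", "UPDATE"}


def summarize_cbl_ops(lines):
    """Return {operation: (count, total_duration_ms)} from JSON log lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(event, dict):
            continue
        op = event.get("operation")
        if op in CBL_OPERATIONS:
            totals[op][0] += 1
            totals[op][1] += float(event.get("duration_ms", 0))
    return {op: (count, ms) for op, (count, ms) in totals.items()}


# Sample lines in the assumed format.
sample = [
    '{"message": "CBL checkpoint saved", "operation": "UPDATE", "duration_ms": 4.0}',
    '{"message": "CBL checkpoint saved", "operation": "UPDATE", "duration_ms": 6.0}',
    '{"message": "CBL checkpoint loaded", "operation": "SELECT", "duration_ms": 1.5}',
    "not json",
]
summary = summarize_cbl_ops(sample)
```

Comparing the UPDATE and INSERT counts against `checkpoint_save_errors_total` gives a quick cross-check that every fallback save reached CBL.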

Summary Table

Scenario	Checkpoints/Batch	CBL Ops/Checkpoint	Total CBL Ops/Batch
Parallel (100 docs)	1	0 (SG success)	0
Parallel (100 docs, SG fails)	1	2 (fallback)	2
Sequential (stride=100)	1	2 (fallback)	2
Sequential (stride=10)	10	2 each (fallback)	20 ⚠️
Sequential (stride=1)	100	2 each (fallback)	200 🔴

Recommendation: Use Parallel mode or increase checkpoint_stride to 500+.
