Background
Woodpecker currently supports log creation, append, read, and recovery workflows, but lacks a complete log deletion lifecycle.
In real-world scenarios, upper-layer systems may need to:
- release inactive logs
- stop serving append/read requests
- reclaim local storage resources
- clean up remote object storage data
- remove metadata references
However, directly deleting all resources synchronously is risky and difficult to coordinate across distributed components.
This issue proposes introducing a mark-delete-based asynchronous cleanup workflow for logs.
Goals
Introduce a delete-log API and lifecycle state to support:
- marking logs as deleted
- preventing further read/write operations
- asynchronously cleaning up server-side local resources
- allowing client-side cleanup of metadata and object storage data
- ensuring crash-safe and retry-safe deletion semantics
Proposed Design
- Delete Log Request
Introduce a delete log RPC/API:
DeleteLog(logID)
The delete request should:
- mark the log as deleted
- reject future append/read/open requests
- trigger asynchronous cleanup tasks
Deletion should be idempotent.
- Mark-Delete First
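Deletion should NOT immediately remove all resources.
Instead, the log moves through a three-state lifecycle:
ACTIVE -> DELETING -> DELETED
The system first marks the log as DELETING, then performs cleanup asynchronously. The log becomes DELETED only after cleanup completes.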
Benefits:
- crash-safe
- retry-safe
- avoids partial delete inconsistencies
- easier recovery handling
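The lifecycle above can be sketched as a forward-only state machine. This is an illustration under assumptions: the Transition helper and the string-valued LogState type are hypothetical, and treating a same-state transition as a no-op is one way to make retries safe.

```go
package main

import "fmt"

// LogState is a hypothetical string-valued lifecycle state.
type LogState string

const (
	StateActive   LogState = "ACTIVE"
	StateDeleting LogState = "DELETING"
	StateDeleted  LogState = "DELETED"
)

// next maps each state to the only state it may move to.
var next = map[LogState]LogState{
	StateActive:   StateDeleting,
	StateDeleting: StateDeleted,
}

// Transition validates a state change. A same-state request is accepted as a
// no-op so that a retried request does not fail. Skipping a state or moving
// backwards is rejected.
func Transition(from, to LogState) (LogState, error) {
	if from == to {
		return from, nil
	}
	if n, ok := next[from]; ok && n == to {
		return to, nil
	}
	return from, fmt.Errorf("invalid transition %s -> %s", from, to)
}

func main() {
	s, err := Transition(StateActive, StateDeleting)
	fmt.Println(s, err)
	s, err = Transition(StateDeleting, StateActive) // rejected: no way back
	fmt.Println(s, err)
}
```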
- Responsibility Split
Server-side responsibilities
Woodpecker server should primarily handle:
- stop in-memory readers/writers
- reject new append/read requests
- close active file handles
- remove local fragment/WAL files
- clean local caches and runtime states
Core requirement:
Once a log enters the DELETING state, the server must no longer serve it.
Client-side responsibilities
Client side should handle:
- metadata cleanup
- object storage cleanup (S3/GCS/etc)
- upper-layer catalog cleanup
This avoids coupling object storage lifecycle into the Woodpecker server runtime.
Cleanup Workflow
Suggested async flow:
DeleteLog()
-> mark DELETING
-> stop runtime access
-> enqueue cleanup task
-> cleanup local storage
-> release runtime resources
-> client cleans metadata/object storage
-> mark DELETED
Important Questions / Open Discussions
- Reader/Writer Shutdown Semantics
Need to define behavior for:
- active appenders
- active readers
- blocked reads
- inflight writes
Questions:
- should inflight writes fail immediately?
- should readers receive EOF?
- should force-close happen synchronously?
- Recovery Semantics
After a broker/server restart:
- how should DELETING logs recover?
- should cleanup resume automatically?
- should partially cleaned logs continue cleanup?
Need a resumable cleanup mechanism.
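A resumable mechanism could start with a scan of the persisted lifecycle states at startup that re-enqueues every log still in DELETING. The sketch below assumes the states are already loaded into a map. PendingCleanup is a hypothetical helper, and where the states are actually loaded from is still an open question.

```go
package main

import (
	"fmt"
	"sort"
)

type LogState string

const (
	StateActive   LogState = "ACTIVE"
	StateDeleting LogState = "DELETING"
	StateDeleted  LogState = "DELETED"
)

// PendingCleanup returns the IDs of logs whose deletion started but did not
// finish before the restart, in ascending order so the result is deterministic.
func PendingCleanup(persisted map[int64]LogState) []int64 {
	var pending []int64
	for id, state := range persisted {
		if state == StateDeleting {
			pending = append(pending, id)
		}
	}
	sort.Slice(pending, func(i, j int) bool { return pending[i] < pending[j] })
	return pending
}

func main() {
	persisted := map[int64]LogState{
		1: StateActive,
		2: StateDeleting,
		3: StateDeleted,
		4: StateDeleting,
	}
	for _, id := range PendingCleanup(persisted) {
		fmt.Println("re-enqueue cleanup for log", id)
	}
}
```

Because each cleanup step is expected to be repeatable, resuming a partially cleaned log can reuse the same task as a fresh delete.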
- Metadata Persistence
Need persistent lifecycle state:
ACTIVE
DELETING
DELETED
Question: where should this state live?
- local metadata?
- external metadata store?
- object storage manifest?
- Garbage Collection Retry
Cleanup operations may fail due to:
- file locks
- transient S3 errors
- network failures
Need retry-safe GC semantics.
- Idempotency
DeleteLog must be fully idempotent.
Repeated delete requests should:
- not corrupt state
- not fail unexpectedly
- safely resume cleanup
Future Extensions
Potential future improvements:
- delayed delete / TTL delete
- reference-count-based delete
- namespace-level GC
- background compaction/cleanup workers
- soft delete retention window
- force delete mode
Expected Outcome
After implementation:
- logs can be safely retired
- runtime resources are reclaimed correctly
- local storage usage is bounded
- upper-layer systems can independently manage metadata/object storage cleanup
- deletion becomes crash-safe and operationally manageable