Worker Identification and Logger Binding
Problem
Currently, download workers are anonymous. When running with max_workers=3, the logs interleave messages from all workers, and the only way to tell which worker emitted a given line is the URL it is processing.
DEBUG | Downloading https://example.com/file1.zip
DEBUG | Downloading https://example.com/file2.zip <-- Which worker?
DEBUG | Download completed ...
Proposed Solution
Assign a unique ID (e.g., worker-1, worker-2) to each worker instance and bind it to the logger.
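To illustrate what a bound worker ID looks like in the output, here is a minimal sketch. The issue does not name the logging library behind `self._logger.bind(...)`, so this sketch assumes nothing about it and uses the standard library's `logging.LoggerAdapter` as a stand-in for `bind`. The helper `make_worker_logger`, the logger name `downloader`, and the format string are hypothetical. IDs start at 0 here, following the loop index used under Implementation.

```python
import io
import logging


def make_worker_logger(base: logging.Logger, worker_id: int) -> logging.LoggerAdapter:
    """Hypothetical helper: attach a worker ID to every record (stand-in for bind)."""
    return logging.LoggerAdapter(base, {"worker_id": f"worker-{worker_id}"})


# Capture output in memory so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# Assumed format: the worker ID is shown between the level and the message.
handler.setFormatter(logging.Formatter("%(levelname)s | %(worker_id)s | %(message)s"))

base_logger = logging.getLogger("downloader")  # hypothetical logger name
base_logger.setLevel(logging.DEBUG)
base_logger.addHandler(handler)
base_logger.propagate = False  # keep the demo output out of the root logger

# One adapter per worker, mirroring max_workers=3.
loggers = [make_worker_logger(base_logger, i) for i in range(3)]
loggers[0].debug("Downloading https://example.com/file1.zip")
loggers[1].debug("Downloading https://example.com/file2.zip")

output = stream.getvalue()
print(output, end="")
```

With this setup each line carries the ID of the worker that emitted it, so the "Which worker?" question from the Problem section is answered by the log line itself.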
Implementation
- Modify WorkerPool.start() to pass an index/ID to create_worker:

      for i in range(self._max_workers):
          worker = self.create_worker(client, worker_id=i)

- Update WorkerPool.create_worker() to bind the logger:

      def create_worker(self, client: ClientSession, worker_id: int) -> BaseWorker:
          # Bind worker_id to the logger context
          worker_logger = self._logger.bind(worker_id=f"worker-{worker_id}")
          # Pass bound logger to factory
          # DownloadWorker doesn't need code changes, it just uses the logger provided
          worker = self._worker_factory(client, worker_logger, emitter)
          return worker

Benefits
- Better Debugging: Easily filter logs by worker ID.
- Concurrency Visibility: Clearer picture of how tasks are distributed.
- Traceability: Follow a single worker's lifecycle through multiple downloads.
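As a rough sketch of the filtering and traceability benefits, the example below keeps only the records emitted by one worker. It is not the project's code: it assumes the worker ID ends up as a `worker_id` attribute on each log record, and it uses the standard library's `logging` module as a stand-in because the issue does not name the logging library. `WorkerIdFilter`, `ListHandler`, the logger name, and the `file3.zip` URL are hypothetical.

```python
import logging


class WorkerIdFilter(logging.Filter):
    """Hypothetical filter: pass only records emitted by one worker."""

    def __init__(self, worker_id: str) -> None:
        super().__init__()
        self.worker_id = worker_id

    def filter(self, record: logging.LogRecord) -> bool:
        # Records without a worker_id attribute are dropped as well.
        return getattr(record, "worker_id", None) == self.worker_id


class ListHandler(logging.Handler):
    """Collect formatted records in a list so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.lines = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


handler = ListHandler()
handler.setFormatter(logging.Formatter("%(levelname)s | %(worker_id)s | %(message)s"))
handler.addFilter(WorkerIdFilter("worker-1"))  # follow a single worker

base_logger = logging.getLogger("downloader.filter_demo")  # hypothetical name
base_logger.setLevel(logging.DEBUG)
base_logger.addHandler(handler)
base_logger.propagate = False

# LoggerAdapter stands in for the bound logger each worker would receive.
workers = [
    logging.LoggerAdapter(base_logger, {"worker_id": f"worker-{i}"})
    for i in range(3)
]
workers[0].debug("Downloading https://example.com/file1.zip")
workers[1].debug("Downloading https://example.com/file2.zip")
workers[1].debug("Download completed")
workers[2].debug("Downloading https://example.com/file3.zip")

print("\n".join(handler.lines))
```

Only the two `worker-1` lines survive the filter, which gives a single worker's sequence of events even though the messages were interleaved when they were emitted.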