Development Notes

Development Log — 2026-03-01

Features Implemented

  1. --ignore-failures — continues processing when decryption fails; logs errors with message identification (UID, From, Date, Subject) and reports them in the summary; exit code is non-zero if any failures occurred
  2. --move-failures — moves failed messages to a .failed sibling folder (e.g. INBOX → INBOX.failed), creating the folder if needed; implies continuing on decryption errors
  3. --additional-privatekey / --additional-passphrase — repeatable options to specify multiple PEM key files; on decryption failure with the primary key, additional keys are tried in order if the error looks like a key-mismatch (heuristic based on openssl error message)
  4. Unencrypted key support: load_private_key() tries loading without a passphrase first; if the key is unencrypted, the passphrase argument is ignored
  5. Message identification on errors: extract_message_info() and format_message_id() extract From, Date, Subject from headers for all error messages
  6. Ctrl-C handling — see Ctrl-C / Signal Handling below
  7. Dryrun safety — dryrun mode makes no mailbox modifications at all: no APPEND, no STORE, no folder creation, no moves
  8. Skip \Deleted messages — messages already marked \Deleted (e.g. from a previous interrupted run) are skipped to allow safe re-runs
  9. --debug flag — prints timestamped trace output for every IMAP operation to diagnose performance issues
  10. --workers N — parallel decryption within each folder via dual-connection pipeline architecture (see Parallelism Architecture)
  11. --connections N — folder-level parallelism with independent IMAP connections (see Parallelism Architecture)
  12. Background progress ticker — live throughput display every 3 seconds with active folder list when --connections > 1
  13. Per-folder and overall throughput metrics — msg/s rate in progress output, per-folder breakdown, and wall-clock rate in summary
  14. Dual-connection pipeline — when --workers > 1, each folder uses two IMAP connections: a reader (FETCH on readonly SELECT) and a writer (APPEND + batch STORE \Deleted). The reader never blocks on write operations, keeping the decrypt worker pool saturated. The writer batches \Deleted flags across 10 messages, reducing SELECT/UNSELECT cycles by 10×. Increases throughput from ~32 msg/s to ~67 msg/s with --workers 32.

Known Issues and Workarounds

1. Dovecot Maildir Dotlock Contention (APPEND to same folder)

Problem: When a folder is SELECTed via IMAP, Dovecot holds a file-level dotlock on the Maildir. Any APPEND to the same folder (even from the same connection or a separate connection) must acquire the same lock. This causes:

  • Single connection: APPEND succeeds but the server sends unsolicited * N EXISTS / * N RECENT notifications. Python's imaplib accumulates these in its internal _untagged_response buffer, eventually corrupting the response parser and causing subsequent commands (STORE, FETCH) to hang indefinitely.
  • Dual connection: The second connection's APPEND blocks waiting for the first connection's dotlock. Dovecot logs show 27.149 in locks waits, dotlock overrides, and eventual disconnection of the main connection for inactivity.

Workaround: UNSELECT the folder before each APPEND, then re-SELECT for STORE. Per-message flow:

FETCH UID (RFC822)            → while folder is SELECTed
decrypt + reconstruct         → in memory
UNSELECT                      → releases dotlock, no EXPUNGE
APPEND decrypted message      → no competing lock
SELECT folder                 → re-open for STORE
STORE +FLAGS (\Deleted)       → mark original
(repeat for next message)
CLOSE                         → expunge all \Deleted at end

UIDs are persistent across UNSELECT/SELECT cycles so the pre-fetched UID list remains valid.
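The per-message flow above can be sketched against an imapclient-style connection. This is a minimal sketch, not the tool's actual code: `replace_message` and the `decrypt` callback are hypothetical names, and the client is assumed to expose `fetch`, `unselect_folder`, `append`, `select_folder` and `add_flags` with the semantics of `imapclient.IMAPClient`.

```python
from typing import Callable

DELETED = b"\\Deleted"
RECENT = b"\\Recent"


def replace_message(client, folder: str, uid: int,
                    decrypt: Callable[[bytes], bytes]) -> None:
    """Replace one encrypted message with its decrypted copy.

    Hypothetical helper: assumes `folder` is already SELECTed on `client`
    and leaves it SELECTed again on return.
    """
    # FETCH while the folder is SELECTed.
    data = client.fetch([uid], ["RFC822", "FLAGS", "INTERNALDATE"])[uid]

    # Decrypt and reconstruct in memory.
    plaintext = decrypt(data[b"RFC822"])
    flags = [f for f in data[b"FLAGS"] if f not in (DELETED, RECENT)]

    # UNSELECT releases the folder without expunging anything.
    client.unselect_folder()
    client.append(folder, plaintext, flags=flags,
                  msg_time=data[b"INTERNALDATE"])

    # Re-SELECT so the original can be marked; the UID is still valid.
    client.select_folder(folder)
    client.add_flags([uid], [DELETED])
```

The caller would loop over the pre-fetched UID list and issue a single CLOSE at the end to expunge the marked originals.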

Rejected alternatives:

  • Three-phase batch (FETCH all → UNSELECT → batch APPEND → SELECT → batch STORE): avoids lock contention but holds all decrypted messages in memory (risk of OOM for large folders) and if interrupted between APPEND and STORE phases, leaves duplicates without originals marked \Deleted.
  • By comparison, the per-message approach that was adopted is safer: each message is fully processed (APPEND + STORE) before moving to the next, so an interruption leaves at most one duplicate, which is handled by the \Deleted skip logic on re-run.

Impact: Extra UNSELECT + SELECT per message adds ~1ms overhead. CLOSE at end expunges all \Deleted messages.

2. imaplib Unsolicited Response Handling (resolved by imapclient migration)

Problem: Python's imaplib did not properly consume unsolicited server responses (* EXISTS, * RECENT, * EXPUNGE, * FLAGS). These accumulated in imaplib._untagged_response and corrupted tagged response matching for subsequent commands.

Resolution: No longer applicable — imapclient handles unsolicited responses correctly. The UNSELECT-before-APPEND pattern (issue #1) is retained because it also addresses Dovecot dotlock contention independently of the IMAP library.

3. imaplib.IMAP4.append() Requires All 4 Arguments (resolved by imapclient migration)

Problem: The original imaplib-based code conditionally included date_time in the argument list. When internaldate was None, only 3 arguments were passed — final_message (bytes) was interpreted as the date_time parameter, causing a TypeError.

Resolution: No longer applicable — imapclient.append() uses keyword arguments (flags=, msg_time=) so argument ordering issues cannot occur.

4. Interrupted Runs Leave \Deleted Messages

Problem: If the script is interrupted (Ctrl-C, crash, or hung connection) after APPEND but before the user runs EXPUNGE, the folder contains both the decrypted copy and the original (marked \Deleted). On the next run, the original would be decrypted again, creating a duplicate.

Fix: Messages with \Deleted in their flags are skipped. Additionally, \Deleted is stripped from flags when APPENDing decrypted copies so the new message doesn't inherit the delete marker.

5. Flags Preserved Including \Deleted from Original

Problem: The decrypted APPEND was copying all original flags including \Deleted, so the new decrypted message was immediately marked for deletion.

Fix: \Deleted is filtered out of the flags list before APPEND.

6. \Recent Flag Rejected by APPEND

Problem: Dovecot rejects APPEND commands that include the \Recent system flag: BAD [Error in IMAP command APPEND: Invalid system flag \RECENT]. Per RFC 3501, \Recent is a server-managed flag — only the server can set it; clients cannot include it in APPEND.

Fix: Both APPEND paths (main decrypt and move_message_to_failed()) now filter out \Recent from the flags list before building the APPEND flags string.
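Issues 4 to 6 all come down to flag handling, which can be illustrated with a small filter. This is a sketch under assumptions: the notes name a `clean_flags()` utility but do not show its body, so the implementation below (and the `is_deleted()` helper) is a guess at the behaviour described, accepting flags as either bytes or str.

```python
from typing import Iterable, List, Union

Flag = Union[bytes, str]

# Flags that must never be copied onto the decrypted APPEND.
_NEVER_COPIED = frozenset({b"\\deleted", b"\\recent"})


def _as_bytes(flag: Flag) -> bytes:
    return flag.encode("ascii") if isinstance(flag, str) else flag


def clean_flags(flags: Iterable[Flag]) -> List[bytes]:
    """Drop \\Deleted and \\Recent (case-insensitively), keep everything else."""
    return [f for f in map(_as_bytes, flags) if f.lower() not in _NEVER_COPIED]


def is_deleted(flags: Iterable[Flag]) -> bool:
    """True when the message is already marked \\Deleted and should be skipped."""
    return any(_as_bytes(f).lower() == b"\\deleted" for f in flags)
```

For example, `clean_flags([b"\\Seen", b"\\Deleted", "\\RECENT"])` returns `[b"\\Seen"]`, so the decrypted copy keeps its read state without inheriting the delete marker.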

7. VirtioFS Dotlock Performance

Problem: The mail is stored on /Volumes/Media/ which is bind-mounted into Docker via VirtioFS. Dotlock operations (create → link → unlink) traverse Container → Linux VM → VirtioFS → macOS → external volume, making metadata operations very slow. This causes "dotlock was overridden (locked 0 secs ago)" warnings and ~3s stalls per lock contention — even with no indexer-worker involved.

The dovecot-uidlist.lock is always a dotlock regardless of the lock_method setting (which was already fcntl). This is hardcoded in Dovecot's Maildir implementation.

Fix: Move index and control files (which contain the dotlock files) to the container's native filesystem by setting mail_index_path and mail_control_path in dovecot.conf:

mail_index_path = /tmp/dovecot-index/%{user | lower}
mail_control_path = /tmp/dovecot-control/%{user | lower}

This keeps the actual mail on the bind-mounted volume but puts all lock/index operations on fast native ext4 inside the container.

8. Dovecot FTS Indexer-Worker Dotlock Contention

Problem: After APPEND, Dovecot's indexer-worker fires asynchronously to index the new message (triggered by fts_autoindex = yes in the vendor FTS config). The indexer-worker and the IMAP process race on dovecot-uidlist.lock, causing "Our dotlock file … was overridden" warnings and "dotlock was immediately recreated under us" errors.

Attempted client-side mitigations (all insufficient):

  • time.sleep() after STORE — indexer can take unpredictable time
  • UNSELECT instead of CLOSE — avoids EXPUNGE but doesn't prevent indexer triggering on APPEND
  • Two-phase (APPEND then STORE) — indexer still races during APPEND phase
  • Three-phase (FETCH all, batch APPEND, batch STORE) — indexer still races between consecutive APPENDs

Root cause: fts_autoindex = yes in the vendor FTS config triggers the indexer-worker on every APPEND. This is a server-side issue that cannot be solved client-side.

Fix: Disable FTS auto-indexing in dovecot.conf by adding fts_autoindex = no after the !include_try directives to override the vendor default. The indexer can be triggered manually after migration is complete (doveadm index).

Note: process_limit = 0 for service indexer-worker was attempted but Dovecot 2.4.2 rejects it: process_limit must be higher than 0.

9. Ctrl-C / Signal Handling

Problem: Python's ThreadPoolExecutor registers an atexit handler (_python_exit()) that calls thread.join() on all worker threads. This means sys.exit() blocks indefinitely when pool threads are still running — even after shutdown(wait=False).

Solution — three layers:

  1. First Ctrl-C: _handle_sigint() sets the _interrupted flag via set_interrupted(). All processing loops check this flag and stop after completing the current in-progress message. Pending ThreadPoolExecutor futures are cancelled.

  2. Second Ctrl-C: calls os._exit(130) to terminate immediately, bypassing atexit handlers and stuck thread joins.

  3. Normal exit: main() uses os._exit(exit_code) instead of sys.exit() to avoid blocking on atexit handlers from lingering thread pool threads (both folder-level and inner decrypt-worker pools).

Additional measures for --connections > 1:

  • Folder-level pool uses explicit pool.shutdown(wait=False, cancel_futures=True) instead of context manager (with ThreadPoolExecutor() calls shutdown(wait=True) in __exit__)
  • Each _process_one_folder() worker checks is_interrupted() at the very top before connecting to IMAP, so queued futures bail out immediately
  • Inner decrypt worker pools in _process_parallel() also use pool.shutdown(wait=False)
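The layered Ctrl-C behaviour can be sketched as follows. The names `set_interrupted()`, `is_interrupted()` and `_handle_sigint()` come from the notes; the factory `make_sigint_handler()`, its `hard_exit` parameter (injected so the handler can be exercised without killing the process) and `shutdown_now()` are assumptions made for this illustration.

```python
import os
import signal
import threading
from concurrent.futures import ThreadPoolExecutor

_interrupted = threading.Event()


def set_interrupted() -> None:
    _interrupted.set()


def is_interrupted() -> bool:
    return _interrupted.is_set()


def make_sigint_handler(hard_exit=os._exit):
    """Build a SIGINT handler: first signal requests a stop, second one exits."""
    def _handle_sigint(signum, frame):
        if is_interrupted():
            # Second Ctrl-C: skip atexit handlers and stuck thread joins.
            hard_exit(130)
        else:
            # First Ctrl-C: processing loops poll is_interrupted() and wind down.
            set_interrupted()
    return _handle_sigint


def install_sigint_handler() -> None:
    signal.signal(signal.SIGINT, make_sigint_handler())


def shutdown_now(pool: ThreadPoolExecutor) -> None:
    """Stop accepting work and cancel queued futures without joining threads.

    cancel_futures requires Python 3.9 or later.
    """
    pool.shutdown(wait=False, cancel_futures=True)
```

Futures that are already running are not interrupted by `shutdown_now()`; only queued ones are cancelled, which is why the worker functions still need to poll `is_interrupted()` themselves.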

10. Parallel Output Management

Problem: With --connections > 1, multiple threads writing per-message \r progress updates garble the terminal output. Interleaved Processing: headers and result lines from different folders are hard to read.

Solution — quiet_progress flag:

When --connections > 1, process_folder() receives quiet_progress=True which suppresses:

  • Per-message \r progress updates (e.g. [25/238] 28.6 msg/s — UID 25: decrypted)
  • "Processing N encrypted messages with M workers ..." banner
  • "Stopping early due to interrupt" messages (one per connection)
  • Final print(flush=True) newline after \r progress

What IS shown with --connections > 1:

  • Processing: FolderName ... header for each folder (thread-safe via _print_lock)
  • Per-folder result line showing total messages and encrypted count for every folder (plus decrypted count and msg/s rate for folders with encrypted messages)
  • Background ticker every 3 seconds showing aggregate throughput and active folder names
  • Error messages for decryption failures
  • Summary with wall-clock time, overall rate, per-connection rate, and per-folder breakdown for all processed folders

11. Active Folder Tracking

Problem: The background progress ticker needs to show which folders are actively being processed. Naively tracking from the start of _process_one_folder() shows folders that are still connecting or scanning (no encrypted messages) as "active".

Solution: The on_decrypt_start callback in process_folder() is invoked only when encrypted messages are found and decryption is about to begin, passing the encrypted count. decrypt-smime.py passes on_decrypt_start=lambda enc: _add_active_folder(display_name, enc) so the folder only appears in the active dict during actual decryption work, along with its total encrypted count for progress tracking. _remove_active_folder() in the finally block removes it when done (safe no-op if never added).

12. Immediate Scan Results Visibility

Problem: With --connections > 1, per-folder scan counts (total messages, encrypted count) were only printed after the entire folder finished processing. Folders with encrypted messages that took a long time to decrypt would show in the [active:] ticker but with no context about how many messages they had. This made it unclear whether a folder had work to do or was just slow.

Solution: The on_scan_complete callback in process_folder() is invoked immediately after the scan phase with (total_messages, encrypted_count). _process_one_folder() prints scan counts right away (e.g. Archives/2012: 2742 messages, 126 encrypted), then prints a separate decrypt result line when the folder finishes (e.g. Archives/2012: 124 decrypted, 19.7 msg/s). This gives immediate visibility into which folders have encrypted messages and how many, before decryption even begins.

13. imaplib.append() Does Not Quote Folder Names (resolved by imapclient migration)

Problem: Python's imaplib.IMAP4.append() passed the mailbox name directly to the IMAP command without quoting. For folders with spaces (e.g. My Folder), the server received APPEND My Folder (flags) ... and parsed My as the mailbox and Folder as the next argument, returning [TRYCREATE] Mailbox doesn't exist: My.

Resolution: No longer applicable — imapclient quotes folder names correctly in all operations including append(), select_folder(), and list_folders().

14. Python 3.12+ email.policy.default Rejects CR/LF in Address Headers

Problem: Python 3.12 introduced strict RFC 5322 validation in email.policy.default. When parsing message headers with this policy, accessing address fields (From, To, etc.) that contain CR or LF characters from folded headers raises ValueError: invalid arguments; address parts cannot contain CR or LF. This caused ERROR processing INBOX: invalid arguments; address parts cannot contain CR or LF when extract_message_info() or is_smime_encrypted() accessed headers on real-world messages with non-standard folding.

Fix: Switched both is_smime_encrypted() and extract_message_info() from email.policy.default to email.policy.compat32, which does not enforce strict address validation. Additionally, extract_message_info() now wraps each header access in try/except so that even if an individual header is malformed, the other fields are still extracted and processing continues with an <invalid {field} header> placeholder.
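The defensive header access described in the fix might look roughly like this. It is a sketch rather than the project's code: the return type (a plain dict), the whitespace normalisation and the empty string for a missing header are assumptions; only the compat32 policy, the per-field try/except and the placeholder text come from the notes.

```python
import email
import email.policy


def extract_message_info(raw: bytes) -> dict:
    """Pull From/Date/Subject from raw message bytes without strict validation."""
    msg = email.message_from_bytes(raw, policy=email.policy.compat32)
    info = {}
    for field in ("From", "Date", "Subject"):
        try:
            value = msg[field]
            # compat32 keeps folded line breaks in the value; collapse them
            # so the result fits on one log line.
            info[field] = "" if value is None else " ".join(str(value).split())
        except Exception:
            # One malformed header must not hide the other fields.
            info[field] = f"<invalid {field} header>"
    return info
```

With a folded header such as `Subject: hello\r\n world`, this returns `"hello world"` for the subject and leaves the other fields unaffected.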

Note: The planned refactoring item "Modernise email.policy" (switch to email.policy.default) has been cancelled — compat32 is required for compatibility with real-world mail.

15. OpenSSL SMIME_read_ASN1_ex:no content type on Older Messages

Problem: Some older S/MIME encrypted messages (observed on 2012-era emails from pragmaticbookshelf.com) fail decryption with:

openssl cms -decrypt failed: Error reading SMIME Content Info
error:068000D1:asn1 encoding routines:SMIME_read_ASN1_ex:no content type:crypto/asn1/asn_mime.c:422:

OpenSSL's SMIME reader (SMIME_read_ASN1_ex) fails to parse the Content-Type header from the full RFC822 message. This can be caused by transport headers (long Received chains, unusual folding) confusing the parser, or by older S/MIME implementations using non-standard MIME formatting. The messages display fine in Thunderbird because Thunderbird extracts the PKCS7 payload directly rather than relying on OpenSSL's SMIME parser.

Fix: decrypt_smime_message() now uses a three-strategy fallback:

  1. Full message as SMIME (-inform SMIME): Original behaviour, works for most messages.
  2. Minimal SMIME wrapper (_build_minimal_smime()): Strips all transport/envelope headers (Received, Return-Path, DKIM, etc.) and keeps only MIME-Version, Content-Type, Content-Transfer-Encoding, and Content-Disposition — the only headers OpenSSL's SMIME reader needs. This fixes cases where extra headers confuse SMIME_read_ASN1_ex.
  3. Raw DER payload (_extract_pkcs7_der()): Extracts the PKCS7 binary payload by parsing the email with Python's email module (get_payload(decode=True) handles base64 decoding), then passes the raw DER bytes to openssl cms -decrypt -inform DER. This bypasses OpenSSL's MIME parsing entirely, similar to how Thunderbird handles it.

The fallback only triggers on "content type" / "no content" errors. Other errors (wrong key, bad decrypt, corrupted data) propagate immediately without attempting fallback strategies. The shared _run_openssl_decrypt() helper eliminates code duplication across all three strategies.
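The two fallback inputs can be sketched without invoking openssl at all. The functions below are illustrative stand-ins for `_build_minimal_smime()` and `_extract_pkcs7_der()`, whose real bodies are not shown in these notes; `should_try_fallback()` is a hypothetical name for the error check, and its substring test is an assumption based on the "content type" / "no content" wording above.

```python
import email

_KEEP = (b"mime-version", b"content-type",
         b"content-transfer-encoding", b"content-disposition")


def build_minimal_smime(raw: bytes) -> bytes:
    """Keep only the MIME headers (with their folded continuation lines)."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        head, sep, body = raw.partition(b"\n\n")
    kept, keeping = [], False
    for line in head.splitlines():
        if line[:1] in (b" ", b"\t"):      # continuation of the previous header
            if keeping:
                kept.append(line)
            continue
        keeping = line.split(b":", 1)[0].strip().lower() in _KEEP
        if keeping:
            kept.append(line)
    return b"\r\n".join(kept) + b"\r\n\r\n" + body


def extract_pkcs7_der(raw: bytes) -> bytes:
    """Decode the transfer-encoded body into raw DER bytes."""
    payload = email.message_from_bytes(raw).get_payload(decode=True)
    if payload is None:
        raise ValueError("message has no single-part payload to decode")
    return payload


def should_try_fallback(stderr: str) -> bool:
    """Only MIME-parsing failures justify retrying with another input form."""
    text = stderr.lower()
    return "content type" in text or "no content" in text
```

The minimal wrapper would be fed to `openssl cms -decrypt -inform SMIME` and the DER bytes to `openssl cms -decrypt -inform DER`, in that order, and only when `should_try_fallback()` accepts the error from the previous attempt.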

Dovecot Configuration Changes

The following changes to dovecot.conf are required for the decryption tool to work efficiently:

# Move index/control files to container-native filesystem (issue #7)
mail_index_path = /tmp/dovecot-index/%{user | lower}
mail_control_path = /tmp/dovecot-control/%{user | lower}

# Disable FTS auto-indexing (issue #8) — must be AFTER !include_try
fts_autoindex = no

After making these changes, restart Dovecot: docker compose restart dovecot

Refactoring to smime/ Package

Motivation: The single-file decrypt-smime.py grew to ~1170 lines, making it difficult to add parallelism and reason about individual concerns. Folders with thousands of messages need parallel decryption, so the architecture cleanly separates IMAP I/O from CPU/subprocess-bound work.

New structure:

| File | Lines | Responsibility |
| --- | --- | --- |
| decrypt-smime.py | ~490 | Entry point: signal handling, folder-level parallelism, progress ticker, summary |
| smime/cli.py | ~60 | argparse definitions including --workers and --connections |
| smime/imap.py | ~150 | All imapclient interaction: connect, login, folder ops, flag utilities, batch operations |
| smime/crypto.py | ~450 | Key loading, S/MIME detection, openssl cms decryption (with SMIME/DER fallback), message reconstruction (thread-safe) |
| smime/processor.py | ~760 | Folder scanning, sequential and pipeline-parallel processing, IMAP replace/move, global decrypted counter |

Parallelism Architecture

Thread safety design:

  • smime/crypto.py functions are thread-safe (no IMAP I/O, only openssl subprocesses and in-memory operations)
  • smime/imap.py functions are NOT thread-safe (all use single imapclient connection)
  • Each parallel folder gets its own pair of IMAPClient instances (reader + writer)

Two-level parallelism:

  1. --connections N — folder-level parallelism: N folders processed simultaneously, each on its own pair of IMAP connections. Safe because Dovecot dotlocks are per-folder, so different folders have independent locks.

    • Folders submitted incrementally (not all at once) so Ctrl-C stops new submissions immediately
    • Completed futures batch-drained to keep pool saturated and active-folder list accurate
    • Background ticker thread prints aggregate throughput every 3 seconds
  2. --workers N — within each folder, a dual-connection pipeline separates read and write I/O:

    • Reader (connection 1, readonly SELECT): continuously FETCHes messages → submits to ThreadPoolExecutor for decryption. Never blocks on write operations.
    • Workers (thread pool): up to N openssl cms -decrypt subprocesses run concurrently.
    • Writer (connection 2, dedicated thread): consumes completed decryptions from a queue → APPENDs each decrypted message → batch-STOREs \Deleted on original UIDs every 10 messages to amortise SELECT/UNSELECT overhead.
    • Memory bounded to ~workers + batch_size full messages per folder.
    • In --dryrun mode, no writer connection is opened — falls back to single-connection parallel path.

Both levels can be combined: --connections 5 --workers 32 runs 5 folders in parallel, each with 2 IMAP connections and 32 decrypt workers.
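The reader / workers / writer split within one folder can be sketched with plain callables standing in for the IMAP and openssl steps. Everything here is illustrative: `run_pipeline()` and its parameters are hypothetical, the real writer also has to SELECT/UNSELECT around each batch, and failed decryptions are simply dropped in this sketch rather than routed to failure handling.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel: no more results will arrive


def run_pipeline(uids, fetch, decrypt, append, store_deleted,
                 workers=4, batch_size=10):
    # Bounded queue plus semaphore keep roughly workers + batch_size
    # messages in memory at any time.
    results = queue.Queue(maxsize=batch_size)
    slots = threading.BoundedSemaphore(workers)

    def work(uid, raw):
        try:
            results.put((uid, decrypt(raw)))
        finally:
            slots.release()

    def writer():
        batch = []
        while (item := results.get()) is not _DONE:
            uid, plaintext = item
            append(plaintext)            # APPEND the decrypted copy
            batch.append(uid)
            if len(batch) >= batch_size:
                store_deleted(batch)     # one STORE for the whole batch
                batch = []
        if batch:
            store_deleted(batch)         # flush the final partial batch

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for uid in uids:             # reader loop: only ever FETCHes
                slots.acquire()
                pool.submit(work, uid, fetch(uid))
    finally:
        results.put(_DONE)               # pool has drained by this point
        writer_thread.join()
```

With 25 messages and `batch_size=10`, the writer issues three STORE calls covering 10, 10 and 5 UIDs, while the reader never waits on an APPEND.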

Throughput metrics:

  • Background ticker every 3s with per-folder progress: ⏱ 253 decrypted, 9s elapsed, 28.1 msg/s [Archives/2012 50/126, Sent 200/10721]
  • Per-folder scan result (immediate): Drafts: 112 messages, 0 encrypted or Sent: 11487 messages, 10721 encrypted
  • Per-folder decrypt result (on completion): Sent: 10721 decrypted, 33.5 msg/s
  • Summary per-folder breakdown shows all processed folders with total + encrypted counts

Performance observed (Dovecot 2.4.2 on Docker with VirtioFS, Mac mini M4):

| Configuration | Rate | Notes |
| --- | --- | --- |
| --workers 1 (sequential) | ~4.3 msg/s | Baseline, single connection |
| --workers 10 | ~10 msg/s | 2.3× speedup (single-connection pipeline) |
| --workers 32 | ~32 msg/s | 7.4× speedup (single-connection pipeline) |
| --workers 32 (dual-conn pipeline) | ~67 msg/s | 15.6× speedup, reader never blocked |
| --connections 5 --workers 32 | ~47 msg/s | Folder-level parallelism (pre-pipeline) |

The previous bottleneck at higher worker counts was the sequential IMAP replace phase on the same connection (UNSELECT → APPEND → SELECT → STORE per message, ~30-230ms each). The dual-connection pipeline eliminates this by running FETCH on a readonly reader connection while a dedicated writer thread handles APPEND + batch STORE on a separate connection. The writer batches \Deleted flags across 10 messages, reducing SELECT/UNSELECT cycles by 10×.

Global Decrypted Counter

A thread-safe global counter in smime/processor.py (_global_decrypted with threading.Lock) is incremented at every successful decryption across all connections. This powers the background ticker in decrypt-smime.py without requiring cross-thread communication of per-folder results.

Functions: _increment_global_decrypted(), get_global_decrypted(), reset_global_decrypted().

Background Progress Ticker

When --connections > 1, a daemon thread _progress_ticker() prints aggregate throughput every 3 seconds with per-folder progress:

⏱ 253 decrypted, 9s elapsed, 28.1 msg/s  [Archives/2012 50/126, Sent 200/10721]

Each active folder shows decrypted/total so you can see individual folder progress and identify which folders are making headway. The active folder dict is tracked via _active_folders with callbacks:

  • on_decrypt_start — adds folder with encrypted count when decryption begins
  • on_message_decrypted — increments per-folder decrypted counter after each successful decrypt
  • _remove_active_folder() in the finally block removes the folder when done

The ticker is started before the folder pool and stopped in a finally block via _progress_stop threading Event. It uses _print_lock for thread-safe output.

Completed Refactoring

See plans/refactor-smime-plan.md for the original phased implementation plan.

Motivation

The original single-file decrypt-smime.py grew to ~1170 lines with significant duplication, manual IMAP response parsing via imaplib, and ad-hoc data structures. The refactoring simplified the codebase using imapclient, functional programming idioms, and Python standard library features.

Summary of Completed Changes

✅ Migrate from imaplib to imapclient

The single biggest simplification. imapclient eliminated ~150 lines of manual IMAP response parsing:

  • parse_list_response() → replaced by client.list_folders()
  • decode_modified_utf7() → handled transparently by imapclient
  • extract_flags_from_fetch(), extract_uid_from_fetch(), extract_internaldate_from_fetch() → FETCH returns pre-parsed dicts with typed values
  • format_imap_flags() → imapclient.append() accepts flag lists natively
  • Folder quoting workarounds → imapclient quotes correctly (resolved Known Issues #2, #3, #13)

✅ Introduce MessageRecord dataclass

Replaced ad-hoc dicts with MessageRecord @dataclass. Provides IDE autocompletion, eliminates dict-key typo risks, and formalises the label field.
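For illustration only, such a record could look like the sketch below. The notes confirm the class name and the existence of a label field; every other field name and type here is a guess and should not be read as the actual definition.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass
class MessageRecord:
    """One message found during a folder scan (hypothetical field set)."""
    uid: int
    flags: Tuple[bytes, ...] = ()
    internaldate: Optional[datetime] = None
    label: str = ""  # human-readable identification used in log lines
```

Compared with a dict, a misspelled attribute such as `record.uidd` fails immediately with `AttributeError` instead of surfacing later as a missing key.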

✅ Functional pattern refactors

| Pattern | Location | Result |
| --- | --- | --- |
| filter+map | scan_folder() | filter(None, map(_parse_item, data)) replaces while-loop |
| List comprehensions | filter_encrypted() | Two comprehensions replace manual loop+counter |
| itertools.chain | reconstruct_message() | Three header assembly loops collapsed into one chain |
| Precomputed frozenset | _ENVELOPE_LOWER, _OVERRIDE_LOWER | O(1) set lookup replaces O(n) list scan |
| TemporaryDirectory | decrypt_smime_message() | Auto-cleaned temp dir replaces manual lifecycle |
| Dict comprehension | reconstruct_message() override_map | Walrus operator dict comprehension |
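Two of these patterns are easy to show in isolation. The snippet is a sketch: `_parse_item` and `_ENVELOPE_LOWER` are names from the table, but their contents here (a UID parser and a three-header set) and the helpers `parse_uids()` and `is_envelope_header()` are invented for the example.

```python
from typing import Iterable, List, Optional

# Illustrative subset; the real header list is not given in these notes.
_ENVELOPE_HEADERS = ("Received", "Return-Path", "DKIM-Signature")
_ENVELOPE_LOWER = frozenset(h.lower() for h in _ENVELOPE_HEADERS)


def is_envelope_header(name: str) -> bool:
    """O(1) membership test against the precomputed lowercase set."""
    return name.lower() in _ENVELOPE_LOWER


def _parse_item(item: str) -> Optional[int]:
    """Return the UID parsed from `item`, or None when it is not a number."""
    item = item.strip()
    return int(item) if item.isdigit() else None


def parse_uids(data: Iterable[str]) -> List[int]:
    # filter(None, ...) drops every falsy result: None for unparsable items,
    # and also 0, which is never a valid IMAP UID.
    return list(filter(None, map(_parse_item, data)))
```

One caveat of `filter(None, ...)` is that it removes all falsy values, so it only suits parsers whose valid results are always truthy.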

✅ DRY refactors in orchestration

| Pattern | Location | Result |
| --- | --- | --- |
| Shared error handler | _handle_message_outcome() | Extracted ~60 lines of shared decision tree |
| Shared clean_flags() | clean_flags() | Extracted to smime/imap.py utility |
| _submit_next() helper | _submit_next() | Three identical copies → one function |
| _accumulate() helper | _accumulate() | Two identical result-accumulation blocks → one function |

Modernise email.policy

The plan was to switch from email.policy.compat32 to email.policy.default in smime/crypto.py for cleaner header access.

Cancelled — see Known Issue #14. email.policy.default enforces strict RFC 5322 validation on address headers, which fails on real-world messages containing CR/LF in folded headers. The compat32 policy is required for compatibility.