Commit 7994c64
Add improvement recommendations and roadmap for czkawka ecosystem
Verified task list covering czkawka-core, czkawka-cli, and kalka, grouped into quick wins, medium, and deep refactors with effort estimates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 937a639 commit 7994c64

1 file changed

Lines changed: 71 additions & 0 deletions

IMPROVEMENTS_RECOMMENDATIONS.md

@@ -0,0 +1,71 @@
# Improvements for `czkawka-core`, `czkawka-cli`, and `kalka`

Verified against the current source. Items marked ~~strikethrough~~ were previously listed but confirmed not to be issues.

## Summary

The three highest-value improvements are:
1. Fix Kalka scan lifecycle bugs (stop flow, stderr handling).
2. Define a stable CLI JSON results contract.
3. Add cross-crate integration tests for that contract.

## `czkawka-core`

- ~258 instances of `.unwrap()`/`panic!`/`.expect()` in non-test code, some in user-facing algorithm paths (`similar_images/core.rs:258,401,412,425`). Convert the recoverable ones to typed errors.
- Explicit `TODO` at `similar_images/core.rs:523`: reference-folder verification is not trusted.
- Newer features (fuzzy-name matching, `--no-self-compare`, similar documents, reference paths) lack test coverage.

## `czkawka-cli`

- The per-command dispatcher in `main.rs` duplicates setup, save, exit, and progress wiring for every tool.
- No integration tests cover subcommand argument parsing, exit codes, or JSON output.
- Warnings and diagnostics are logged but not included in machine-readable output.

## `kalka`

- **Stop-scan bug** (`backend.py:456-465`): `_check_stop_cleanup()` hardcodes `finished.emit(ActiveTab.DUPLICATE_FILES, [])` regardless of the active tool, which corrupts UI state.
- **Lost stderr**: non-JSON stderr lines from czkawka_cli are silently dropped (`backend.py:153-156`). Permission errors, skipped files, and similar diagnostics never reach the user.
- **Missing Similar Documents tab**: the CLI has `similar-docs`, but `models.py:ActiveTab` has no matching entry.
- **`QTreeWidget` scaling**: `results_view.py` rebuilds the entire tree on every `set_results()`. Batch insert plus signal blocking keeps this workable today, but a full rebuild won't scale to large result sets.
- **Scan-state flags**: `AppState` uses loose booleans (`scanning`, `processing`, `stop_requested`) instead of a state machine.
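
To illustrate the scan-state point, here is a minimal sketch of an enum-based lifecycle. The names `ScanState` and `ScanLifecycle` are hypothetical and do not exist in Kalka. The direct `SCANNING -> IDLE` transition for a scan that finishes on its own is an assumption, and the `processing` flag is deliberately left out.

```python
from enum import Enum, auto


class ScanState(Enum):
    """Hypothetical replacement for the loose scan booleans."""
    IDLE = auto()
    SCANNING = auto()
    STOPPING = auto()


# Allowed transitions: idle -> scanning -> stopping -> idle.
# scanning -> idle (normal completion) is an assumption of this sketch.
_ALLOWED = {
    ScanState.IDLE: {ScanState.SCANNING},
    ScanState.SCANNING: {ScanState.STOPPING, ScanState.IDLE},
    ScanState.STOPPING: {ScanState.IDLE},
}


class ScanLifecycle:
    """Holds the current state and rejects transitions not listed above."""

    def __init__(self):
        self.state = ScanState.IDLE

    def transition(self, new_state):
        if new_state not in _ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
```

With a single state value, combinations such as "stop requested while idle" raise an error instead of silently setting a flag.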
31+
32+
### Not an issue
33+
34+
- ~~CLI argument construction for repeated values.~~ Comma-joining paths works correctly — `clap` parses comma-separated values into `Vec<PathBuf>`.
35+
36+
---
37+
38+
## Roadmap
39+
40+
| Phase | Focus | Effort | Depends on |
41+
|-------|-------|--------|------------|
42+
| 1 | Stabilize Kalka lifecycle | 1–2 days ||
43+
| 2 | Stable CLI JSON contract | 3–5 days ||
44+
| 3 | Integration tests for CLI contract | 2–3 days | Phase 2 |
45+
| 4 | Core runtime robustness | 3–5 days ||
46+
| 5 | Kalka scalability (QTreeView, new tabs) | 5–8 days | Phase 2 |
47+
48+
---
49+
50+
## Task List
51+
52+
### Quick wins (< 1 day each)
53+
54+
- [ ] **Fix stop-scan cleanup** — store the active tab at scan start, use it in `_check_stop_cleanup()` instead of hardcoded `DUPLICATE_FILES`. (`kalka/app/backend.py`)
55+
- [ ] **Surface stderr diagnostics** — collect non-JSON stderr lines during scan, display them in the bottom panel on completion. (`kalka/app/backend.py`)
56+
- [ ] **Add Similar Documents to Kalka** — new `ActiveTab` entry, `TAB_TO_CLI_COMMAND` mapping, column definitions. (`kalka/app/models.py`, `left_panel.py`)
57+
- [ ] **Resolve reference-folder TODO** — investigate and either fix or document the limitation at `similar_images/core.rs:523`.
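
The first two quick wins could share one per-scan object, sketched below. `ScanSession`, `handle_stderr_line`, and `stop_payload` are hypothetical names, and the `ActiveTab` enum here is a two-member stand-in for the real one in `models.py`. The sketch assumes only that stderr carries a mix of JSON and plain-text lines, as described above; what the JSON lines contain is not modeled.

```python
import json
from dataclasses import dataclass, field
from enum import Enum, auto


class ActiveTab(Enum):
    """Stand-in for kalka's real ActiveTab enum (assumption)."""
    DUPLICATE_FILES = auto()
    SIMILAR_IMAGES = auto()


@dataclass
class ScanSession:
    """State captured once when a scan starts (hypothetical helper)."""
    tab: ActiveTab
    diagnostics: list = field(default_factory=list)

    def handle_stderr_line(self, line):
        """Return the parsed object for JSON lines; keep all other
        non-empty lines as diagnostics instead of dropping them."""
        text = line.strip()
        if not text:
            return None
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            self.diagnostics.append(text)
            return None
        if isinstance(parsed, dict):
            return parsed
        # Valid JSON that is not an object (e.g. a bare number) is
        # treated as plain text in this sketch.
        self.diagnostics.append(text)
        return None

    def stop_payload(self):
        """Arguments for the finished signal after a stop: the tab the
        scan was started from, with an empty result list."""
        return self.tab, []
```

On stop, the backend would emit `finished` with `session.stop_payload()` rather than a fixed tab, and pass `session.diagnostics` to the bottom panel.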
58+
59+
### Medium (1–3 days each)
60+
61+
- [ ] **Stable JSON results envelope** — add `schema_version`, `tool_type`, `messages` wrapper to CLI output. Optional `--json-results-stdout` mode. (`czkawka_cli/src/main.rs`)
62+
- [ ] **CLI integration tests** — per-subcommand tests covering argument parsing, exit codes, and JSON output shape. (`czkawka_cli/tests/`)
63+
- [ ] **Scan state machine** — replace `scanning`/`processing`/`stop_requested` booleans in `AppState` with an enum-based state machine (idle → scanning → stopping → idle). (`kalka/app/state.py`)
64+
- [ ] **Strongly type settings** — convert `excluded_items`, `allowed_extensions`, `excluded_extensions` from raw strings to lists at the settings layer. (`kalka/app/models.py`, `backend.py`)
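
For the settings item, a sketch of parsing once at the settings layer might look like the following. `ScanSettings`, `from_raw`, and `split_setting` are hypothetical names, and the comma delimiter is an assumption about how the raw strings are stored today.

```python
from dataclasses import dataclass, field


def split_setting(raw, sep=","):
    """Split a raw delimited setting into trimmed, non-empty items.
    The comma separator is an assumption of this sketch."""
    return [item.strip() for item in raw.split(sep) if item.strip()]


@dataclass
class ScanSettings:
    """Hypothetical typed view of the three list-like settings."""
    excluded_items: list = field(default_factory=list)
    allowed_extensions: list = field(default_factory=list)
    excluded_extensions: list = field(default_factory=list)

    @classmethod
    def from_raw(cls, excluded_items="", allowed_extensions="",
                 excluded_extensions=""):
        # Parse once here so downstream code never re-splits strings.
        return cls(
            split_setting(excluded_items),
            split_setting(allowed_extensions),
            split_setting(excluded_extensions),
        )
```

Code that builds CLI arguments would then join the lists at the last moment, so stray whitespace and empty entries are handled in one place.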
65+
66+
### Deep refactors (5+ days each)
67+
68+
- [ ] **Replace `QTreeWidget` with `QTreeView` + model** — implement `QAbstractItemModel` for results, move sorting/selection into model layer. (`kalka/app/results_view.py`)
69+
- [ ] **Reduce CLI command dispatch duplication** — extract a trait/runner abstraction so tool setup, saving, exit handling, and progress wiring are defined once. (`czkawka_cli/src/main.rs`)
70+
- [ ] **Audit and reduce runtime panics** — systematic pass over ~258 `unwrap`/`panic`/`expect` sites in `czkawka_core`, converting user-triggerable ones to typed errors. (`czkawka_core/src/`)
71+
- [ ] **Standardize result metadata** — common serialized envelope across all core tools so consumers don't infer shape per-tool. (`czkawka_core/src/`)
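
From the consumer side, a common envelope might be read as in the sketch below. The `schema_version`, `tool_type`, and `messages` fields come from the task list above. The `results` field, the version number `1`, and the `ResultsEnvelope` and `parse_envelope` names are assumptions for illustration, not an existing czkawka format.

```python
import json
from dataclasses import dataclass

# Versions this hypothetical consumer understands (assumption).
SUPPORTED_SCHEMA_VERSIONS = (1,)

REQUIRED_FIELDS = {"schema_version", "tool_type", "messages", "results"}


@dataclass
class ResultsEnvelope:
    schema_version: int
    tool_type: str
    messages: list
    results: list


def parse_envelope(text):
    """Validate the envelope before touching tool-specific results."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("envelope must be a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"envelope is missing fields: {sorted(missing)}")
    if data["schema_version"] not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(
            f"unsupported schema_version: {data['schema_version']}"
        )
    return ResultsEnvelope(
        data["schema_version"],
        data["tool_type"],
        data["messages"],
        data["results"],
    )
```

Checking `schema_version` first lets a consumer such as Kalka fail with a clear message on a format change instead of misreading per-tool fields.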
