| 1 | +--- |
| 2 | +name: ewfinfo parity port |
| 3 | +overview: Implement a Rust `ewfinfo` CLI (clap) + supporting library report/printer APIs in `crates/ewf` that match libewf’s `ewfinfo` image-metadata behavior/output (text + DFXML), with explicit TODO/`unimplemented` for any unsupported surface area (no silent fallbacks). Keep logical file outputs (`-F`/`-H`/`-B`) in the `ewfinfo` binary target (not the library); use `miette` for application-facing diagnostics while keeping library errors in `thiserror`. |
| 4 | +todos: |
| 5 | + - id: ewfinfo-api |
| 6 | + content: Add documented `crates/ewf::ewfinfo` library module for image-metadata reports + printers (no LEF/-F/-H/-B). |
| 7 | + status: completed |
| 8 | + - id: docs-and-unit-tests |
| 9 | + content: Add module docs + rustdoc examples + unit tests for every new public API (include “References” with upstream source file paths). |
| 10 | + status: completed |
| 11 | + - id: ewf1-metadata |
| 12 | + content: Implement EWF1 metadata extraction for header values, media/ewf info, digest hashes, sessions/tracks, and acquisition errors. |
| 13 | + status: completed |
| 14 | + - id: ewf2-metadata |
| 15 | + content: Implement EWF2 metadata extraction (device/case tags, set-id, compression, md5/sha1 sections, etc.). |
| 16 | + status: completed |
| 17 | + - id: ewfinfo-logical-cli |
| 18 | + content: Implement `ewfinfo` CLI-only logical evidence outputs (`-F`/`-H`/`-B`) in the `ewfinfo` binary target (may extend `LefReader` public API minimally, but keep formatting + bodyfile semantics out of the library). |
| 19 | + status: completed |
| 20 | + - id: printers |
| 21 | + content: Implement text + DFXML printers for the image-metadata report that match libewf `ewfinfo` formatting. |
| 22 | + status: completed |
| 23 | + - id: cli-ewfinfo |
| 24 | + content: Add `ewfinfo` binary target using clap for libewf-compatible flags/conflicts and miette for user-facing errors (binary can be multi-file). |
| 25 | + status: completed |
| 26 | + - id: golden-tests |
| 27 | + content: Add golden-output tests for text/dfxml/file-entry/hierarchy/bodyfile, plus TODO/unimplemented tests for unsupported paths. |
| 28 | + status: completed |
| 29 | +--- |
| 30 | + |
| 31 | +# Port libewf `ewfinfo` to `crates/ewf` |
| 32 | + |
| 33 | +## Goal |
| 34 | + |
| 35 | +- Add a **new `ewfinfo` Rust binary** (in `crates/ewf`) and the supporting **public library APIs** so we can reproduce libewf `ewfinfo` feature-for-feature: |
| 36 | +- Options: `-A -B -d -e -f -F -H -i -m -s -v -V -h` (per `external/libewf/manuals/ewfinfo.1` + `external/libewf/ewftools/ewfinfo.c`). |
| 37 | +- Output: **text** and **DFXML** with the same section structure + formatting. |
| 38 | +- **No best-effort fallbacks**: if something isn’t implemented in Rust, we leave an explicit `TODO:` and either panic via `unimplemented!()` or return `Error::Unsupported("TODO: ...")` rather than silently degrading. |
| 39 | + |
| 40 | +## Approach (map C ewfinfo → Rust) |
| 41 | + |
| 42 | +### 1) Create a reusable ewfinfo library module (image metadata only) |
| 43 | + |
| 44 | +- Add a new **library** module under `crates/ewf/src/ewfinfo/` that provides a Rust-native “report + printer” API for **image metadata only**. |
| 45 | +- **Do not** 1:1 port or “mirror” libewf’s `info_handle_t`. The `ewfinfo` **binary target** should own the clap-facing types and translate them into a small, strongly-typed library API. |
| 46 | +- Keep the boundary sharp: |
| 47 | +- **Library (`crates/ewf`)**: build a structured report for EWF *image* metadata + print it (text/DFXML). |
| 48 | +- **Binary target (`ewfinfo`)**: owns **logical evidence** modes and outputs (`-F`/`-H`/`-B`), path separator handling, and any bodyfile semantics. |
| 49 | +- Proposed (public) library surface (names TBD; document every `pub` item): |
| 50 | +- `EwfInfoReport`: data model for the sections libewf prints for images: |
| 51 | +- Acquisition/header values (libewf title: “Acquiry information”) |
| 52 | +- EWF information |
| 53 | +- Media information |
| 54 | +- Digest hash information |
| 55 | +- Sessions / Tracks |
| 56 | +- Acquisition read errors |
| 57 | +- `EwfInfoPrinter` (trait) + concrete printers (e.g. `TextPrinter`, `DfxmlPrinter`) with `EwfInfoPrintOptions` (date formatting, verbosity, etc.) |
| 58 | +- `EwfInfoBuildOptions` for report construction knobs that actually affect parsing/normalization (e.g. header decoding/codepage), **not** CLI-only options like `-s`/`-B`. |
| 59 | +- `EwfInfoError` (library) implemented with `thiserror`. |
| 60 | +- **Module documentation requirements** (non-negotiable): |
| 61 | +- Each new module gets `//!` docs with a short compatibility statement and a “References” section that attributes upstream reference material by file path (at minimum): |
| 62 | +- `external/libewf/ewftools/info_handle.h` |
| 63 | +- `external/libewf/ewftools/ewfinfo.c` |
| 64 | +- `external/libewf/manuals/ewfinfo.1` |
| 65 | +- Include rustdoc examples (doctests) that exercise the public API surface (using existing small fixtures/builders). |
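
The library surface proposed in this section could look roughly like the sketch below. It is a minimal illustration, not the final API: the type names come from the plan, but every field, the `print` signature, and the choice to reject acquisition errors are assumptions made for the example. `Display`/`Error` are written by hand only to keep the sketch free of dependencies; the plan derives them with `thiserror`.

```rust
use std::fmt;
use std::io::{self, Write};

/// Library-side error (hand-written here; `thiserror` in the real crate).
#[derive(Debug)]
pub enum EwfInfoError {
    /// Explicit marker for surface area that has not been ported yet.
    Unsupported(&'static str),
    /// Failure while writing printer output.
    Io(io::Error),
}

impl fmt::Display for EwfInfoError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Unsupported(what) => write!(f, "unsupported: {what}"),
            Self::Io(err) => write!(f, "I/O error: {err}"),
        }
    }
}

impl std::error::Error for EwfInfoError {}

impl From<io::Error> for EwfInfoError {
    fn from(err: io::Error) -> Self {
        Self::Io(err)
    }
}

/// Hypothetical, heavily reduced report covering two of the listed sections.
#[derive(Debug, Default)]
pub struct EwfInfoReport {
    /// (label, value) pairs for the acquisition/header section.
    pub header_values: Vec<(String, String)>,
    /// Acquisition read errors as (start_sector, sector_count).
    pub acquisition_errors: Vec<(u64, u64)>,
}

/// Printer-only options (the `verbose` field is an assumption).
#[derive(Debug, Default)]
pub struct EwfInfoPrintOptions {
    pub verbose: bool,
}

pub trait EwfInfoPrinter {
    fn print(
        &self,
        report: &EwfInfoReport,
        opts: &EwfInfoPrintOptions,
        out: &mut dyn Write,
    ) -> Result<(), EwfInfoError>;
}

pub struct TextPrinter;

impl EwfInfoPrinter for TextPrinter {
    fn print(
        &self,
        report: &EwfInfoReport,
        _opts: &EwfInfoPrintOptions,
        out: &mut dyn Write,
    ) -> Result<(), EwfInfoError> {
        // Reject early instead of silently dropping a section this sketch
        // does not know how to print.
        if !report.acquisition_errors.is_empty() {
            return Err(EwfInfoError::Unsupported(
                "TODO: print acquisition read errors",
            ));
        }
        writeln!(out, "Acquiry information")?;
        for (label, value) in &report.header_values {
            writeln!(out, "\t{label}: {value}")?;
        }
        Ok(())
    }
}
```

Passing the sink as `&mut dyn Write` lets unit tests print into a `Vec<u8>` while the binary passes a locked stdout.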
| 66 | + |
| 67 | +### 2) Extend readers to expose the metadata ewfinfo prints |
| 68 | + |
| 69 | +Keep the existing small summary API (`EwfInfo` in [`crates/ewf/src/info.rs`](crates/ewf/src/info.rs)) stable; add *new* APIs instead. |
| 70 | + |
| 71 | +#### Disk images (`EwfReader`) |
| 72 | + |
| 73 | +- Add `EwfReader::ewfinfo_report(&self, opts: &EwfInfoBuildOptions) -> Result<EwfInfoReport, EwfInfoError>`. |
| 74 | +- Implement format-specific extraction: |
| 75 | +- **EWF1 (E01/S01)**: parse required sections from the already-discovered section descriptors (header/header2/volume/disk/data/hash/digest/error/session/track). |
| 76 | +- Header values: parse both `header` (ASCII/codepage) and `header2` (UTF-16LE) and construct the same identifier→description mapping used by `info_handle_header_values_fprint`. |
| 77 | +- EWF + media info fields: derive from parsed volume/disk/data structures (`sectors_per_chunk`, `bytes_per_sector`, `number_of_sectors`, `error_granularity`, `set_identifier`, compression level/method). |
| 78 | +- Hash values: read stored global hashes from digest/hash sections (no recomputation unless ewfinfo does so). |
| 79 | +- Sessions/tracks: parse ranges as start_sector/sector_count. |
| 80 | +- Acquisition errors: parse ranges as start_sector/sector_count. |
| 81 | +- **EWF2 (Ex01)**: reuse existing parsing in [`crates/ewf/src/reader.rs`](crates/ewf/src/reader.rs) (case data/device information tags) to populate the same report fields: |
| 82 | +- `set_id`, `compression_method`, `chunk_count`, `sectors_per_chunk`, `bytes_per_sector`, `number_of_sectors` |
| 83 | +- global MD5/SHA1 sections (parse from section types) to populate digest hash info |
| 84 | +- sessions/tracks/errors: if the format doesn’t carry them, report 0 entries (matching libewf behavior for “none present”). |
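
For the `header2` path above, the UTF-16LE decoding step can be sketched as follows. This is a minimal sketch under stated assumptions: the input is the already-decompressed section payload, a leading byte-order mark may or may not be present, and `decode_utf16le` and `DecodeError` are hypothetical names. It follows the "no silent fallback" rule by rejecting malformed input rather than substituting replacement characters.

```rust
/// Reasons the sketch refuses to decode a buffer.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    /// The buffer length is not a multiple of two bytes.
    OddLength(usize),
    /// The buffer contains an unpaired surrogate.
    InvalidUtf16,
}

/// Decode a UTF-16LE byte buffer into a `String`.
pub fn decode_utf16le(bytes: &[u8]) -> Result<String, DecodeError> {
    if bytes.len() % 2 != 0 {
        return Err(DecodeError::OddLength(bytes.len()));
    }
    // Assemble little-endian code units from byte pairs.
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    // Skip a leading byte-order mark if one is present (assumption).
    let start = if units.first() == Some(&0xFEFF) { 1 } else { 0 };
    // `from_utf16` fails on lone surrogates; `from_utf16_lossy` would hide them.
    String::from_utf16(&units[start..]).map_err(|_| DecodeError::InvalidUtf16)
}
```

The decoded text would then feed the identifier→description mapping; how the tab-separated header rows are split is left to the real parser.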
| 85 | + |
| 86 | +### 3) Implement printers for exact text + DFXML output |
| 87 | + |
| 88 | +- Add printer modules under `crates/ewf/src/ewfinfo/`: |
| 89 | +- Text printer replicating: |
| 90 | +- section headers/footers and indentation |
| 91 | +- field label padding (the C code aligns to 24 columns) |
| 92 | +- exact section titles: “Acquiry information”, “EWF information”, “Media information”, “Digest hash information”, etc. |
| 93 | +- DFXML printer replicating the XML emitted by `info_handle_dfxml_*_fprint` (header/footer + element names). |
| 94 | +- **No fallback behavior**: invalid inputs/options should be rejected early. For the **CLI**, clap should enforce as much as possible (enums, conflicts, defaults). For the **library**, return explicit `EwfInfoError::Unsupported("TODO: …")` where needed rather than silently defaulting. |
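
The label-padding rule for the text printer can be isolated in one helper so that unit tests pin it down independently of any fixture. The sketch below assumes space padding to the 24-column width stated above. Whether the upstream C code pads with spaces or tabs still has to be confirmed against the source, and the golden tests are the final arbiter. `LABEL_WIDTH` and `format_field` are hypothetical names.

```rust
/// Column width for field labels, as stated in this plan; verify against
/// the upstream C source before relying on it.
const LABEL_WIDTH: usize = 24;

/// Format one indented `label: value` line without a trailing newline.
/// Labels longer than the column are never truncated; they simply push
/// the value to the right.
pub fn format_field(label: &str, value: &str) -> String {
    let label_with_colon = format!("{label}:");
    format!("\t{:<width$}{}", label_with_colon, value, width = LABEL_WIDTH)
}
```

Keeping the width in a single constant means a mismatch found by the golden tests is a one-line fix.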
| 95 | + |
| 96 | +### 4) Add the `ewfinfo` binary target (clap + miette) and keep logical outputs there |
| 97 | + |
| 98 | +- Implement `ewfinfo` as a **binary target** that can be split across multiple Rust modules (prefer directory-style bin: `crates/ewf/src/bin/ewfinfo/main.rs` + submodules). |
| 99 | +- Use **clap** to translate libewf flags/idioms into a typed CLI surface (instead of manually porting structs): |
| 100 | +- `#[derive(Parser)]` root + `Args`/`Subcommand` as needed. |
| 101 | +- `ValueEnum` / typed enums for `-f` (text/dfxml), `-d` (date format), etc. |
| 102 | +- conflict groups for `-e`/`-i`/`-m` (mutually exclusive), and for logical modes (`-F` vs `-H` etc.) as required. |
| 103 | +- rely on clap’s generated `--help`/`--version` UX while keeping short flags compatible. |
| 104 | +- Use **miette** for user-facing diagnostics: |
| 105 | +- Map library `thiserror` errors into `miette::Diagnostic` at the application boundary with helpful context (`wrap_err`, filenames, option values). |
| 106 | +- Keep **logical evidence outputs** out of the library: |
| 107 | +- `-F` (file entry detail), `-H` (hierarchy), `-B` (bodyfile) live in the `ewfinfo` binary target. |
| 108 | +- If the binary needs additional LEF accessors, add small, generic `pub` APIs to `LefReader` (document + unit test them), but keep formatting and bodyfile semantics in the binary. |
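
The typed mapping the CLI layer is expected to enforce can be sketched without clap. The block below writes by hand what `ValueEnum` and a conflict group would provide, purely so the example has no dependencies; the real binary should let clap do this. `OutputFormat`, `SectionFilter`, and `section_filter` are hypothetical names, and the variant meanings for `-e`/`-i`/`-m` are assumptions to check against `ewfinfo.1`.

```rust
use std::str::FromStr;

/// Values accepted by `-f`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat {
    Text,
    Dfxml,
}

impl FromStr for OutputFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "text" => Ok(Self::Text),
            "dfxml" => Ok(Self::Dfxml),
            // No fallback to text: unknown values are an error.
            other => Err(format!("unsupported output format: {other}")),
        }
    }
}

/// Which report sections to print (assumed meaning of `-e`/`-i`/`-m`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SectionFilter {
    All,
    ReadErrors,
    Acquiry,
    Media,
}

/// Collapse the three mutually exclusive flags into one typed value.
pub fn section_filter(e: bool, i: bool, m: bool) -> Result<SectionFilter, String> {
    match (e, i, m) {
        (false, false, false) => Ok(SectionFilter::All),
        (true, false, false) => Ok(SectionFilter::ReadErrors),
        (false, true, false) => Ok(SectionFilter::Acquiry),
        (false, false, true) => Ok(SectionFilter::Media),
        _ => Err("-e, -i and -m are mutually exclusive".to_string()),
    }
}
```

Collapsing the flags into a single enum at the boundary means the library never sees an invalid combination.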
| 109 | + |
| 110 | +### 5) Tests + documentation (unit tests first, then golden outputs) |
| 111 | + |
| 112 | +- Add **unit tests** for every new library type/module under `crates/ewf/src/ewfinfo/` (and any new public reader accessors): |
| 113 | +- parsing/normalization invariants |
| 114 | +- section ordering and required fields presence |
| 115 | +- printer formatting invariants (labels, indentation, titles) |
| 116 | +- Add **CLI unit tests** (clap `try_parse_from`) for flag conflicts/defaults and for mapping from CLI types → library options. |
| 117 | +- Add deterministic **golden-output integration tests** in `crates/ewf/tests/` that: |
| 118 | +- generate small synthetic E01/Ex01/L01/Lx01 fixtures using existing writer/test helpers |
| 119 | +- run the Rust `ewfinfo` binary (via `std::process::Command`) and compare stdout to committed golden files for: |
| 120 | +- default text |
| 121 | +- `-f dfxml` |
| 122 | +- `-F` and `-H` |
| 123 | +- `-B` bodyfile output |
| 124 | +- For any feature we haven’t implemented yet (e.g., extended attributes/access control entries if present in real-world files), add a test that asserts we fail with an **explicit TODO/unimplemented** marker. |
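
For the golden-output comparisons, a small helper that reports the first differing line makes failures far easier to read than a raw string inequality. The sketch below is one possible shape; `first_mismatch` is a hypothetical name. It splits on `'\n'` rather than using `str::lines` so that a missing trailing newline or a stray `\r` still counts as a difference, which matters when the goal is exact output parity.

```rust
/// Compare two outputs line by line.
///
/// Returns `None` when they are identical, otherwise the 1-based line
/// number of the first difference plus the actual and expected lines.
/// A side that has run out of lines is reported as `<missing>`.
pub fn first_mismatch(actual: &str, expected: &str) -> Option<(usize, String, String)> {
    let mut actual_lines = actual.split('\n');
    let mut expected_lines = expected.split('\n');
    let mut line_number = 0;
    loop {
        line_number += 1;
        match (actual_lines.next(), expected_lines.next()) {
            // Both sides ended at the same time: outputs are identical.
            (None, None) => return None,
            (a, e) if a == e => continue,
            (a, e) => {
                return Some((
                    line_number,
                    a.unwrap_or("<missing>").to_string(),
                    e.unwrap_or("<missing>").to_string(),
                ))
            }
        }
    }
}
```

An integration test would capture stdout from `std::process::Command`, convert it to a string, and assert that `first_mismatch(&stdout, golden)` is `None`.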
| 125 | + |
| 126 | +## Files most likely to change |
| 127 | + |
| 128 | +- [`crates/ewf/src/lib.rs`](crates/ewf/src/lib.rs) (export new ewfinfo APIs) |
| 129 | +- [`crates/ewf/src/info.rs`](crates/ewf/src/info.rs) (keep as-is; add new full-metadata types elsewhere) |
| 130 | +- [`crates/ewf/src/reader.rs`](crates/ewf/src/reader.rs) (expose/retain parsed metadata needed for ewfinfo) |
| 131 | +- New: [`crates/ewf/src/ewfinfo/mod.rs`](crates/ewf/src/ewfinfo/mod.rs) |
| 132 | +- New: [`crates/ewf/src/ewfinfo/print_text.rs`](crates/ewf/src/ewfinfo/print_text.rs) |
| 133 | +- New: [`crates/ewf/src/ewfinfo/print_dfxml.rs`](crates/ewf/src/ewfinfo/print_dfxml.rs) |
| 134 | +- New (preferred): `crates/ewf/src/bin/ewfinfo/` (binary crate modules, e.g. `main.rs`, `cli.rs`, `image.rs`, `logical.rs`, `bodyfile.rs`) |
| 135 | + |
| 136 | +## Notes / constraints |
| 137 | + |
| 138 | +- We’ll use libewf’s behavior/spec as reference but implement logic natively in Rust; no “silent compatibility” shims. |
| 139 | +- Any missing surface area is left as `TODO:` + explicit `unimplemented`/`Unsupported` error (per your requirement). |
| 140 | +- Error policy: `thiserror` in the library; `miette` at the application boundary for pretty CLI diagnostics. |