Commit d00ecdc

feat: Add CLI tools for ORC file inspection and manipulation (#73)
* feat: Add CLI tools for ORC file inspection and manipulation
* refactor: Consolidate ORC CLI commands into a unified tool

  This commit merges multiple ORC CLI commands into a single command structure, enhancing usability and maintainability. The previous commands for metadata inspection, data export, and statistics have been integrated into a cohesive CLI tool with subcommands for various functionalities, including `info`, `export`, `stats`, `layout`, and `index`. Additionally, the `orc` binary has been streamlined to facilitate easier command execution.
* Enhance CLI testing framework for ORC binary

  This commit expands the testing suite for the unified `orc` CLI binary, adding comprehensive tests for various subcommands including `info`, `export`, `stats`, `layout`, and `index`. It introduces helper functions for managing test data paths and expected output comparisons, ensuring that actual command outputs are validated against predefined expected results. Additionally, new expected output files have been created to support these tests, improving the robustness of the CLI tool's testing framework.
* docs: Add README for ORC CLI Tool with usage instructions and command details
* feat: Add bloom filter inspection command to ORC CLI
* feat: Add test for bloom filter's might_contain functionality and expected output
* refactor: Remove obsolete help output tests for ORC CLI commands
* fix fmt
* fix test_bloom_might_contain_true
* fix: omit to export all
* feat: fix bug and add CSV export test for specific columns
* feat: add validation for unknown columns in export
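The last few entries ("omit to export all", "add validation for unknown columns in export") describe how `export` picks its columns. The source of that change is not shown on this page, so the following is only a minimal sketch of the idea under stated assumptions: `resolve_export_columns` is a hypothetical helper name, the schema is modelled as a plain list of column names, and the error text is invented for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not taken from the commit): decide which columns to export.
/// - No requested columns: export every column in the schema.
/// - Otherwise: every requested name must exist in the schema, or we return an error.
fn resolve_export_columns(schema: &[&str], requested: &[&str]) -> Result<Vec<String>, String> {
    // Assumption: omitting the column list means "export all".
    if requested.is_empty() {
        return Ok(schema.iter().map(|c| c.to_string()).collect());
    }

    // Collect the known names once so each lookup is O(1).
    let known: HashSet<&str> = schema.iter().copied().collect();

    // Gather every unknown name so the user sees all mistakes at once.
    let unknown: Vec<&str> = requested
        .iter()
        .copied()
        .filter(|c| !known.contains(*c))
        .collect();

    if !unknown.is_empty() {
        // The message format is illustrative only.
        return Err(format!("unknown column(s): {}", unknown.join(", ")));
    }

    // Keep the order the user asked for.
    Ok(requested.iter().map(|c| c.to_string()).collect())
}
```

Reporting all unknown names in one error, rather than stopping at the first, is a design choice of this sketch and may differ from what the actual command does.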
1 parent 46de7f0 commit d00ecdc

28 files changed

Lines changed: 2694 additions & 256 deletions

Cargo.toml

Lines changed: 5 additions & 11 deletions
@@ -64,6 +64,8 @@ tokio = { version = "1.28", optional = true, features = [
 # cli
 anyhow = { version = "1.0", optional = true }
 clap = { version = "4.5.4", features = ["derive"], optional = true }
+serde = { version = "1.0", features = ["derive"], optional = true }
+serde_json = { version = "1.0", default-features = false, features = ["std"], optional = true }

 # opendal
 opendal = { version = "0.53", optional = true, default-features = false }
@@ -75,13 +77,12 @@ criterion = { version = "0.5", default-features = false, features = ["async_toki
 opendal = { version = "0.53", default-features = false, features = ["services-memory"] }
 pretty_assertions = "1.3.0"
 proptest = "1.0.0"
-serde_json = { version = "1.0", default-features = false, features = ["std"] }

 [features]
 default = ["async"]

 async = ["async-trait", "futures", "futures-util", "tokio"]
-cli = ["anyhow", "clap"]
+cli = ["anyhow", "clap", "serde", "serde_json"]
 # Enable opendal support.
 opendal = ["dep:opendal"]

@@ -96,13 +97,6 @@ path = "./benches/arrow_reader.rs"
 debug = true

 [[bin]]
-name = "orc-metadata"
-required-features = ["cli"]
-
-[[bin]]
-name = "orc-export"
-required-features = ["cli"]
-
-[[bin]]
-name = "orc-stats"
+name = "orc"
+path = "src/bin/orc/main.rs"
 required-features = ["cli"]
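The hunk above replaces three binaries (`orc-metadata`, `orc-export`, `orc-stats`) with a single `orc` binary whose subcommands are listed in the commit message. The contents of `src/bin/orc/main.rs` are not shown here, and since `clap` with the `derive` feature is part of the `cli` feature, the real tool presumably uses clap for parsing. As a rough, std-only stand-in for the dispatch idea, the sketch below maps the first argument to a subcommand; `Command` and `parse_command` are hypothetical names, and the bloom filter command is left out because its spelling is not given on this page.

```rust
/// Subcommands named in the commit message. The enum itself is hypothetical.
#[derive(Debug, PartialEq)]
enum Command {
    Info,
    Export,
    Stats,
    Layout,
    Index,
}

/// Hypothetical parser: the first argument selects the subcommand,
/// the remaining arguments are passed through untouched.
fn parse_command(args: &[&str]) -> Result<(Command, Vec<String>), String> {
    // An empty argument list means no subcommand was given.
    let (name, rest) = args
        .split_first()
        .ok_or_else(|| "missing subcommand".to_string())?;

    let command = match *name {
        "info" => Command::Info,
        "export" => Command::Export,
        "stats" => Command::Stats,
        "layout" => Command::Layout,
        "index" => Command::Index,
        // Old binary names such as `orc-stats` are not subcommands.
        other => return Err(format!("unknown subcommand: {}", other)),
    };

    Ok((command, rest.iter().map(|s| s.to_string()).collect()))
}
```

With this shape, an invocation like `orc stats file.orc` would resolve to `Command::Stats` with `["file.orc"]` as its remaining arguments, while the removed binary names are rejected.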

src/bin/orc-export.rs

Lines changed: 0 additions & 151 deletions
This file was deleted.

src/bin/orc-metadata.rs

Lines changed: 0 additions & 77 deletions
This file was deleted.
