
Commit 104e82a

blobby: add encode and decode bin utils
1 parent f5834a7 commit 104e82a

7 files changed: 269 additions and 139 deletions

Cargo.lock

Lines changed: 24 additions & 27 deletions

blobby/Cargo.toml

Lines changed: 1 addition & 3 deletions
@@ -9,6 +9,4 @@ repository = "https://github.com/RustCrypto/utils"
 categories = ["no-std"]
 edition = "2024"
 rust-version = "1.85"
-
-[dev-dependencies]
-hex = "0.4"
+readme = "README.md"

blobby/README.md

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
# [RustCrypto]: Blobby

[![crate][crate-image]][crate-link]
[![Docs][docs-image]][docs-link]
[![Build Status][build-image]][build-link]
![Apache2/MIT licensed][license-image]
![Rust Version][rustc-image]
[![Project Chat][chat-image]][chat-link]

Iterators over a simple binary blob storage.

## Examples
```
let buf = b"\x02\x05hello\x06world!\x01\x02 \x00\x03\x06:::\x03\x01\x00";
let mut v = blobby::BlobIterator::new(buf).unwrap();
assert_eq!(v.next(), Some(Ok(&b"hello"[..])));
assert_eq!(v.next(), Some(Ok(&b" "[..])));
assert_eq!(v.next(), Some(Ok(&b""[..])));
assert_eq!(v.next(), Some(Ok(&b"world!"[..])));
assert_eq!(v.next(), Some(Ok(&b":::"[..])));
assert_eq!(v.next(), Some(Ok(&b"world!"[..])));
assert_eq!(v.next(), Some(Ok(&b"hello"[..])));
assert_eq!(v.next(), Some(Ok(&b""[..])));
assert_eq!(v.next(), None);

let mut v = blobby::Blob2Iterator::new(buf).unwrap();
assert_eq!(v.next(), Some(Ok([&b"hello"[..], b" "])));
assert_eq!(v.next(), Some(Ok([&b""[..], b"world!"])));
assert_eq!(v.next(), Some(Ok([&b":::"[..], b"world!"])));
assert_eq!(v.next(), Some(Ok([&b"hello"[..], b""])));
assert_eq!(v.next(), None);

let mut v = blobby::Blob4Iterator::new(buf).unwrap();
assert_eq!(v.next(), Some(Ok([&b"hello"[..], b" ", b"", b"world!"])));
assert_eq!(v.next(), Some(Ok([&b":::"[..], b"world!", b"hello", b""])));
assert_eq!(v.next(), None);
```

## Encoding and decoding

This crate provides encoding and decoding utilities for converting between
the blobby format and a text file with hex-encoded strings.

Let's say we have the following test vectors for a 64-bit hash function:
```text
0123456789ABCDEF0123456789ABCDEF
217777950848CECD

F7CD1446C9161C0A
FFFEFD
80081C35AA43F640

```
The first, third, and fifth lines are hex-encoded hash inputs, while the second,
fourth, and sixth lines are hex-encoded hash outputs for the input on the previous line.
Note that the file should contain a trailing empty line (i.e. every data line should end
with `\n`).
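
As a rough illustration of the text layout described above, the following sketch parses such a file into (input, output) pairs using only the standard library. The names `decode_hex` and `parse_vectors` are hypothetical and are not part of the `blobby` crate or its `encode` utility; rejecting an odd number of lines is likewise an assumption made for this sketch.

```rust
/// Decode one line of hex digits into bytes.
/// An empty line decodes to an empty byte string.
/// (Hypothetical helper, not part of the `blobby` API.)
fn decode_hex(line: &str) -> Result<Vec<u8>, String> {
    let digits = line.as_bytes();
    if digits.len() % 2 != 0 {
        return Err(format!("odd number of hex digits: {line:?}"));
    }
    digits
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16);
            let lo = (pair[1] as char).to_digit(16);
            match (hi, lo) {
                (Some(h), Some(l)) => Ok((h * 16 + l) as u8),
                _ => Err(format!("invalid hex digits in line: {line:?}")),
            }
        })
        .collect()
}

/// Split the text into (input, output) pairs, two lines per test vector.
/// Assumption: a file with an odd number of lines is treated as an error.
fn parse_vectors(text: &str) -> Result<Vec<(Vec<u8>, Vec<u8>)>, String> {
    let lines: Vec<&str> = text.lines().collect();
    if lines.len() % 2 != 0 {
        return Err("expected an even number of lines".to_string());
    }
    let mut vectors = Vec::new();
    for pair in lines.chunks(2) {
        vectors.push((decode_hex(pair[0])?, decode_hex(pair[1])?));
    }
    Ok(vectors)
}
```

For the six data lines shown above, `parse_vectors` would return three pairs, the second of which has an empty input.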

We can encode this file into the Blobby format by running the following command:
```sh
cargo run --release --bin encode -- /path/to/input.txt /path/to/output.blb
```

This creates a file that can then be read using `blobby::Blob2Iterator`.

To see the contents of a Blobby file, you can use the following command:
```sh
cargo run --release --bin decode -- /path/to/input.blb /path/to/output.txt
```
The output file will contain the sequence of hex-encoded byte strings stored
in the input file.

## Storage format

The storage format represents a sequence of binary blobs. It uses the
git-flavored [variable-length quantity][0] (VLQ) encoding for unsigned
numbers.

A file starts with the number of de-duplicated blobs `d`, followed by `d`
entries. Each entry starts with an integer `m`, immediately followed by `m`
bytes representing a de-duplicated binary blob.

Next comes an unspecified number of entries representing the sequence of stored
blobs. Each entry starts with an unsigned integer `n`. The least significant
bit of this integer is used as a flag. If the flag is equal to 0, then the
number is followed by `n >> 1` bytes, representing a stored binary blob.
Otherwise the entry references a de-duplicated entry number `n >> 1`.
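
The layout above can be sketched as a small standalone decoder. This is a minimal illustration, not the crate's implementation: `read_vlq`, `take`, and `decode_blobs` are hypothetical names, and the multi-byte VLQ rule (add one before shifting, as in git's offset encoding) is an assumption about what "git-flavored" means here. Single-byte values, which are all the example buffer in this README uses, do not depend on that assumption.

```rust
/// Read one VLQ-encoded number starting at `*pos` and advance `*pos`.
/// Assumption: continuation bytes follow git's offset encoding,
/// i.e. `val = ((val + 1) << 7) | next_7_bits`.
fn read_vlq(data: &[u8], pos: &mut usize) -> Option<usize> {
    let mut byte = *data.get(*pos)?;
    *pos += 1;
    let mut val = (byte & 0x7F) as usize;
    while byte & 0x80 != 0 {
        byte = *data.get(*pos)?;
        *pos += 1;
        val = val
            .checked_add(1)?
            .checked_mul(128)?
            .checked_add((byte & 0x7F) as usize)?;
    }
    Some(val)
}

/// Borrow `len` bytes starting at `*pos` and advance `*pos`.
fn take<'a>(data: &'a [u8], pos: &mut usize, len: usize) -> Option<&'a [u8]> {
    let end = pos.checked_add(len)?;
    let blob = data.get(*pos..end)?;
    *pos = end;
    Some(blob)
}

/// Decode the whole buffer into the sequence of stored blobs.
/// Returns `None` on truncated data or an out-of-range reference.
fn decode_blobs(data: &[u8]) -> Option<Vec<&[u8]>> {
    let mut pos = 0;

    // Header: `d`, then `d` length-prefixed de-duplicated blobs.
    let d = read_vlq(data, &mut pos)?;
    let mut dedup = Vec::new();
    for _ in 0..d {
        let m = read_vlq(data, &mut pos)?;
        dedup.push(take(data, &mut pos, m)?);
    }

    // Body: entries until the end of the buffer.
    let mut blobs = Vec::new();
    while pos < data.len() {
        let n = read_vlq(data, &mut pos)?;
        let blob = if n & 1 == 0 {
            // Flag 0: `n >> 1` literal bytes follow.
            take(data, &mut pos, n >> 1)?
        } else {
            // Flag 1: reference to de-duplicated entry `n >> 1`.
            *dedup.get(n >> 1)?
        };
        blobs.push(blob);
    }
    Some(blobs)
}
```

Applied to the buffer from the Examples section, this sketch yields `hello`, a single space, an empty blob, `world!`, `:::`, `world!`, `hello`, and a final empty blob, matching the `BlobIterator` assertions there.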

[0]: https://en.wikipedia.org/wiki/Variable-length_quantity

## License

Licensed under either of:

* [Apache License, Version 2.0](http://www.apache.org/licenses/LICENSE-2.0)
* [MIT license](http://opensource.org/licenses/MIT)

at your option.

### Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

[//]: # (badges)

[crate-image]: https://img.shields.io/crates/v/blobby.svg
[crate-link]: https://crates.io/crates/blobby
[docs-image]: https://docs.rs/blobby/badge.svg
[docs-link]: https://docs.rs/blobby/
[license-image]: https://img.shields.io/badge/license-Apache2.0/MIT-blue.svg
[rustc-image]: https://img.shields.io/badge/rustc-1.85+-blue.svg
[chat-image]: https://img.shields.io/badge/zulip-join_chat-blue.svg
[chat-link]: https://rustcrypto.zulipchat.com/#narrow/stream/260052-utils
[build-image]: https://github.com/RustCrypto/utils/actions/workflows/blobby.yml/badge.svg?branch=master
[build-link]: https://github.com/RustCrypto/utils/actions/workflows/blobby.yml?query=branch:master

[//]: # (general links)

[RustCrypto]: https://github.com/rustcrypto

blobby/examples/convert.rs

Lines changed: 0 additions & 64 deletions
This file was deleted.

blobby/src/bin/decode.rs

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
//! Decoding utility
use blobby::BlobIterator;
use std::io::{self, BufRead, BufReader, BufWriter, Write};
use std::{env, error::Error, fs::File};

/// Encode bytes as an uppercase hex string.
fn encode_hex(data: &[u8]) -> String {
    let mut res = String::with_capacity(2 * data.len());
    for &byte in data {
        res.push_str(&format!("{byte:02X}"));
    }
    res
}

/// Read blobby data from `reader` and write one hex-encoded blob per line
/// to `writer`. Returns the number of blobs written.
fn decode<R: BufRead, W: Write>(mut reader: R, mut writer: W) -> io::Result<usize> {
    let mut data = Vec::new();
    reader.read_to_end(&mut data)?;
    let res = BlobIterator::new(&data)
        .map_err(|e| {
            io::Error::new(
                io::ErrorKind::InvalidData,
                format!("invalid blobby data: {:?}", e),
            )
        })?
        .collect::<Vec<_>>();
    for blob in res.iter() {
        let blob = blob.map_err(|e| {
            io::Error::new(
                io::ErrorKind::InvalidData,
                format!("invalid blobby data: {:?}", e),
            )
        })?;
        writer.write_all(encode_hex(blob).as_bytes())?;
        writer.write_all(b"\n")?;
    }
    Ok(res.len())
}

fn main() -> Result<(), Box<dyn Error>> {
    let args: Vec<String> = env::args().skip(1).collect();

    // Print usage unless exactly two paths are given; checking only for an
    // empty list would panic on `args[1]` when a single argument is passed.
    if args.len() != 2 {
        println!(
            "Blobby decoding utility.\n\
            Usage: decode <input blb file> <output text file>"
        );
        return Ok(());
    }

    let in_path = args[0].as_str();
    let out_path = args[1].as_str();
    let in_file = BufReader::new(File::open(in_path)?);
    let out_file = BufWriter::new(File::create(out_path)?);

    let n = decode(in_file, out_file)?;
    println!("Processed {n} record(s)");

    Ok(())
}
