Description
Hi! This is related to the zerovec crate. It's a great library; thank you for also publishing it for use outside of this project.
However, for my use case I would like a ZeroMap that holds more than 64 KB (up to around 1 MB). If I understand the code correctly, 64 KB is roughly the maximum a ZeroMap can handle, because the underlying VarZeroVec uses Index16.
The following minimal example shows such a failure:
use zerovec::ZeroMap;

fn main() {
    let values: [u32; 9] = [
        24285424, 18424230, 2428314, 48022448, 480224482, 242911084, 498149810, 189138108,
        381084184,
    ];
    let len = 4000;
    let mut zmap: ZeroMap<'_, str, [u8]> = ZeroMap::with_capacity(len);
    for i in 0..len {
        // Build a 15-byte value for every key.
        let mut value_vec = Vec::with_capacity(10);
        for j in 0..15 {
            let v = values[j % 9];
            let vu = (v % 255) as u8;
            value_vec.push(vu)
        }
        let k: String = format!("{}.{}", i, values[i % 9]);
        zmap.insert(&k, &value_vec);
    }
}
This panics with:
...icu4x/utils/zerovec/src/varzerovec/owned.rs:282:17:
Attempted to grow VarZeroVec to an encoded size that does not fit within the length size used by zerovec::varzerovec::components::Index16
This was quite unexpected the first time it happened, as it is not documented directly on ZeroMap. It is documented in the description of the F parameter for VarZeroVec, but I think it would be useful to add a more explicit warning for those who are not as familiar with these kinds of issues.
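For reference, here is a back-of-the-envelope estimate of why the example above exceeds the limit. It is only a sketch under my own assumptions: that each element costs its payload bytes plus one 2-byte index entry, and that the total has to fit in a u16. `estimated_encoded_size` and `INDEX16_LIMIT` are hypothetical names I made up for illustration, not part of zerovec, and the real encoding may add a few header bytes on top.

```rust
/// Assumed upper bound for anything addressed with a 16-bit index.
/// (My assumption about Index16, not taken from the zerovec source.)
const INDEX16_LIMIT: usize = u16::MAX as usize;

/// Rough lower bound on the encoded size of a variable-length vector:
/// the payload bytes plus one index entry per element.
/// Hypothetical helper; it ignores any header bytes.
fn estimated_encoded_size(element_lens: &[usize], index_width: usize) -> usize {
    let payload: usize = element_lens.iter().sum();
    payload + element_lens.len() * index_width
}

fn main() {
    // The values in the example: 4000 entries of 15 bytes each.
    let value_lens = vec![15usize; 4000];
    let estimate = estimated_encoded_size(&value_lens, 2);
    println!("estimated size: {} bytes, limit: {}", estimate, INDEX16_LIMIT);
    println!("fits in 16-bit indices: {}", estimate <= INDEX16_LIMIT);
}
```

Under these assumptions the values alone come to 4000 × 15 + 4000 × 2 = 68000 bytes, which is already above 65535, so the panic is plausible well before reaching the 1 MB I actually need.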
Now, I saw issue #2312, so as a last-resort fix I tried to implement ZeroMapKV for newtypes of [u8] and str, but I only got as far as:
#[derive(Debug, VarULE)]
#[repr(transparent)]
struct BytesKVIndex32([u8]);

impl<'a> ZeroMapKV<'a> for BytesKVIndex32 {
    type Container = VarZeroVec<'a, BytesKVIndex32, Index32>;
    type Slice = VarZeroSlice<BytesKVIndex32, Index32>;
    type GetType = BytesKVIndex32;
    type OwnedType = Box<BytesKVIndex32>;
}

#[derive(VarULE, Debug)]
#[repr(transparent)]
struct StrKVIndex32(str);

impl<'a> ZeroMapKV<'a> for StrKVIndex32 {
    type Container = VarZeroVec<'a, StrKVIndex32, Index32>;
    type Slice = VarZeroSlice<StrKVIndex32, Index32>;
    type GetType = StrKVIndex32;
    type OwnedType = Box<StrKVIndex32>;
}
When I then try to use them like this:
#[derive(serde::Serialize, serde::Deserialize, Debug)]
struct Data<'a> {
    #[serde(borrow)]
    map: ZeroMap<'a, StrKVIndex32, BytesKVIndex32>,
}
I get a bunch of errors about Deserialize/Serialize not being implemented, but I can't easily derive them on StrKVIndex32/BytesKVIndex32 either, as the compiler says that does not work for unsized types. I will admit that at this point it's somewhat beyond my current skill level, and I'd rather not dig much deeper without knowing for sure that it's even possible.
So my main question is: could you provide some guidance/documentation on how to construct larger ZeroMaps? In the best case of course #2312 would be resolved, but I understand if there is no bandwidth for that.
Some documentation or a warning about this limitation directly in the ZeroMap docs would also be appreciated.
If there is anything I can do to help, let me know.