Description
In Ruby, the Symbol type is a garbage-collected, interned string type: each unique string is allocated only once. It would be useful to offer similar functionality in pco_store for string fields whose values are highly duplicated across the rows being stored.
There are a lot of string interning crates available for Rust with different APIs, so it's nontrivial to decide which we should support.
Requirements:
- must be globally allocated so a state variable doesn't need to be passed around
- must be garbage collected when no longer referenced
- should have a convenient API so they're similar in usage to &str and String
- serde must be supported
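To make the first three requirements concrete, here is a minimal std-only sketch of what such a type could look like. The names `Interned`, `intern`, `is_interned` and `ptr_eq` are hypothetical and not taken from any existing crate. Serde support is left out to keep the sketch dependency-free; it would presumably serialize as a plain string and call `intern` when deserializing. A single global mutex is also an assumption made for brevity, and a real crate would likely shard or use a concurrent map.

```rust
use std::collections::HashSet;
use std::ops::Deref;
use std::sync::{Arc, Mutex, OnceLock};

// Global pool, so no state variable has to be passed around.
static POOL: OnceLock<Mutex<HashSet<Arc<str>>>> = OnceLock::new();

fn pool() -> &'static Mutex<HashSet<Arc<str>>> {
    POOL.get_or_init(|| Mutex::new(HashSet::new()))
}

/// Hypothetical interned string handle; clones share one allocation.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Interned(Arc<str>);

/// Returns a handle to the single shared copy of `s`, allocating it if needed.
pub fn intern(s: &str) -> Interned {
    let mut pool = pool().lock().unwrap();
    if let Some(existing) = pool.get(s) {
        return Interned(Arc::clone(existing));
    }
    let arc: Arc<str> = Arc::from(s);
    pool.insert(Arc::clone(&arc));
    Interned(arc)
}

/// Reports whether `s` currently has an entry in the pool.
pub fn is_interned(s: &str) -> bool {
    pool().lock().unwrap().contains(s)
}

impl Interned {
    /// True when both handles point at the same allocation.
    pub fn ptr_eq(a: &Interned, b: &Interned) -> bool {
        Arc::ptr_eq(&a.0, &b.0)
    }
}

// Deref to str gives the handle roughly the same read API as &str.
impl Deref for Interned {
    type Target = str;
    fn deref(&self) -> &str {
        &self.0
    }
}

// "Garbage collection": the last handle to drop removes the pool entry.
impl Drop for Interned {
    fn drop(&mut self) {
        if let Ok(mut pool) = pool().lock() {
            // A count of 2 means only the pool and this handle remain.
            // New handles are only created under the lock or by cloning
            // another live handle, so the check cannot race.
            if Arc::strong_count(&self.0) == 2 {
                pool.remove(&*self.0);
            }
        }
    }
}
```

With this shape, `intern("host-a")` called twice returns two handles to the same allocation, `&*handle` yields a `&str`, and the pool entry disappears once the last handle is dropped. The cost is a lock on every intern and every drop, which is one of the things the benchmarks would need to measure.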
In terms of serialization, is there a crate that internally implements a format which removes duplicates from e.g. a Vec<String>, to avoid the large initial allocation when deserializing, or is that something we have to implement ourselves?
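If we do end up implementing it ourselves, one possible shape is plain dictionary encoding: store each unique string once plus a list of indices, and rebuild shared handles when reading. The sketch below is std-only and the function names `dictionary_encode` and `dictionary_decode` are hypothetical; it is not how any particular crate does it. It uses `Arc<str>` as a stand-in for whichever interned type gets chosen, and assumes fewer than 2^32 unique values.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Splits `values` into (unique strings in first-seen order, one index per row).
pub fn dictionary_encode(values: &[String]) -> (Vec<String>, Vec<u32>) {
    let mut index_of: HashMap<&str, u32> = HashMap::new();
    let mut dictionary: Vec<String> = Vec::new();
    let mut indices = Vec::with_capacity(values.len());
    for value in values {
        // Index the value would get if it has not been seen before.
        let next = dictionary.len() as u32;
        let index = *index_of.entry(value.as_str()).or_insert_with(|| {
            dictionary.push(value.clone());
            next
        });
        indices.push(index);
    }
    (dictionary, indices)
}

/// Rebuilds the rows, allocating each unique string only once.
/// Panics if an index is out of range for `dictionary`.
pub fn dictionary_decode(dictionary: &[String], indices: &[u32]) -> Vec<Arc<str>> {
    let shared: Vec<Arc<str>> = dictionary
        .iter()
        .map(|s| Arc::from(s.as_str()))
        .collect();
    indices
        .iter()
        .map(|&i| Arc::clone(&shared[i as usize]))
        .collect()
}
```

On decode, the number of string allocations is the number of unique values rather than the number of rows, which is the allocation saving the question above is asking about.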
We should benchmark the different approaches (both CPU and RAM) to compare the performance improvements versus the added complexity.