experiment(value): evaluate flat ObjectMap backends by lukesteensen · Pull Request #1826 · vectordotdev/vrl

lukesteensen · 2026-06-17T21:48:47Z

Summary

This is an experimental PR evaluating alternative ObjectMap storage layouts.

The goal is to improve memory locality and clone/allocation efficiency for VRL
object values. Today ObjectMap is effectively BTree-backed, which gives good
lookup behavior but poor locality and expensive structural clones. This branch
adds enum-backed ObjectMap variants so we can compare the current BTree layout
against flatter vector-backed layouts.

The enum is mostly experimental scaffolding. It lets us compare designs, but it
also adds cost through larger object size and extra branching. If we identify a
winning representation, implementing that representation directly should be
better than keeping enum dispatch in the hot path.
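To make the cost of the scaffolding concrete, here is a minimal sketch of what enum-backed storage could look like. It is not the code on this branch: `ObjectMapSketch`, the `Key`/`Value` aliases, and the choice of a sorted `Vec` with linear scans for the flat variant are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins; the real VRL key and value types are richer.
type Key = String;
type Value = i64;

/// Enum-backed storage: every operation branches on the active variant.
#[derive(Clone, Debug)]
pub enum ObjectMapSketch {
    BTree(BTreeMap<Key, Value>),
    /// Flat layout: entries kept sorted by key in one contiguous Vec.
    Flat(Vec<(Key, Value)>),
}

impl ObjectMapSketch {
    pub fn get(&self, key: &str) -> Option<&Value> {
        match self {
            Self::BTree(map) => map.get(key),
            // Linear scan over contiguous memory.
            Self::Flat(entries) => entries
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
        }
    }

    pub fn insert(&mut self, key: Key, value: Value) -> Option<Value> {
        match self {
            Self::BTree(map) => map.insert(key, value),
            Self::Flat(entries) => {
                // Linear scan for the first stored key that is >= the new key.
                match entries.iter().position(|(k, _)| k.as_str() >= key.as_str()) {
                    // Same key: overwrite in place and return the old value.
                    Some(i) if entries[i].0 == key => {
                        Some(std::mem::replace(&mut entries[i].1, value))
                    }
                    // Larger key found: insert before it to keep the Vec sorted.
                    Some(i) => {
                        entries.insert(i, (key, value));
                        None
                    }
                    // Every stored key is smaller: append.
                    None => {
                        entries.push((key, value));
                        None
                    }
                }
            }
        }
    }
}
```

Each method pays for a `match` before doing any work, and the enum is as large as its largest variant plus a discriminant, which is the overhead a single directly implemented representation would avoid.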

What changed

- Added enum-backed ObjectMap storage variants.
- Added flat/vector-backed ObjectMap experiments.
- Moved many call sites away from assuming ObjectMap is a BTreeMap.
- Added targeted ObjectMap benchmarks:
  - objectmap
  - objectmap_cliff
  - objectmap_hybrid
- Added independent KeyString construction cleanup and benchmarks.

Benchmark takeaways

The isolated benchmarks show the expected tradeoff:

- Flat maps have a clear width cliff for isolated lookup/update operations.
- Hit lookups are only competitive at very small widths.
- Miss lookups cross over at around 128 fields.
- Building wide flat maps from scratch is worse due to repeated linear scans.
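The build-from-scratch cost can be illustrated with a small sketch that counts key comparisons. `CountingFlatMap` and `build_cost` are hypothetical names, and the model assumes each insert scans all existing entries for a duplicate before appending. It counts comparisons only and says nothing about wall-clock time or about how the flat backends on this branch are actually implemented.

```rust
/// Hypothetical flat map that counts key comparisons performed by inserts.
pub struct CountingFlatMap {
    entries: Vec<(String, u64)>,
    pub comparisons: usize,
}

impl CountingFlatMap {
    pub fn new() -> Self {
        Self { entries: Vec::new(), comparisons: 0 }
    }

    /// Overwrites an existing key, or appends after scanning every entry.
    pub fn insert(&mut self, key: &str, value: u64) {
        for (k, v) in self.entries.iter_mut() {
            self.comparisons += 1;
            if k.as_str() == key {
                *v = value;
                return;
            }
        }
        self.entries.push((key.to_string(), value));
    }
}

/// Comparisons needed to build a map of `width` distinct keys from empty.
pub fn build_cost(width: usize) -> usize {
    let mut map = CountingFlatMap::new();
    for i in 0..width {
        map.insert(&format!("field_{i}"), i as u64);
    }
    map.comparisons
}
```

With distinct keys, the i-th insert scans i existing entries, so a build of n fields costs n(n-1)/2 comparisons under this model: 120 at width 16 and 8128 at width 128.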

However, the more realistic cloned-event benchmarks are much more promising:

- Flat storage benefits from cheaper clones and better memory locality.
- objectmap_cliff/realistic_event favors flat maps from roughly width 16 onward.
- objectmap_cliff/realistic_event_readonly strongly favors flat maps because
  clones remain cheap and no mutation forces extra work.

So the experiment suggests the core hypothesis is valid: improving memory
locality and clone efficiency can outweigh worse isolated lookup behavior in
realistic VRL/event workloads.
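One way to see why structural clones differ is to count heap allocations during a clone. The sketch below installs a counting global allocator and compares a `BTreeMap` with a `Vec` of pairs. It uses `u64` keys and values to isolate the container's own allocations, so it deliberately ignores the per-key and per-value clone costs that real events would add. `CountingAlloc`, `allocations_during`, and `clone_allocations` are hypothetical helpers, not part of this branch.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::collections::BTreeMap;
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Wraps the system allocator and counts every allocation request.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result with the number of allocations it made.
fn allocations_during<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let out = f();
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    (out, after - before)
}

/// Returns (tree clone allocations, flat clone allocations) for `width` entries.
pub fn clone_allocations(width: u64) -> (usize, usize) {
    let tree: BTreeMap<u64, u64> = (0..width).map(|i| (i, i)).collect();
    let flat: Vec<(u64, u64)> = (0..width).map(|i| (i, i)).collect();
    // black_box keeps the optimizer from eliding the cloned containers.
    let (_tree_copy, tree_allocs) = allocations_during(|| black_box(tree.clone()));
    let (_flat_copy, flat_allocs) = allocations_during(|| black_box(flat.clone()));
    (tree_allocs, flat_allocs)
}
```

Cloning the `Vec` copies one contiguous buffer, while cloning the `BTreeMap` rebuilds the tree node by node, so the tree needs more allocations at any width that spans several nodes. The exact counts depend on the standard library's node size and are not asserted here.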

KeyString changes

This branch also includes independent KeyString cleanup.

Several call sites were constructing temporary Strings only to immediately
convert them into KeyString. With today’s String-backed KeyString, the
impact is small because LLVM can often optimize the extra work away. The cleanup
becomes more important if we later move KeyString to an SSO/refcounted backing
type, where direct construction from &str/Cow<str> can avoid heap allocation.

These changes are separable from the ObjectMap experiment, but came out of the
same performance investigation.
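The pattern being cleaned up can be sketched as follows. `KeyStringSketch` is a hypothetical String-backed stand-in, not the real `KeyString`, and the two helper functions exist only to contrast the call-site shapes: `field.to_string().into()` versus converting the `Cow<str>` directly.

```rust
use std::borrow::Cow;

/// Hypothetical String-backed stand-in for VRL's KeyString.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KeyStringSketch(String);

impl From<&str> for KeyStringSketch {
    fn from(s: &str) -> Self {
        Self(s.to_owned())
    }
}

impl From<String> for KeyStringSketch {
    fn from(s: String) -> Self {
        Self(s)
    }
}

impl From<Cow<'_, str>> for KeyStringSketch {
    fn from(s: Cow<'_, str>) -> Self {
        // into_owned reuses an Owned buffer and copies only when Borrowed.
        Self(s.into_owned())
    }
}

/// Old shape: always builds a temporary String before converting.
pub fn key_via_temp_string(field: Cow<'_, str>) -> KeyStringSketch {
    field.to_string().into()
}

/// New shape: hands the Cow to the key type and lets it decide how to store it.
pub fn key_direct(field: Cow<'_, str>) -> KeyStringSketch {
    field.into()
}
```

Both functions produce equal keys. The direct form leaves the storage decision to the key type, which is what would let an SSO or refcounted backing skip the heap for short borrowed input. How much is saved today depends on the real `KeyString` conversions, which this sketch does not model.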

API compatibility

This is technically a breaking change.

ObjectMap / Value::Object have exposed BTreeMap-specific behavior and
construction patterns publicly. External code that constructs, destructures, or
uses Value::Object as a BTreeMap will need to move to ObjectMap APIs
instead.

This is part of the experiment: hiding the backing map type is necessary if we
want freedom to change the representation.
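As a rough illustration of what hiding the backing type means for callers, the sketch below wraps a `BTreeMap` in a struct with a private field and exposes only map-like methods. `OpaqueObjectMap` and its method set are hypothetical and are not the ObjectMap API on this branch.

```rust
use std::collections::BTreeMap;

/// Hypothetical opaque wrapper: the field is private, so callers cannot
/// construct, destructure, or otherwise depend on the backing BTreeMap.
#[derive(Clone, Debug, Default)]
pub struct OpaqueObjectMap {
    inner: BTreeMap<String, i64>,
}

impl OpaqueObjectMap {
    pub fn insert(&mut self, key: impl Into<String>, value: i64) -> Option<i64> {
        self.inner.insert(key.into(), value)
    }

    pub fn get(&self, key: &str) -> Option<&i64> {
        self.inner.get(key)
    }

    /// Iterates in key order without naming the backing iterator type.
    pub fn iter(&self) -> impl Iterator<Item = (&str, &i64)> + '_ {
        self.inner.iter().map(|(k, v)| (k.as_str(), v))
    }
}

impl FromIterator<(String, i64)> for OpaqueObjectMap {
    fn from_iter<I: IntoIterator<Item = (String, i64)>>(iter: I) -> Self {
        Self { inner: iter.into_iter().collect() }
    }
}
```

External code written against methods like these keeps compiling if `inner` later becomes a flat vector, whereas code that builds or matches on a `BTreeMap` directly would not.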

Validation

Local checks run:

cargo fmt --check
cargo test --features 'default test'
cargo bench --features 'default test' --bench objectmap_cliff --no-run
cargo bench --features 'default test' --bench objectmap_hybrid --no-run
cargo bench --features 'default test' --bench objectmap_cliff -- --noplot

Open questions

- Have we sufficiently captured workloads that are adversarial to this design,
  such that we would have confidence turning it on by default?
- Do we need to maintain the enum-backed implementation for optionality / opt-in
  despite its cost, or should we switch to a single winning representation once
  the design is chosen?

Many call sites were creating a String allocation only to immediately convert it to KeyString. This is wasteful because KeyString::from(&str) and KeyString::from(Cow<str>) can construct directly without the intermediate heap allocation.

The most impactful fix is in crud/insert.rs, where field.to_string().into() was called on every VRL path insertion. There, field is a Cow<str> (usually Borrowed), so this was allocating on every event for no reason.

Also fixes ~20 other sites across stdlib parsers (parse_syslog, parse_grok, parse_key_value, flatten, unflatten, tally, etc.) and the value! macro, where literal strings went through String::from() first.

These changes are independently valuable with String-backed KeyString (saves one allocation per conversion) and become even more important with SSO string types, where the &str path can inline short strings with zero allocation.

Four benchmark groups for comparing KeyString backing types:
- keystring_micro: construction, clone, roundtrip costs
- path_ops: owned vs JIT path traversal
- vrl_programs: end-to-end VRL execution (remap, parse_syslog, flatten, object construction)
- json_deser: serde_json::Value intermediate vs direct deserialization

Rename from_long_str/clone_long to from_medium_str/clone_medium (22B, within CompactString inline but spills EcoString). Add true from_long_str/clone_long at 31B that spills both SSO types for a fair heap-allocated comparison.
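The reasoning behind the 22B and 31B key sizes can be sketched with a small classifier. The inline capacities below (24 bytes for CompactString and 15 bytes for EcoString on 64-bit targets) are assumptions chosen to match the behavior described above and should be checked against each crate's documentation. `Storage`, `storage_for`, and `classify` are hypothetical helpers.

```rust
/// Assumed inline capacities in bytes on 64-bit targets; verify before relying on them.
pub const COMPACT_STRING_INLINE: usize = 24;
pub const ECO_STRING_INLINE: usize = 15;

#[derive(Debug, PartialEq, Eq)]
pub enum Storage {
    Inline,
    Heap,
}

/// A string stays inline only if its byte length fits the inline capacity.
pub fn storage_for(len: usize, inline_capacity: usize) -> Storage {
    if len <= inline_capacity {
        Storage::Inline
    } else {
        Storage::Heap
    }
}

/// Returns the expected (CompactString, EcoString) storage for `key`.
pub fn classify(key: &str) -> (Storage, Storage) {
    (
        storage_for(key.len(), COMPACT_STRING_INLINE),
        storage_for(key.len(), ECO_STRING_INLINE),
    )
}
```

Under these assumptions a 22-byte key is inline for one type and heap-allocated for the other, so it does not give a like-for-like comparison. A 31-byte key is heap-allocated for both, which is why it serves as the fair "long" case.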

lukesteensen and others added 8 commits June 17, 2026 15:44

WIP: ObjectMap multi-backend with Flat/BumpFlat variants

07f360f

Add spilled-key benchmarks for fair SSO comparison

0b7d4bd

Add ObjectMap cliff and hybrid benchmarks

7a11781

Fix ObjectMap test cleanup after rebase

e95bae0

Preserve ObjectMap flat backend key ordering

d243198

fix(value): clean up ObjectMap experiment CI

90933ef

github-actions Bot added the docs review on hold PR is pending a docs team review label Jun 25, 2026

lukesteensen wants to merge 8 commits into vectordotdev:main from lukesteensen:experiment/objectmap-backends
