Skip to content

Upgrade to heap data structures if needed#1132

Draft
stevefan1999-personal wants to merge 3 commits into
smoltcp-rs:mainfrom
stevefan1999-personal:patch-heap-upgrade
Draft

Upgrade to heap data structures if needed#1132
stevefan1999-personal wants to merge 3 commits into
smoltcp-rs:mainfrom
stevefan1999-personal:patch-heap-upgrade

Conversation

@stevefan1999-personal

@stevefan1999-personal stevefan1999-personal commented Mar 26, 2026

Copy link
Copy Markdown

This PR adds a small but slightly noticable change in how the data structures works.

Before this PR, some of the data structures are using compile time configuration settings for predetermined storage sizes for certain data objects, for example, IFACE_MAX_ADDR_COUNT or IFACE_MAX_ROUTE_COUNT.

Now this PR is an attempt to lift this restriction, however this is still in a WIP status, and may need some debate whether this is useful.

Why?

I'm using this for a private VPN library I wrote and somehow I come forward to notice I cannot add more than certain number of routes, then I noticed the compile time constraints limits the options of my network interface HARD.

Since I'm not using this on an embedded environment, having heap allocation is fine.

How to use it?

Enable "smallvec" and/or "smallmap" feature in your smoltcp crate. By default this is disabled, and the use of type alias will make it fallback to heapless::LinearMap and heapless::Vec. Therefore by default, you still have the original behavior, that the maximum size cap during compile time still stay as relevant.

However

If you enable "smallvec" or "smallmap", then the semantic of the Vec and LinearMap changes.

Notably, for "smallvec" feature, the structure now changes from a flat, fixed hard capacity, into still having inlined stack content initially, but if you have more than N items, it will spill over to heap. Previously this will just return the overflown item, but now that means every push will succeed. Space and time complexity is still O(n), but the data storage will transfer from stack to heap. This could potentially be a problem for real time service.

And for "smallmap", the changes may get a little more drastic, for heapless linear map, this is the original description.

A fixed capacity map/dictionary that performs lookups via linear search.

Note that as this map doesn't use hashing so most operations are O(n) instead of O(1).

The design of "smallmap" still contains the heapless linear map at first (that means still stored in stack for the first N items), but now all the content of the original linear map will spill over to a new alloc::collections::BTreeMap. This changes the semantic nature of the "linear map" a little bit more different. Specifically, (self balanced) BST maps maintains O(log n) time complexity for both amortized and worst case, rather than having O(n) for the linear map.

I've chosen to use BST because I noticed the data nature of the what is using the linear map actually has numeric partial orders, despite we should technically use hash tables.

Therefore after some consideration, I've decided to make this opt-in and tries to keep the original behavior as much as possible, only to extend it when it really needs to upgrade to heap, as this is the "best" of both world. You still get to keep your original heapless data structure behavior if you don't want heap.

And if you do need it, you need to explicitly enable it yourself, and be heads up about the semantic change after heap upgrade because you don't know down the line whether this will have some different implications for the characteristic of the TCP/IP stack.

Both feature implies alloc. This means a global heap allocator is required.

ABI change: almost untouched. Only if you have Vec or LinearMap in the public functions, then we do have a little bit of ABI breakage. But the code is intentionally implemented to mimck existing use case of heapless::LinearMap as much as possible, despite some specific optimizations can be employed (for example, return Result<(), ()> rather than Result<(), T> to save the storage, since push is always going to succeed, I don't think we can use Infalliable for this? Indeed!).

But in exchange, all the IFACE_MAX_ADDR_COUNT or IFACE_MAX_ROUTE_COUNT now becomes the stack cache count before spilling over to heap. This theoretically means all the restrictions about the max counts are lifted, but in some places of the code there are still manual checks for the max count and needs some more treatment. For example, some of the struct is using patterns like [T; MAX_COUNT], so these will need refactors too.

Fail cases:

❯ cargo nextest run --all  --no-fail-fast --features smallvec,smallmap
────────────
     Summary [   0.637s] 654 tests run: 652 passed, 2 failed, 0 skipped
        FAIL [   0.008s] smoltcp iface::neighbor::test::test_evict
        FAIL [   0.005s] smoltcp wire::dhcpv4::test::test_parse_ack_dns_servers

*The name of the LinearMap type is still intentionally kept to keep the patchset minimal, unless it is agreed for a full mod rename refactor.

@codecov

codecov Bot commented Mar 26, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.44%. Comparing base (608a3b9) to head (3e2f191).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1132   +/-   ##
=======================================
  Coverage   81.44%   81.44%           
=======================================
  Files          81       81           
  Lines       24988    24988           
=======================================
  Hits        20352    20352           
  Misses       4636     4636           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@stevefan1999-personal

Copy link
Copy Markdown
Author

Well, regarding fallible for alloc data structures, it's sad to say that I cannot find a way to do fallible insert for BTreeMap. There is simply no way to tell whether we can insert in case we need it, so that means now the insert can panic.

By the way, this is pretty cool: https://docs.rs/scapegoat/latest/scapegoat/ and this basically implements a BST on stack.

@whitequark

whitequark commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

Are the new features additive?

@stevefan1999-personal

Copy link
Copy Markdown
Author

Are the new features additive?

It does not really add new features on its own, but indeed it changed some behaviors that needs to be activated on demand. So I would say it is augmentative rather than additive

@whitequark

Copy link
Copy Markdown
Contributor

I mean the feature flags. If enabling a feature flag can break a dependent that worked previously then this is a flawed design that we can't use.

@stevefan1999-personal

Copy link
Copy Markdown
Author

I mean the feature flags. If enabling a feature flag can break a dependent that worked previously then this is a flawed design that we can't use.

Well, I've already made this conservative enough that both old and new behaviors coexist.

So I would suggest either 1. relax the max limit, and change the wordings to hard cap to suggestive max, then we need to find places where the new design would break that old semantic. This is a more of like solving the problem in a foundational level or 2. accept that this feature is somewhat experimental and it may cause something that used to work broken, despite it is on a more fundamental data structure level, it is not that we changed any implementation details that violates the IETF RFCs, but it is still an experimental switch that lets you go beyond the original max settings, so you are on your own.

So far 99.69% of the tests are passing, and what I've seen is only on the test_evict and test_parse_ack_dns_servers, and it is the ARP caching mechanism that actually looks more like as some kind of FIFO cache, but it uses the nature of the maximum cap of linear map as a convenient way to do front popping eviction. So that might rather need a proper redesign, for example a LRU cache.

And it is not that it is a flawed design. It is about giving you a choice that you are willing to take that something might break, and if you don't enable it you still got the old design anyway.

@stevefan1999-personal

Copy link
Copy Markdown
Author

Hmm, maybe you mean I didn't get any real coverage and unit tests because all them are designed in a way that you have a max cap awareness...if you mean thag, that's a little more troublesome

@whitequark

whitequark commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

And it is not that it is a flawed design. It is about giving you a choice that you are willing to take that something might break, and if you don't enable it you still got the old design anyway.

The problem arises if there are two different crates that both depend on smoltcp. If one of them enables a feature, the feature will be enabled for the other crate (that never asked for it) as well. So enabling a feature has to be a fully backwards-compatible action or you risk ecosystem-breaking implications.

This is what is referred to by the term additive features in Rust.

@stevefan1999-personal

stevefan1999-personal commented Mar 26, 2026

Copy link
Copy Markdown
Author

And it is not that it is a flawed design. It is about giving you a choice that you are willing to take that something might break, and if you don't enable it you still got the old design anyway.

The problem arises if there are two different crates that both depend on smoltcp. If one of them enables a feature, the feature will be enabled for the other crate (that never asked for it) as well. So enabling a feature has to be a fully backwards-compatible action or you risk ecosystem-breaking implications.

This is what is referred to by the term additive features in Rust.

Good call. Then this is more of a inversion of control problem. What about adding a README section to tell user to isolate the crate with this feature as another entry, and make sure it is the user's responsibility to not contaminate peer dependencies by feature creeping.

Of course, a better way to handle this is always add another layer of abstraction that gets pretty complicated.

Either way, I need smoltcp to synchronize through a Kubernetes cluster and the static caps are a problem because I got a panic from insufficient stack, and indeed I need that much routes for the VPN to reconcile...I do get why smoltcp is designed that way because it is more of a replacment for lwip in embedded rather than for some network applications, and that having it stack only makes it friendly to CPU cache which is vital to embedded as well, but so far I switched to heap spill, and ran the local loopback benchmark, it shows no performance regression so far.

@Dirbaio

Dirbaio commented Mar 26, 2026

Copy link
Copy Markdown
Member

I think we should take a step back and design smoltcp's data structures properly. The current situation isn't exactly clean, and this PR would complicate it more.

  • Older parts of smoltcp use managed. It allows choosing between std heap data structures and borrowed slice data structures at runtime. managed pre-dates const generic support in Rust.
  • Some newer parts use heapless. Stores data inline in the struct, max size configurable at compile time.
  • Then this PR makes some configurable between smallvec/smallmap heap data structures at runtime.

If we merge this we'll be using all of std, managed, heapless, smallvec, smallmap. It's too much, it's lacking a coherent vision.

We should decide what are our requirements

  • what do we want?
    • Std heap storage
    • Smallvec/smallmap optimization is needed? maybe not
    • borrowed storage
    • inline storage
  • When to choose?
    • Compile-time
    • Runtime

then implement something that fulfills those requirements, and then update smoltcp to use that consistently.

@whitequark

Copy link
Copy Markdown
Contributor

I think we should take a step back and design smoltcp's data structures properly.

I agree.

My personal view is that we should standardize on managed, since I'm not aware of any reason not to. But this could be just ignorance of some good reasons to do otherwise.

@stevefan1999-personal

Copy link
Copy Markdown
Author

I think we should take a step back and design smoltcp's data structures properly. The current situation isn't exactly clean, and this PR would complicate it more.

  • Older parts of smoltcp use managed. It allows choosing between std heap data structures and borrowed slice data structures at runtime. managed pre-dates const generic support in Rust.
  • Some newer parts use heapless. Stores data inline in the struct, max size configurable at compile time.
  • Then this PR makes some configurable between smallvec/smallmap heap data structures at runtime.

If we merge this we'll be using all of std, managed, heapless, smallvec, smallmap. It's too much, it's lacking a coherent vision.

We should decide what are our requirements

  • what do we want?
    • Std heap storage
    • Smallvec/smallmap optimization is needed? maybe not
    • borrowed storage
    • inline storage
  • When to choose?
    • Compile-time
    • Runtime

then implement something that fulfills those requirements, and then update smoltcp to use that consistently.

Ooh I remembered that managed crate back when I was working in m-labs, was a huge headache for me to work on it back then due to some UBs haha

@whitequark

Copy link
Copy Markdown
Contributor

Ooh I remembered that managed crate back when I was working in m-labs, was a huge headache for me to work on it back then due to some UBs haha

I would be very surprised if managed had any UB in it considering it doesn't have unsafe. That just sounds like the usual M-Labs incompetence.

@Dirbaio

Dirbaio commented Mar 26, 2026

Copy link
Copy Markdown
Member

My personal view is that we should standardize on managed, since I'm not aware of any reason not to. But this could be just ignorance of some good reasons to do otherwise.

I introduced heapless to the codebase becuase using const generics to store things inline is more ergonomic than managed+slices:

  • With slices, the user has to create them all and pass them to Interface::new manually.
  • With slices, the types of the entries have to be public (even if opaque) because the user has to create the slice, with inline storage they can be private.
  • It prevents the user from making a single struct that contains all the smoltcp stuff, since the Interface field has to borrow from the other fields.

Also we have more buffers in Interface than before (for fragmentation for example) so the "make user create all slices" would be even more unergonomic now than it was before.

heapless is not fulfilling all the requirements though. For exmaple it's not possible to use in socket buffers because it's nice to be able to give different buffer sizes to different sockets, and the way SocketSet works it's not possible to pass different const generics to different sockets (and even if we could i'm not sure we should because monomorphization bloat).

@whitequark

Copy link
Copy Markdown
Contributor

Right, and we can't make managed delegate to heapless (without breaking the API) because it doesn't have a generic size parameter.

Should we extend managed to have three options: borrowed, owned, inline? Am I missing a reason why that wouldn't work well?

@stevefan1999-personal

stevefan1999-personal commented Mar 26, 2026

Copy link
Copy Markdown
Author

Ooh I remembered that managed crate back when I was working in m-labs, was a huge headache for me to work on it back then due to some UBs haha

I would be very surprised if managed had any UB in it considering it doesn't have unsafe. That just sounds like the usual M-Labs incompetence.

Oh then probably it is not managed's fault XD but I remember there is a horrible hack in the firmware that used some Rust black magic that is more than transmute to work around some limitations of the managed library, to make smoltcp stack work on a piece of memory from the hardware pretending it is on stack (I checked the kernel code and sadly I couldn't find that anymore). I'm not sure if that is memory mapped or from the FPGA, as I actually don't yet have a clue back how Rust is complicated then.

It is pretty long time now, I don't really remember all the fuzzy details given I have lost all of it due to that SSD disaster...

Speaking about the draft, I think this should deserve a look back into managed...and how it could co-existing with the existing heapless usage that somehow turned into a...tech debt. But anyway, this "fork" somehow works pretty well with my existing horrible use case for a CNI that is technically a VPN for syncing all the pod CIDRs...all in software smoltcp stack. I'm preseving it here as more of a tech demo than a PR now, until we come to an agreement on how to fix the memory management responsibility issue first.

@stevefan1999-personal stevefan1999-personal marked this pull request as draft March 26, 2026 13:00
@Dirbaio

Dirbaio commented Mar 26, 2026

Copy link
Copy Markdown
Member

Should we extend managed to have three options: borrowed, owned, inline? Am I missing a reason why that wouldn't work well?

maybe! however

  • not sure if the enum approach works anymore. as long as the "inline" variant is there you'd have to pass the size const generic even if you don't use it which is weird
  • as long as we offer the possibility of passing in borrowed slices we have to make all constructors take managed::Slice etc from the user which hurts ergonomics

@whitequark

whitequark commented Mar 26, 2026

Copy link
Copy Markdown
Contributor
  • not sure if the enum approach works anymore. as long as the "inline" variant is there you'd have to pass the size const generic even if you don't use it which is weird

Can the size parameter default to 0, or will this break type inference somehow?

  • as long as we offer the possibility of passing in borrowed slices we have to make all constructors take managed::Slice etc from the user which hurts ergonomics

I think this could be resolved by having the end user plug .into() everywhere, which isn't ideal but doesn't seem too bad or out of place for embedded Rust? I'm not sure though.


The other option would be to ditch the fixed-size stuff entirely and instead take a radical approach. What managed is intended to achieve is to let you use either the heap, or the stack as a sort of an ad-hoc, highly manual arena allocator.

What if we change managed to offer two options: allocated in a heap or allocated in an arena, fixed size and (typically) placed on the stack. This would entirely eliminate the problem where you have to do extremely careful ordering of slice allocations in the no-alloc mode to make all the borrows work. The arena would not offer any way to deallocate individual objects; you'd then have a single region, and you'd drop it together with the stack (if ever). There would be no dynamic allocation, just an initialization phase like right now.

@stevefan1999-personal

Copy link
Copy Markdown
Author
  • not sure if the enum approach works anymore. as long as the "inline" variant is there you'd have to pass the size const generic even if you don't use it which is weird

Can the size parameter default to 0, or will this break type inference somehow?

  • as long as we offer the possibility of passing in borrowed slices we have to make all constructors take managed::Slice etc from the user which hurts ergonomics

I think this could be resolved by having the end user plug .into() everywhere, which isn't ideal but doesn't seem too bad or out of place for embedded Rust? I'm not sure though.

The other option would be to ditch the fixed-size stuff entirely and instead take a radical approach. What managed is intended to achieve is to let you use either the heap, or the stack as a sort of an ad-hoc, highly manual arena allocator.

What if we change managed to offer two options: allocated in a heap or allocated in an arena, fixed size and (typically) placed on the stack. This would entirely eliminate the problem where you have to do extremely careful ordering of slice allocations in the no-alloc mode to make all the borrows work. The arena would not offer any way to deallocate individual objects; you'd then have a single region, and you'd drop it together with the stack (if ever). There would be no dynamic allocation, just an initialization phase like right now.

I would like to suggest using HKT, which my instinct suggested to look at https://docs.rs/archery/latest/archery/, but then I realize this is emulated using GAT and the user ergonomics could be even worse.

@Dirbaio

Dirbaio commented Mar 26, 2026

Copy link
Copy Markdown
Member

Can the size parameter default to 0, or will this break type inference somehow?

you can't default generic params :(

The arena would not offer any way to deallocate individual objects; you'd then have a single region, and you'd drop it together with the stack (if ever).

This is annoying if you want to dynamically create/destroy TCP sockets with their buffers. If the socket goes to the arena and allocates the buffer on its own, you'd fill up the arena after a few cycles of creating+destroying tcp sockets. If the user has to allocate manually from the arena and pass it to the socket we don't gain much ergonomics.

@whitequark

whitequark commented Mar 26, 2026

Copy link
Copy Markdown
Contributor

I would like to suggest using HKT, which my instinct suggested to look at https://docs.rs/archery/latest/archery/, but then I realize this is emulated using GAT and the user ergonomics could be even worse.

Not using complicated type magic like this, even at the cost of some verbosity, was one of the founding principles of smoltcp and I don't think we should compromise on this.

This is annoying if you want to dynamically create/destroy TCP sockets with their buffers.

If you want that you should enable alloc. The point of having managed was to make sure that certain firmware with basic TCP/IP needs (but a high requirement for reliability) can still use smoltcp. It was never intended to cover to something so subtle as dynamic socket allocation; so I don't see it as a problem that it can't handle that.

@Dirbaio

Dirbaio commented Mar 26, 2026

Copy link
Copy Markdown
Member

By "dynamic" I mean something like this https://github.com/embassy-rs/embassy/blob/75e1d9a48fd562cf47ee729898b1d8b19ec5d9d5/tests/perf-client/src/lib.rs#L123-L125

when that function starts it creates the buffers+socket. When it returns the buffers and socket are destroyed. It's still "statically allocated" because it's a local variable which gets transformed into a field in the future by Rust's async transform but it's still "dynamic" as in any corner of the program can create and use a socket, I don't have to centrally create all buffers upfront and distribute them to every corner of my prorgram.

@whitequark

Copy link
Copy Markdown
Contributor

How does that currently work? Don't the socket buffers have to outlive the stack to be added into it?

@Dirbaio

Dirbaio commented Mar 26, 2026

Copy link
Copy Markdown
Member

The socket borrows the buffers, so you're forced to drop the socket before deallocating the buffers. Socket drop removes it from the SocketSet.

(Yes, it's not forget-safe, like some other things in Embassy. It's basically the same problem as safe DMA futures)

@whitequark

Copy link
Copy Markdown
Contributor

Okay, right, so it's unsound (something you're willing to have in Embassy but I'm not okay having in smoltcp) and the new heap data structures could maybe make it sound if they're correctly designed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants