
Add TransportKind option to channels and channel builder #820

Draft

eloff wants to merge 1 commit into main from
dan/fg-13699-add-way-to-configure-reliable-or-lossy-transport-on-a-per

Conversation

@eloff (Contributor) commented Jan 15, 2026

Changelog

None

Docs

None

Description

Adds a TransportKind option to channels that lets users choose between lossy and reliable transport on a per-channel basis. The default is Lossy, but the setting will only affect the new CloudSink. This is not a full QoS solution by any means; it's meant to let users override the Lossy default for channels that are critically important. Most users shouldn't need to touch this.

I've only added Rust support; if we agree with the direction, I'll add Python and C++ before marking it ready for review.
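A rough sketch of how this might look from the builder side (the method name and usage below are illustrative assumptions, not the final API this PR defines):

// Hypothetical usage sketch; the builder method name is an assumption.
// Most channels keep the Lossy default:
let telemetry = ChannelBuilder::new("/telemetry").build();

// A critically important channel opts into reliable delivery:
let estop = ChannelBuilder::new("/safety/estop")
    .transport_kind(TransportKind::Reliable)
    .build();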


@ericmlujan (Contributor) left a comment


Had a question about wrapping this enum in an outer TransportOptions struct to give us some flexibility in the future; wanted to get your thoughts on that.

/// not currently [`WebSocketServer`](crate::WebSocketServer) or [`McapWriter`](crate::McapWriter).
/// Messages are still delivered to Foxglove in order.
///
/// Note: this only applies to messages under 15kb; larger messages will use Reliable transport.
Contributor

Are we planning on dispatching based on message size? Or is this more saying "you must use Reliable transport if message size exceeds 15K"?

@eloff (Contributor Author) Jan 16, 2026

We can do either. I think dynamic dispatch on message size is the best user experience (at least when not limited by network bandwidth), so I lean that way.
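A minimal sketch of what dispatching on message size could look like inside a sink, assuming a 15 KB cutoff and hypothetical names:

// Hypothetical helper; the threshold constant and function name are assumptions.
const RELIABLE_FALLBACK_THRESHOLD: usize = 15 * 1024;

fn effective_transport(requested: TransportKind, payload_len: usize) -> TransportKind {
    match requested {
        // Large payloads fall back to the reliable path regardless of the
        // channel's preference, since the lossy transport can't carry them.
        TransportKind::Lossy if payload_len > RELIABLE_FALLBACK_THRESHOLD => TransportKind::Reliable,
        other => other,
    }
}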

sinks: LogSinkSet,
closed: AtomicBool,
#[allow(dead_code)]
transport_kind: TransportKind,
Contributor

Instead of exposing this at the top level of a Channel, I wonder if we should nest it in a TransportOptions struct, so the Channel API doesn't need to change as we introduce new QoS and transport configuration settings. Thoughts?

Contributor Author

Yes, that's a good idea

/// Options that control how data for the channel is sent over the network with applicable sinks.
#[derive(Debug, Clone, Copy, Default)]
#[repr(u8)]
pub enum TransportKind {
Contributor

Naming: TransportReliability?

Contributor Author

or TransportDelivery? Naming things...

Contributor

I lean towards TransportReliability. TransportKind and TransportDelivery are a bit too generic for my taste; those could encapsulate things like History and Durability, which are orthogonal to this axis of delivery.
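Putting the two suggestions together, the shape under discussion might look roughly like this (names are illustrative, not what the PR settles on):

/// Sketch only: combines the nesting and naming suggestions above.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub enum TransportReliability {
    /// Best-effort delivery; messages may be dropped under load.
    #[default]
    Lossy,
    /// Guaranteed, in-order delivery.
    Reliable,
}

/// Outer options struct, so future QoS knobs (history, durability, ...) can be
/// added without changing the Channel or ChannelBuilder signatures again.
#[derive(Debug, Clone, Copy, Default)]
pub struct TransportOptions {
    pub reliability: TransportReliability,
}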

sinks: LogSinkSet,
closed: AtomicBool,
#[allow(dead_code)]
transport_kind: TransportKind,
Contributor

I don't think this belongs on the channel at all. Transport options are particular to a (channel, sink) pair. I'd prefer that we rework this along the lines of sink channel filters, but of course only for those sinks that can offer differentiated QoS.

Contributor Author

I agree with that; it's specific to the (channel, sink) pair. But it's imperative that it be efficient to access. If it requires a hash table lookup on each log call, I think it's too much. Given that, I think the Channel gives a logical and efficient place to store the setting, and sinks are free to ignore it unless they need it.

Can you think of another way we could implement this efficiently?

@gasmith (Contributor) Jan 20, 2026

I'm not really worried about a hashmap lookup on the fast path. If this becomes a measurable bottleneck, and we want to optimize, we could potentially store it during channel subscription, as part of the LogSinkSet.

Contributor Author

Keep in mind it's not just a hashmap lookup, but taking a read() lock on a RwLock, doing the lookup, and releasing the lock: roughly 400 cycles, depending on caching and architecture.

At 10k log messages/sec that's around 4 million cycles/sec just for checking a flag, which could instead have been a single pointer indirection. If we don't make this part of the public interface, having it on the channel as an implementation detail seems OK to me. Less code to maintain, too. We can always refactor it later if we want to.

Contributor

OK, so then let's see what it would take to configure this via the sink, and store it in the LogSinkSet. At this point I care more about interfacing than performance.
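A sketch of that sink-side direction, with all names hypothetical: the sink owns the per-channel setting, which is resolved once at subscription time so the hot logging path avoids a locked map lookup.

// Hypothetical sketch of sink-side configuration; names are assumptions.
pub struct CloudSinkBuilder {
    reliable_topics: Vec<String>,
}

impl CloudSinkBuilder {
    /// Request reliable delivery for a specific topic on this sink.
    pub fn reliable_topic(mut self, topic: impl Into<String>) -> Self {
        self.reliable_topics.push(topic.into());
        self
    }
}

// At channel-subscription time the sink resolves the setting once, and the
// result is cached alongside the subscription entry in the LogSinkSet, so the
// per-message log path reads a plain field instead of taking a RwLock.
struct SinkSubscription {
    reliability: TransportReliability,
    // ...other per-subscription state...
}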

