Add support for multi-process port sharing with CIBIR. #5798

Open
ProjectsByJackHe wants to merge 7 commits into main from jackhe/sql-cibir-fix-sock-reservation

Conversation

@ProjectsByJackHe
Contributor

Description

Fixes #5795

The XDP datapath can be configured to intercept packets based on QUIC Connection ID instead of local port.
This behavior already existed in MsQuic but was not heavily exercised until recently.
One issue was that MsQuic always attempted to reserve UDP/TCP sockets for each application server process,
so multiple server processes that wanted to share a single port ran into port collision errors.
This PR adds support for CIBIR across multiple processes on the same port and documents the behavior.
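
For readers unfamiliar with the collision being described, here is a minimal POSIX sketch of it. This is not MsQuic code: `bind_udp_loopback`, `local_port`, and `second_reservation_error` are hypothetical helpers, and the two reservations are made from one process purely to keep the example self-contained.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper (not an MsQuic API): binds a UDP socket to
// 127.0.0.1:port without any address-reuse options.
// Returns the descriptor on success, or -1 with errno set on failure.
// Passing port 0 lets the OS pick a free port.
static int bind_udp_loopback(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr*)&addr, sizeof(addr)) != 0) {
        int saved = errno;   // close() may overwrite errno
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

// Returns the local port of a bound socket, or 0 on failure.
static uint16_t local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    if (getsockname(fd, (struct sockaddr*)&addr, &len) != 0) {
        return 0;
    }
    return ntohs(addr.sin_port);
}

// Mimics two servers reserving the same UDP port.
// Returns the errno of the second bind, 0 if it unexpectedly succeeded,
// or -1 if the first reservation could not be set up.
static int second_reservation_error(void)
{
    int first = bind_udp_loopback(0);
    if (first < 0) {
        return -1;
    }
    int second = bind_udp_loopback(local_port(first));
    int err = (second < 0) ? errno : 0;
    if (second >= 0) {
        close(second);
    }
    close(first);
    return err;
}
```

On Linux, `second_reservation_error()` should return `EADDRINUSE`, which is the kind of failure a second server process hits when each process insists on its own OS port reservation.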

Testing

A new DataPathTest was added.

Documentation

Settings.md

@ProjectsByJackHe ProjectsByJackHe requested a review from a team as a code owner February 18, 2026 02:02
@codecov

codecov bot commented Feb 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.79%. Comparing base (ed14762) to head (a0b20b3).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5798      +/-   ##
==========================================
- Coverage   85.99%   84.79%   -1.21%     
==========================================
  Files          60       60              
  Lines       18729    18729              
==========================================
- Hits        16106    15881     -225     
- Misses       2623     2848     +225     


if (Socket->CibirIdLength) {
//
// Setting SO_REUSEADDR does NOT robustly allow
Contributor

Can you explain in this comment how/why the option is not robust on Windows and how it is robust on Linux?

Does it make sense for this socket option to be set on both (all?) platforms if it has different semantics?

Contributor Author

Sure thing for 1)

For 2), no it does not make sense, but we never ran into problems because it was never exercised.

// Clean up all partially-initialized per-proc sockets since
// we're skipping OS socket creation (XDP-only via CIBIR).
//
for (uint16_t i = 0; i < SocketCount; i++) {
Contributor

This seems fairly duplicative of part of CxPlatSocketContextUninitialize - can it be refactored into a common subroutine?

Contributor Author

Noted

uint8_t HasFixedRemoteAddress : 1;
uint8_t RawSocketAvailable : 1;

uint8_t SkipCreatingOsSockets : 1;
Contributor

nit: newline spacing of these bitfields is inconsistent.

Contributor Author

Noted, will fix all nits once core logic addressed.

docs/Settings.md Outdated
| `QUIC_PARAM_LISTENER_LOCAL_ADDRESS`<br> 0 | QUIC_ADDR | Get-only | Get the full address tuple the server is listening on. |
| `QUIC_PARAM_LISTENER_STATS`<br> 1 | QUIC_LISTENER_STATISTICS | Get-only | Get statistics specific to this Listener instance. |
| `QUIC_PARAM_LISTENER_CIBIR_ID`<br> 2 | uint8_t[] | Both | The CIBIR well-known idenfitier. |
| `QUIC_PARAM_LISTENER_CIBIR_ID`<br> 2 | uint8_t[] | Both | Sets a CIBIR (CID-Based Identification and Routing) well-known identifier. CIBIR does 2 things when set: 1. XDP will now steer packets to the correct process/listener by matching the CIBIR prefix within the packet QUIC Connection ID. 2. In the case of a port collision when reserving OS UDP/TCP sockets, MsQuic will continue with initializing the datapath. If XDP is not available/enabled, then no traffic will flow for the listener that experiences a collision. |
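
The prefix matching described in point 1 of the new row can be pictured with a small sketch. This is illustrative only and is not the XDP or MsQuic implementation: `cid_matches_cibir` is a hypothetical helper, and the `offset` parameter is an assumption about where in the Connection ID the identifier is expected to start.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not an MsQuic or XDP API): returns true when the
// bytes of `cid` starting at `offset` equal the CIBIR identifier `id`.
// An empty identifier, or one that does not fit inside the CID, never matches.
static bool cid_matches_cibir(
    const uint8_t* cid, size_t cid_len,
    size_t offset,
    const uint8_t* id, size_t id_len)
{
    if (id_len == 0 || offset > cid_len || id_len > cid_len - offset) {
        return false;
    }
    return memcmp(cid + offset, id, id_len) == 0;
}
```

With a check like this, two processes sharing one port can be told apart as long as each registers a distinct identifier: a packet whose Connection ID begins with `AB CD` matches only the listener that registered `AB CD`.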
Contributor

If XDP is not available/enabled, then no traffic will flow for the listener that experiences a collision.

Is the collision bubbled up to the app? If not, what is the benefit of allowing a listener to be created that will silently not receive the intended traffic? If it is bubbled up as an error, are we saying we'll create the listener, but it just won't receive traffic, or we'll return an error and no listener?

Contributor Author

Really good points. This PR was opened early so we can unblock SQL. We can flesh out the details more.

I chose the design of logging a warning when XDP is not available/not enabled for ease of debugging / unit testing. But we can find a better solution.

I am leaning towards simply failing the binding creation when a listener tries to use CIBIR but XDP is not available/enabled.
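
To make the stricter option concrete, here is a rough sketch of the decision it implies. It is a reading of this thread rather than the code in the PR: `binding_decision` and `decide_binding` are hypothetical names, and the three inputs are assumed to be known at binding creation time.

```c
#include <stdbool.h>

// Hypothetical outcome of binding creation (names are illustrative only).
typedef enum {
    BINDING_USE_OS_SOCKETS,  // OS port reserved as usual
    BINDING_XDP_ONLY,        // collision tolerated; XDP steers traffic by CIBIR ID
    BINDING_FAIL             // listener could not receive its traffic, so fail
} binding_decision;

// Sketch of the policy under discussion:
// - CIBIR without XDP fails up front, so the app sees the error;
// - a port collision is tolerated only when CIBIR is set (and XDP is usable);
// - everything else keeps the existing OS socket reservation behavior.
static binding_decision decide_binding(
    bool cibir_set, bool xdp_available, bool port_collision)
{
    if (cibir_set && !xdp_available) {
        return BINDING_FAIL;
    }
    if (!port_collision) {
        return BINDING_USE_OS_SOCKETS;
    }
    return cibir_set ? BINDING_XDP_ONLY : BINDING_FAIL;
}
```

Under this policy there is no state in which a listener is created but silently receives nothing, which is the case the review comment above is worried about.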

Collaborator

@guhetier guhetier left a comment

In general, I am concerned that we keep adding incremental exceptions to the port reservation logic to solve the next issue, but without having a clear design goal.
This code will be hard to maintain and confusing for apps as there is no simple rule about what can be done with ports.

I think we need to take the time soon to come up with a clear story about when ports can be shared and when they can't, document it, and check that we implement it.

@ProjectsByJackHe
Contributor Author

In general, I am concerned that we keep adding incremental exceptions to the port reservation logic to solve the next issue, but without having a clear design goal. This code will be hard to maintain and confusing for apps as there is no simple rule about what can be done with ports.

I think we need to take the time soon to come up with a clear story about when ports can be shared and when they can't, document it, and check that we implement it.

I agree. I added a CIBIR.md. We could perhaps put the port reservation logic design in there (not yet done).

@ProjectsByJackHe
Contributor Author

In general, I am concerned that we keep adding incremental exceptions to the port reservation logic to solve the next issue, but without having a clear design goal. This code will be hard to maintain and confusing for apps as there is no simple rule about what can be done with ports.

I think we need to take the time soon to come up with a clear story about when ports can be shared and when they can't, document it, and check that we implement it.

Please see the updated XDP.md and CIBIR.md.


Development

Successfully merging this pull request may close these issues.

[Partner-SQL] Multiple Quic listeners needed on same port on Windows

3 participants