Various fixes for working rtwn #2385

jrtc27 · 2025-04-01T22:11:25Z

rtwn: ensure TX work isn't scheduled during reset / abort
rtwn: change the USB TX transfers to only do one pending transfer per endpoint
rtwn: enable FCS in the recive config to work around truncated frames
rtwn: don't do 64 bit TSF extension by default
rtwn: make sure RCR_APPFCS stays set in monitor mode / mode changes.
net80211: crypto: ccmp: fix more hardware offload bits
net80211: validate control frame TA/RA before further processing
rtwn: enable reception of BAR frames
rtwn: Disable IP/TCP/UDP checksum offloading
rtwn: Add missing subobject bounds annotations
net80211: Don't bound structs that extend radiotap headers
net80211: Fix subobject bounds issue in IEEE80211_IOC_CHANINFO

Don't schedule work during reset / abort. For USB NICs, work must not be scheduled during a call to rtwn_usb_abort_xfers(), as then it'll cause the call to usbd_transfer_drain() to hang. This fixes a hang I've been seeing where the NIC hits a TX timeout and then the reset/re-init path is called. If data is scheduled to be transmitted in that window, the call to usbd_transfer_drain() would hang and require a hard reboot to recover. Differential Revision: https://reviews.freebsd.org/D47479 (cherry picked from commit 8838f3c32ac0ebcb8b20863f8c455375039a505e)

… endpoint I found I was getting constant device timeouts when doing anything more complicated than a single SSH on laptop with RTL8811AU. After digging into it, i found a variety of fun situations, including traffic stalls that would recover w/ a shorter (1 second) USB transfer timeout. However, the big one is a straight up hang of any TX endpoint until the NIC was reset. The RX side kept going just fine; only the TX endpoints would hang. Reproducing it was easy - just start up a couple of traffic streams on different WME AC's - eg a best effort + bulk transfer, like browsing the web and doing an ssh clone - throw in a ping -i 0.1 to your gateway, and it would very quickly hit device timeouts every couple of seconds. I put everything into a single TX EP and the hangs went away. Well, mostly. So after some MORE digging, I found that this driver isn't checking if the transfers are going into the correct EPs for the packet WME access category / 802.11 TID; and would frequently be able to schedule multiple transfers into the same endpoint. Then there's a second problem - there's an array of endpoints used for setting up the USB device, with .endpoint = UE_ADDR_ANY, however they're also being setup with the same endpoint configured in multiple transfer configs. Eg, a NIC with 3 or 4 bulk TX endpoints will configure the BK and BE endpoints with the same physical endpoint ID. This also leads to timed out transfers. My /guess/ was that the firmware isn't happy with one or both of the above, and so I solved both. * drop the USB transfer timeout to 1 second, not 5 seconds - that way we'll either get a 1 second traffic pause and USB transfer failure, or a 5 second device timeout. Having both the TX timeout and the USB transfer timeout made recovery from a USB transfer timeout (without a NIC reset) almost impossible. * enforce one transfer per endpoint; * separate pending/active buffer tracking per endpoint; * each endpoint now has its own TX callback to make sure the queue / end point ID is known; * and only frames from a given endpoint pending queue is going into the active queue and into that endpoint. * Finally, create a local wme2qid array and populate it with the endpoint mapping that ensures unique physical endpoint use. Locally tested: * rtl8812AU, 11n STA mode * rtl8192EU, 11n STA mode (with diffs to fix the channel config / power timeouts.) Differential Revision: https://reviews.freebsd.org/D47522 (cherry picked from commit d99eb8230eb717ab0b2eba948614d0f2f2b5dd2b)

I noticed that on RTL8812AU/RTL8821AU receiving VHT frames that I'd occasionally see frames missing the last 4 bytes. I can easily reproduce it with a ping sweep and fast (10ms) between frames. There's also a report of an earlier NIC (RTL8188EU) doing the same thing with HT frames but not with OFDM (11g) frames. After a bunch of poking, it turns out a driver where things DID work properly for the other report kept FCS enabled, and trimmed it from the frame before pushing it up to the network layer. I did the same and it also worked fine. The other solution was to disable PHYSTATUS notifications, but then we'd get no per packet RX notifications (RX rate, RSSI, etc.) Locally tested: * RTL8192EU, STA mode (HT) * RTL8812AU, STA mode (HT, VHT) * RTL8821AU, STA mode (HT, VHT) Differential Revision: https://reviews.freebsd.org/D47775 (cherry picked from commit d76247e801dec95600a068fa1bb09f4f57e00031)

The TSF64 extension involves at least 3 reads from TSF registers (R92C_TSFTR(0), R92C_TSFTR(1), R92C_TSFTR(2)) which are 4 byte control transfers. They take up valuable USB link time. It's likely much less expensive for PCIe adapters. At some point it may be worthwhile enabling it by default just for those. With this disabled, the only USB traffic that I see during normal data operation are bulk TX/RX data transfers for 802.11 packets, and on NICs w/ net80211 rate control, the control register space read/writes for TX completion. (And that will also need addressing.) This is the difference between 15mbit TCP RX and 30mbit TCP RX on the 11n NICs, and around 40 to 50mbit TCP RX on the 11ac NICs in HT40 and VHT80. Locally tested: * RTL8188EU, STA mode * RTL8192CU, STA mode * RTL8192EU, STA mode * RTL8811AU, STA mode * RTL8821AU, STA mode Differential Revision: https://reviews.freebsd.org/D47861 (cherry picked from commit fcb5e8d0c19ac21515ab3047d39a76b32d835cec)

My previous commit meant that APPFCS wasn't enabled during monitor mode and likely other corner cases. Ensure it stays on at all times. This, amusingly, fixes monitor mode in RTL8812AU/RTL8821AU - without it, I don't see HT/VHT frames in monitor mode but I can still receive them in normal STA mode. Differential Revision: https://reviews.freebsd.org/D48112 (cherry picked from commit 791170aaf7efb4e053ccbf537d80a43e8a81d1e4)

Add the missing IEEE80211_RX_F_DECRYPTED and IEEE80211_RX_F_MMIC_STRIP (really just MIC_STRIP) checks to make hwaccel offload work. This makes rtw8x drivers pass RX packets again at least with LinuxKPI if HW_CRYPTO support is enabled. Sponsored by: The FreeBSD Foundation MFC after: 3 days Reviewed by: adrian Differential Revision: https://reviews.freebsd.org/D49030 (cherry picked from commit 3afc0bfecb1a927c37672dc245688c575e4d9ec4)

Summary: An earlier commit relaxed the TA/RA rules around control frames to fix other issues, however it now results in control frames not specifically from a known node / to us to be handled in the control path. Specifically, rtwn(4) RTL8812/RTL8821 NICs are currently passing BARs from the AP TA to any destination to us; which is tripping up BAW tracking and causing traffic hangs. Reviewers: #wireless, jrtc27, bz Subscribers: bz, imp, melifaro, glebius Tags: #wireless Differential Revision: https://reviews.freebsd.org/D49575

Summary: The RX filter wasn't enabling BAR frames, so we weren't receiving them during normal operation. Jessica noticed we WERE getting BAR frames, but only when promisc mode is active. Which is a different set of bugs, but it did highlight the differences here. PR: kern/285822 Reviewers: #wireless, jrtc27 Subscribers: imp Tags: #wireless Differential Revision: https://reviews.freebsd.org/D49596

I have consistently seen false-positive checksum failures for UDP/IP packets with a missing, i.e. zero, checksum (specifically, for packets from the UniOfCam-IoT network's DHCP server), but valid checksums reported for UDP/IP packets with a non-zero checksum. Therefore I believe that the hardware does not correctly handle the optional nature of UDP/IP's checksums, so ignore the result and let the network stack validate the checksum itself. Neither Linux nor OpenBSD seems to use this feature that I can see. Whilst there is the old Linux vendor driver that defines macros for these specific fields, it too does not check them.

65fdef2 ("rtwn: Add subobject bounds annotations for softc subclasses.") caught some of these but not all. This allows rtwn to attach to a TP-Link Archer T2U Nano (RTL8821AU).

These are intended to be embedded, not used standalone, with a pointer to the inner ieee80211_radiotap_header member passed to net80211 functions that is then used to walk off the end of the header into the larger containing struct. Whilst ieee80211_radiotap_vendor_header doesn't appear to be problematic today, it's still intended to be embedded as a header, and it seems likely that a similar access pattern (rather than always being derived from a ieee80211_radiotap_header *) could appear, so keep it consistent with the non-vendor one.

We fault on the kernel access into the array after ic_nchans here which manifests as EFAULT returned back to userspace.

Adrian Chadd and others added 12 commits April 1, 2025 22:15

rtwn: Add missing subobject bounds annotations

e1aa478

65fdef2 ("rtwn: Add subobject bounds annotations for softc subclasses.") caught some of these but not all. This allows rtwn to attach to a TP-Link Archer T2U Nano (RTL8821AU).

net80211: Fix subobject bounds issue in IEEE80211_IOC_CHANINFO

d9e31ac

We fault on the kernel access into the array after ic_nchans here which manifests as EFAULT returned back to userspace.

brooksdavis merged commit 2339ee8 into dev Apr 2, 2025
29 checks passed

brooksdavis deleted the rtwn branch April 2, 2025 16:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Various fixes for working rtwn #2385

Various fixes for working rtwn #2385

Uh oh!

jrtc27 commented Apr 1, 2025

Uh oh!

Uh oh!

Uh oh!

Various fixes for working rtwn #2385

Various fixes for working rtwn #2385

Uh oh!

Conversation

jrtc27 commented Apr 1, 2025

Uh oh!

Uh oh!

Uh oh!