tcpstats(4): add TCP connection statistics character device#2079
randomizedcoder wants to merge 1 commit into freebsd:main
Thank you for taking the time to contribute to FreeBSD! Some files have special handling:
Important: @concussious wants to review changes to share/man/
Important: @ngie-eign wants to review changes to tests
Add a loadable kernel module that creates a read-only character device /dev/tcpstats for streaming per-connection TCP statistics to userspace. Each read iterates every TCP connection in a single kernel pass and emits fixed-size 320-byte records via uiomove(9).
Features:
- Single-pass INP_ALL_ITERATOR with INPLOOKUP_RLOCKPCB for consistent snapshots
- Filter system supporting ports, TCP states, IPv4/IPv6 CIDR, field selection
- Named filter profiles via sysctl with per-profile device nodes
- Five-layer security model (permissions, credential checks, resource limits)
- DoS protections: FD/reader limits, read timeout, signal interruptibility
- VNET-aware for jail compatibility
- Optional DTrace SDT probes and sysctl statistics counters
Tested on FreeBSD 14.3, 14.4, and 15.0 (amd64 GENERIC) with zero compiler warnings under -Werror. Includes 37 ATF C tests for the filter parser and 3 ATF shell tests for kmod lifecycle verification.
Signed-off-by: Dave Seddon <dave.seddon.ca@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Have you considered importing this module as a port?
Is there not some indirect functional overlap here with BBLog and generic-ebpf (a potential future merge candidate)? I realise that has been called out in the initial comments for the submission, but the Project has other concerns regarding the authorship and provenance of this submission. I intend to extend BBLog to do Delayed ACK profiling. It isn't clear if there is benefit here in either direction. I eyeballed bsd-xtcp very quickly and don't see profiling for the D-ACK vs Nagle implosion going on. xnu itself does already do this profiling. So this submission might be better termed a "socket scraper", really.
Thanks for the replies, guys. Oh, sorry, I didn't realize ports also has kernel modules, but I see it does. I'm not really sure about the "rules" for deciding whether a module belongs in freebsd-src or in freebsd-ports. I'm happy to close this and move it to ports if that makes more sense, but I wouldn't really call this a "port", because this is only for FreeBSD. As you probably know, Linux uses netlink for the tcp-diag stuff, so this isn't a "port" from Linux.
You're right, sometimes it's a gray area when deciding between ports and src. I suggest you create a port instead. Please refer to the Porter's Handbook. Add me to the CC list of your PR on Bugzilla.
Kernel module providing system-wide TCP socket statistics via /dev/tcpstats. Streams fixed-size 320-byte records with per-connection TCP metrics including addresses, ports, TCP state, congestion control parameters, RTT measurements, retransmit counts, ECN statistics, and process attribution. Follows feedback from freebsd-src PR freebsd/freebsd-src#2079 to move to ports. Tested on FreeBSD 15.0, 14.4, and 14.3. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Closing this PR in favor of freebsd/freebsd-ports#497

PR: Add tcpstats(4) — TCP Connection Statistics Character Device
Summary
This PR adds tcpstats, a kernel module that creates a read-only character device /dev/tcpstats for streaming per-connection TCP statistics to userspace. When read, the device iterates every TCP connection in a single kernel pass and emits fixed-size 320-byte records via uiomove().
Why not existing tools?
- netstat -an: userspace tool, no per-connection metrics (cwnd, RTT, retransmits, ECN), high syscall overhead
- siftr(4): logs to file on packet events, not on-demand snapshots; no filtering; no structured binary output
- tcp_blackbox: ring-buffer trace logging for debugging, not real-time monitoring
- net.inet.tcp.pcblist sysctl: returns xinpcb structs but requires complex userspace parsing and multiple syscalls
tcpstats provides a purpose-built, zero-copy, filterable, binary-stable interface for TCP connection monitoring.
Design Overview
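A userspace consumer might look roughly like the sketch below. It is not taken from the module's sources: the function name `count_records` is hypothetical, the record contents are treated as opaque bytes, and it assumes the device signals the end of a snapshot with a zero-length read, which the PR text does not state explicitly.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define RECORD_SIZE 320	/* fixed record size stated in this PR */

/*
 * Hypothetical helper: read whole records from fd until EOF.
 * Returns the number of records seen, or -1 on a read error or if
 * the stream ends in the middle of a record.
 */
long
count_records(int fd)
{
	unsigned char buf[RECORD_SIZE * 64];
	size_t have = 0;	/* buffered bytes not yet forming a whole record */
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf + have, sizeof(buf) - have);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data: retry */
			return (-1);
		}
		if (n == 0)
			break;		/* assumed end of snapshot */
		have += (size_t)n;

		/* Count complete records and keep any partial tail. */
		size_t whole = have / RECORD_SIZE;
		size_t rest = have % RECORD_SIZE;
		total += (long)whole;
		memmove(buf, buf + whole * RECORD_SIZE, rest);
		have = rest;
	}
	return (have == 0 ? total : -1);
}
```

A caller would pass a descriptor obtained with `open("/dev/tcpstats", O_RDONLY)`. Reading in multiples of 320 bytes, as `dd bs=320` does in the smoke test, avoids partial records in the first place; the carry-over logic here only guards against short reads.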
Video
YouTube video: https://youtu.be/e7uPr9q4Lmg
Original repo
This kernel module was originally developed and extensively tested in this repo:
https://github.com/randomizedcoder/bsd-xtcp
Architecture
- /dev/tcpstats (compact default fields), /dev/tcpstats-full (all fields)
- INP_ALL_ITERATOR with INPLOOKUP_RLOCKPCB for a consistent snapshot
- struct tcp_stats_record with 52-byte spare for future expansion
- /dev/tcpstats/<name> device node
- CURVNET_SET / CURVNET_RESTORE for jail compatibility
Security Model (5 layers)
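As a rough illustration of how the open-time and read-time limits in this section map to error codes, here is a userspace sketch. All names (`demo_gate`, `demo_open`, `demo_read_begin`) are hypothetical, the order of the checks is an assumption, and the plain integer counters stand in for whatever atomics or locking the kernel module actually uses.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the module's limit state. */
struct demo_gate {
	int max_open_fds;		/* cf. dev.tcpstats.max_open_fds */
	int max_concurrent_readers;	/* cf. dev.tcpstats.max_concurrent_readers */
	int open_fds;
	int active_readers;
};

/* Returns 0 and counts the open, or an errno value on refusal. */
int
demo_open(struct demo_gate *g, bool want_write)
{
	if (want_write)
		return (EPERM);		/* device is read-only */
	if (g->open_fds >= g->max_open_fds)
		return (EMFILE);	/* too many open descriptors */
	g->open_fds++;
	return (0);
}

void
demo_close(struct demo_gate *g)
{
	if (g->open_fds > 0)
		g->open_fds--;
}

/* Returns 0 and counts the reader, or EBUSY when the limit is reached. */
int
demo_read_begin(struct demo_gate *g)
{
	if (g->active_readers >= g->max_concurrent_readers)
		return (EBUSY);
	g->active_readers++;
	return (0);
}

void
demo_read_end(struct demo_gate *g)
{
	if (g->active_readers > 0)
		g->active_readers--;
}
```

This version is not thread-safe; it only shows which condition yields which error.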
- EPERM on write open
- cr_canseeinpcb() per connection
- dev.tcpstats.max_open_fds (default 16) → EMFILE
- dev.tcpstats.max_concurrent_readers (default 32) → EBUSY
DoS Protections
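The periodic checks in this section follow a common pattern: do the expensive tests only once every N items so the hot loop stays cheap. The sketch below is a userspace analogue under stated assumptions, not the module's code: `bounded_walk`, `now_ms`, and the `stop` flag are hypothetical, the flag stands in for a pending-signal check, and `timespec_get` (C11) replaces a kernel time source. Real code would prefer a monotonic clock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define CHECK_INTERVAL 1024	/* the PR checks every 1024 sockets */

enum walk_result { WALK_DONE, WALK_INTERRUPTED, WALK_TIMED_OUT };

/* Wall-clock milliseconds; a stand-in for a kernel time source. */
static long long
now_ms(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return ((long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000);
}

/*
 * Visit nitems items, checking the stop flag and the deadline only at
 * every CHECK_INTERVAL-th item. *visited receives the number of items
 * processed before returning.
 */
enum walk_result
bounded_walk(size_t nitems, long long max_ms, const volatile bool *stop,
    size_t *visited)
{
	long long deadline = now_ms() + max_ms;
	size_t i;

	for (i = 0; i < nitems; i++) {
		if (i != 0 && i % CHECK_INTERVAL == 0) {
			if (*stop) {
				*visited = i;
				return (WALK_INTERRUPTED);
			}
			if (now_ms() > deadline) {
				*visited = i;
				return (WALK_TIMED_OUT);
			}
		}
		/* Per-item work (filtering, record emission) would go here. */
	}
	*visited = i;
	return (WALK_DONE);
}
```

With this shape, a walk shorter than 1024 items never pays for a clock read, and a longer one is cut off at the next multiple of 1024 after the flag is set or the deadline passes.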
- dev.tcpstats.max_read_duration_ms (default 5000 ms)
- dev.tcpstats.min_read_interval_ms
- SIGPENDING(curthread) every 1024 sockets
- kern_yield(PRI_USER) every 1024 sockets
ABI Stability
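A fixed-size record with a compile-time size check, a version field, and reserved spare bytes can be sketched as follows. The layout is purely illustrative: `struct demo_record` and its field names are hypothetical and do not reflect the real `struct tcp_stats_record`; only the 320-byte total, the 52-byte spare, and version 1 come from this PR.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RECORD_SIZE 320	/* total size stated in the PR */
#define DEMO_SPARE_SIZE   52	/* spare bytes stated in the PR */
#define DEMO_VERSION       1	/* highest version this reader understands */

/* Hypothetical layout: version, length, payload, reserved spare. */
struct demo_record {
	uint32_t version;
	uint32_t length;
	uint8_t  payload[DEMO_RECORD_SIZE - 8 - DEMO_SPARE_SIZE];
	uint8_t  spare[DEMO_SPARE_SIZE];	/* zeroed, reserved for growth */
};

/* Fail the build if a field change alters the on-wire size or layout. */
_Static_assert(sizeof(struct demo_record) == DEMO_RECORD_SIZE,
    "record size is part of the ABI");
_Static_assert(offsetof(struct demo_record, spare) ==
    DEMO_RECORD_SIZE - DEMO_SPARE_SIZE, "spare must end the record");

/* A reader accepts only versions it knows how to decode. */
bool
demo_record_supported(const struct demo_record *r)
{
	return (r->version >= 1 && r->version <= DEMO_VERSION);
}
```

New fields can later be carved out of `spare` without changing the record size, and the version check lets an older reader refuse records it cannot interpret.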
- _Static_assert'd to 320 bytes at compile time
- tsr_version allows future evolution
- TSF_VERSION for forward compatibility
Filter System
The filter parser is a dual-compile module (kernel and userspace) supporting:
- ipv4_only / ipv6_only flags with conflict detection
All directives are ANDed. Empty/zero fields mean "match any".
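
The matching rule described here (every set directive must match, unset directives are ignored) can be sketched in a few lines. The structures and names below are hypothetical simplifications, not the module's real filter layout; only the AND semantics, the "zero means any" convention, and the ipv4_only/ipv6_only conflict come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified filter: zero-valued fields match anything. */
struct demo_filter {
	uint16_t local_port;	/* 0 = any */
	uint16_t remote_port;	/* 0 = any */
	uint32_t state_mask;	/* one bit per TCP state; 0 = any */
	bool     ipv4_only;
	bool     ipv6_only;
};

/* Hypothetical view of one connection. */
struct demo_conn {
	uint16_t local_port;
	uint16_t remote_port;
	unsigned state;		/* state index, must be below 32 */
	bool     is_ipv6;
};

/* ipv4_only and ipv6_only together could never match: reject up front. */
bool
demo_filter_valid(const struct demo_filter *f)
{
	return (!(f->ipv4_only && f->ipv6_only));
}

/* Logical AND over all directives that are set. */
bool
demo_filter_match(const struct demo_filter *f, const struct demo_conn *c)
{
	if (f->local_port != 0 && f->local_port != c->local_port)
		return (false);
	if (f->remote_port != 0 && f->remote_port != c->remote_port)
		return (false);
	if (f->state_mask != 0 && (f->state_mask & (1u << c->state)) == 0)
		return (false);
	if (f->ipv4_only && c->is_ipv6)
		return (false);
	if (f->ipv6_only && !c->is_ipv6)
		return (false);
	return (true);
}
```

An all-zero filter therefore matches every connection, and each additional directive can only narrow the result.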
Static Analysis Results
The following analysis tools were run on the original out-of-tree module with zero warnings:
Memory Safety Verification
ATF Tests
Filter Parser Tests (tests/sys/netinet/tcpstats_filter_test.c)
37 ATF C test cases covering the userspace/kernel dual-compile filter parser.
Compiled in userspace, linking against tcp_statsdev_filter.c and libprivateatf-c.
Kernel Module Lifecycle Tests (tests/sys/netinet/tcpstats_test.sh)
3 ATF shell test cases (require root):
- kmod_load_unload: kldload, kldstat -q -m, /dev/tcpstats char device exists, kldunload
- dev_readable: dd if=/dev/tcpstats succeeds after load
- sysctl_exists: max_open_fds, max_concurrent_readers, max_read_duration_ms, reads_total, active_fds all present
24-Hour Soak Test Results
Configuration
Memory
- M_TCPSTATS malloc type: Use=0, Memory=0 at every sample
Stability
Counters
- uiomove() errors
- visited == emitted + sum(skipped) held at every sample
Performance Benchmarks
Filter Parser
Read Path
Concurrent Readers
DTrace Probe Verification
7 SDT probes registered and firing (when compiled with -DTCPSTATS_DTRACE):
- tcpstats:::read-entry
- tcpstats:::read-done
- tcpstats:::filter-skip
- tcpstats:::filter-match
- tcpstats:::fill-done
- tcpstats:::profile-create
- tcpstats:::profile-destroy
VM Build & Test Results
Built and tested on two FreeBSD VMs with partial source trees (/usr/src/sys only). Module sources were rsynced and build system files patched in-place.
Build
- -Werror, gnu17: tcpstats.ko (48184 bytes)
- -Werror, gnu99: tcpstats.ko
- -Werror, gnu99: tcpstats.ko (48528 bytes)
Smoke Test (load / read / unload)
All three platforms:
- kldload succeeds, kldstat -v shows module loaded
- /dev/tcpstats created as cr--r----- root:network
- dev.tcpstats.*
- dd if=/dev/tcpstats bs=320 count=100 reads valid binary records with correct header
- kldunload succeeds cleanly
- dmesg shows tcpstats: loaded (TCP_STATS_VERSION=1, TSF_VERSION=2) / tcpstats: unloaded
- dmesg
ATF Filter Parser Tests (37 test cases)
ATF Shell Tests (3 test cases)
Summary
- tcpstats.ko, -Werror
- /dev/tcpstats created
Files Changed
New Files (9)
- sys/netinet/tcp_statsdev.c
- sys/netinet/tcp_statsdev.h: struct tcp_stats_record, ioctls
- sys/netinet/tcp_statsdev_filter.c
- sys/netinet/tcp_statsdev_filter.h
- sys/modules/tcpstats/Makefile
- share/man/man4/tcpstats.4
- tests/sys/netinet/tcpstats_filter_test.c
- tests/sys/netinet/tcpstats_test.sh
- PR_SUBMISSION.md
Modified Files (5)
- sys/modules/Makefile: adds tcpstats to SUBDIR
- sys/conf/files: adds tcp_statsdev.c and tcp_statsdev_filter.c
- sys/conf/options: adds the TCPSTATS option
- share/man/man4/Makefile: adds tcpstats.4 to the MAN list
- tests/sys/netinet/Makefile