Pacmanfs #2516
base: sift_ace2
When using the slirp backend with the e1000 frontend, I otherwise get hangs in readv(), caused by the e1000 emulation not checking whether bytes are available before trying to read them. In particular, that device model expects the recv callback to return 0 if no bytes are available, and with slirp it would end up blocking forever.

The virtio device model uses the peek_recvlen callback to check first, so I didn't notice the problem when implementing the slirp backend. Make the slirp backend more flexible to accommodate e1000.

MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D48164
(cherry picked from commit d3bdfa5)
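The contract described above (return 0 rather than block when nothing is queued) can be sketched roughly as follows. This is not the actual bhyve change; `recv_nonblock` is a hypothetical helper, and it assumes the backend reads packets from a socket that supports `MSG_DONTWAIT`.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>

/*
 * Hypothetical helper, not taken from bhyve: receive one packet from fd
 * without blocking.  Returns the packet length, 0 if nothing is queued,
 * or -1 on a real error.
 */
ssize_t
recv_nonblock(int fd, void *buf, size_t len)
{
	ssize_t n;

	n = recv(fd, buf, len, MSG_DONTWAIT);
	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return (0);	/* no data yet: let the device model retry later */
	return (n);
}
```

A caller that treats 0 as "try again later" never stalls in the receive path, which is the behaviour the e1000 model is described as expecting.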
libslirp can invoke a callback when received data is removed from a socket buffer, generally because the guest ACKed some data. Previously the callback did nothing, but it needs to wake up the poll thread to get reasonable throughput.

Suppose one is using scp to copy data into a guest filesystem via the slirp backend. Data is received on libslirp's socket, which we poll for data in slirp_pollfd_td_loop(). That data gets buffered in priv->pipe, and eventually is placed in the device model's RX rings by the backend's mevent handler. When implementing TCP, libslirp holds on to a copy of data until it's ACKed by the guest via slirp_send(), at which point it drops that data and invokes the notify callback.

The initial implementation of this backend didn't take into account the fact that slirp_pollfds_fill() will not add libslirp's socket to the pollfd set if more than a threshold amount of data is already buffered. Then poll() needs to time out before the backend sends more data to the guest. With a default timeout of 500ms, this kills throughput.

Use a pipe to implement a simple in-band signal to the poll thread so that it reacts quickly when more buffer space becomes available.

MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D48192
(cherry picked from commit 20a51e6)
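The pipe-based wakeup described above is a common pattern and might look something like the sketch below. The function names (`wakeup_init`, `wakeup_post`, `wakeup_wait`) are hypothetical and do not come from the bhyve sources; the real poll loop also watches libslirp's own descriptors, which are omitted here.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical wakeup channel: fds[0] is polled, fds[1] is written to. */
int
wakeup_init(int fds[2])
{
	if (pipe(fds) != 0)
		return (-1);
	/* Make both ends non-blocking so neither side can stall. */
	if (fcntl(fds[0], F_SETFL, O_NONBLOCK) != 0 ||
	    fcntl(fds[1], F_SETFL, O_NONBLOCK) != 0)
		return (-1);
	return (0);
}

/* Called from the notify callback: one byte is enough to wake poll(). */
void
wakeup_post(int wfd)
{
	char c = 0;
	ssize_t n;

	n = write(wfd, &c, 1);
	(void)n;	/* a full pipe already implies a pending wakeup */
}

/* Returns 1 if woken, 0 on timeout, -1 on error. */
int
wakeup_wait(int rfd, int timeout_ms)
{
	struct pollfd pfd = { .fd = rfd, .events = POLLIN };
	char drain[64];
	int n;

	n = poll(&pfd, 1, timeout_ms);
	if (n <= 0)
		return (n);
	/* Drain all pending bytes so the next poll() blocks again. */
	while (read(rfd, drain, sizeof(drain)) > 0)
		;
	return (1);
}
```

With this in place the poll thread no longer has to wait out the 500ms timeout: the read end sits in the pollfd set, and the notify callback calls `wakeup_post()` as soon as buffer space is freed.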
The previous implementation implemented hostfwd rules which would allow the host to connect to the guest via a NATed TCP connection. libslirp also permits NAT in the other direction, but this was prevented by bhyve's capsicum sandbox.

To make the slirp backend more useful, split the backend out into a separate process which does not enter capability mode if outbound connections are permitted (enabled by setting the new "open" keyword). The process communicates with the bhyve network frontend (typically a virtio network interface) using a unix SOCK_SEQPACKET socket pair. If the bhyve process exits, the helper will automatically exit.

Aside from this restructuring, there is not much actual change. Many slirp parameters are still hard-coded for now, though this may change. The "restricted" feature is toggled by the new "open" keyword; in particular, the backend is restricted by default for compatibility with 15.0 and 14.3.

Each packet now has to traverse an extra socket, but this overhead should be acceptable given that the slirp backend cannot be said to provide high-performance networking. With iperf3 I can get 4Gbps from the guest to the host on a Zen 4 system.

MFC after: 1 month
Sponsored by: CHERI Research Centre (EPSRC grant UKRI3001)
Differential Revision: https://reviews.freebsd.org/D53454
(cherry picked from commit 0e62ebd)
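As a rough illustration of why a SOCK_SEQPACKET socket pair suits this design, the sketch below shows that packet boundaries are preserved and that a hangup by the peer is visible as a zero-length read. `packet_channel` and `packet_recv` are hypothetical names, and the fork/exec of the helper process is left out.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical: create the frontend <-> helper channel. */
int
packet_channel(int sv[2])
{
	return (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv));
}

/*
 * Hypothetical: read exactly one packet.  Returns its length, 0 once the
 * peer has closed its end, or -1 on error.
 */
ssize_t
packet_recv(int fd, void *buf, size_t len)
{
	return (recv(fd, buf, len, 0));
}
```

Each send() on one end arrives as exactly one recv() on the other, so no extra framing of Ethernet packets is needed, and a return value of 0 tells the helper that the other end has gone away.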
FreeBSD main has a number of improvements to SEQPACKET sockets which are not yet backported. The main reason to use a SEQPACKET socket is so that the slirp helper is reliably notified when the other end hangs up. Rather than backporting the upstream improvements, switch to a DGRAM socket and use an extra control pipe to find out when the parent bhyve process has exited. No data is written to the pipe; we just use it to poll for POLLHUP.

This commit can be reverted once CheriBSD has been synced up to commit 69f61ce or later.
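The control-pipe trick can be sketched as below. `parent_exited` is a hypothetical name and the snippet is only an approximation of the idea: the helper keeps the read end, the parent keeps the write end, and poll() reports POLLHUP on the read end once every copy of the write end has been closed, which includes the parent exiting.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical check, not taken from bhyve: returns 1 if the process
 * holding the write end of the control pipe has gone away, 0 if it is
 * still alive when the timeout expires, or -1 on error.
 */
int
parent_exited(int ctl_rfd, int timeout_ms)
{
	/* No events requested: POLLHUP is reported regardless. */
	struct pollfd pfd = { .fd = ctl_rfd, .events = 0 };
	int n;

	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return (-1);
	return ((pfd.revents & POLLHUP) != 0);
}
```

In a real helper the control descriptor would simply be one more entry in the existing pollfd set rather than a separate call, and the helper must make sure it does not hold a copy of the write end itself.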
This is unmodified pacmanfs at 98bd8d180343f20cfa80a50eb2c59628cca760f0 from [email protected]:CTSRD-CHERI/SIFT-PACMAN-linux.git
Much of this will be reverted later.
Not just when loaded as module.
This is something our <linux/fs.h> does by default, but both pacmanfs _and_ lkpi need their respective per-vnode data fields.
This includes an ugly hack to vfs_bio.c, to be reverted later.
This undoes part of add42c2.
Forgot to mention: this includes some unrelated commits to bhyve and em(4) from upstream, because I forked my branch slightly later. They should disappear at the next merge.
Can you locally do
I tried, and it gave me merge conflicts:
Oh, you had already rebased onto sift_ace2. Maybe try
This adds pacmanfs plus necessary linuxkpi bits.