Questions about zil+zio layer #17232
Replies: 3 comments 2 replies
-
Hello, this is a friendly reminder :)
-
Hi Dimitra,

IIRC the ZIL code itself does not serialize its ZIOs; it issues all of them as soon as it can. But it relies heavily on the general ZIO pipeline rules. For example, in the case of an indirect ZIL write, the LWB ZIO must not start its processing until all the data blocks it references are compressed, encrypted, checksummed and allocated, otherwise it won't have the block pointer to include in the LWB. This wait is implemented by

When we are talking about two consecutive LWBs, formally they should not depend on each other for data, so it would be great if they were checksummed in parallel, not only issued to the disks. But I suspect they follow the same rules as any other children and parents, which means the following (parent) LWB ZIO will wait at

But I am not happy about this observation, and rather than cementing it by introducing the mentioned chaining, I'd honestly prefer the dependency to be disabled somehow so that the processing can be parallelized. Otherwise ZIL write throughput is limited to the checksumming throughput of a single CPU, which may be a problem in some configurations. Before my work around 2.2 we were limited by the memory copy throughput of a single CPU (which is sometimes not brilliant), because LWB population was done under the ZIL lock. I fixed the locking back then, so we no longer have that lock contention. But looking at this now, I suspect this ZIO serialization might be the next throughput limitation for synchronous writes, unless I am wrong somehow in all of the above.
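To make the throughput concern concrete, here is a toy model in C. It is not OpenZFS code: the `toy_` types and functions are hypothetical, time is counted in abstract units, and the parallel case assumes one free CPU per LWB. It only contrasts the finish time when each LWB checksum must wait for the previous one with the finish time when the checksums are independent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; none of these names exist in OpenZFS. */
typedef struct {
	unsigned cksum_cost;	/* time units needed to checksum this LWB */
	unsigned finish;	/* filled in: time at which its checksum is done */
} toy_lwb_t;

/*
 * Chained case: LWB i may not start its checksum until LWB i-1 has
 * finished, so the costs add up on what is effectively one CPU.
 */
unsigned
toy_chained_finish(toy_lwb_t *lwbs, size_t n)
{
	unsigned now = 0;

	assert(lwbs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		now += lwbs[i].cksum_cost;
		lwbs[i].finish = now;
	}
	return (now);
}

/*
 * Independent case: every LWB starts at time 0 on its own CPU
 * (assumption: at least n CPUs are free), so the total is the maximum.
 */
unsigned
toy_parallel_finish(toy_lwb_t *lwbs, size_t n)
{
	unsigned last = 0;

	assert(lwbs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		lwbs[i].finish = lwbs[i].cksum_cost;
		if (lwbs[i].finish > last)
			last = lwbs[i].finish;
	}
	return (last);
}
```

With three LWBs that each cost 4 units, the chained model finishes at 12 and the independent model at 4, which is the gap the dependency would impose if it works as suspected above.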
-
Hi,

Thanks very much for the reply. I have some minor questions (I am using the version with tag: zfs-2.2.4).

(1) Can you clarify what is the difference between:

(2) I clearly see in the ZIL codebase that when the

Copying the explanatory comment from the code in the file:

Is my understanding here correct in your opinion?
-
Hello,

When computing the ZIL block checksum (using `zio_checksum_compute`, specifically when `checksum == ZIO_CHECKSUM_ZILOG2`, which eventually calls `abd_fletcher_4_native`), does this process always occur in a single-threaded context? For example, if we need to flush three log write blocks (LWBs), resulting in the creation of three ZIO events with ordering constraints (due to parent/child dependencies enforced by the ZIL codebase, `zil_lwb_write_issue`), are these tasks executed sequentially, or in a pipelined manner that maintains the dependencies? Or can the checksums of dependent blocks be computed in parallel, allowing, say, the second block's checksum to finish before the first?

I understand there are pipeline stalls to ensure that child nodes complete before their parents (e.g., VDEV I/O start/done/assess). However, it is unclear whether the checksum computation respects this ordering (that is, whether it has pipeline stalls).

The reason I am asking is that I am building a more robust ZIL chain where each ZIL block depends on the checksum of the previous block. I am implementing that using a structure for global state in the `zio_checksum.c` file. My implementation assumes that by the time we calculate the checksum of block i, the checksums for block i-1 and all preceding blocks have already been computed. Is this assumption correct?
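To illustrate the assumption being asked about, here is a minimal sketch in C of a chained checksum that checks the ordering explicitly instead of relying on it silently. It is not the implementation described above and not an OpenZFS API: the `toy_` names, the sequence counter, and the idea of seeding the running sums with the previous block's checksum are all assumptions made for illustration. Only the inner loop follows the usual Fletcher-4 recurrence over 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; nothing here is taken from zio_checksum.c. */
typedef struct {
	uint64_t w[4];
} toy_cksum_t;

typedef struct {
	uint64_t next_seq;	/* sequence number of the next expected block */
	toy_cksum_t prev;	/* checksum of the last block in the chain */
} toy_chain_t;

/* Fletcher-4 style running sums over 32-bit words, starting from seed. */
toy_cksum_t
toy_fletcher4_seeded(const uint32_t *words, size_t nwords, toy_cksum_t seed)
{
	uint64_t a = seed.w[0], b = seed.w[1], c = seed.w[2], d = seed.w[3];

	for (size_t i = 0; i < nwords; i++) {
		a += words[i];
		b += a;
		c += b;
		d += c;
	}
	toy_cksum_t out = { { a, b, c, d } };
	return (out);
}

/*
 * Checksum block seq, seeded with the checksum of block seq-1.
 * Returns 0 on success, or -1 if the block arrives out of order, which
 * is exactly the case the ordering assumption says cannot happen.
 */
int
toy_chain_checksum(toy_chain_t *chain, uint64_t seq,
    const uint32_t *words, size_t nwords, toy_cksum_t *out)
{
	assert(chain != NULL && out != NULL);
	if (seq != chain->next_seq)
		return (-1);
	*out = toy_fletcher4_seeded(words, nwords, chain->prev);
	chain->prev = *out;
	chain->next_seq++;
	return (0);
}
```

A check like the `-1` path would at least turn a violated ordering assumption into a visible error during testing. If the checksum stage can run concurrently on several threads, shared state of this kind would also need a lock, which the sketch leaves out.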