Flush RRD only when TXGs contain data #18138
amotin
left a comment
I am not 100% sure dp_dirty_pertxg at this point reliably means there is nothing to be written in this TXG. It may need a deeper look. But yea, this might be the direction.
amotin
left a comment
I don't have other objections, but I still worry about the already mentioned dp_dirty_pertxg. For example, will snapshot creation or something similar, working in sync context, trigger the history update?
dp_dirty_pertxg is set by dsl_pool_dirty_space(), so this should be a reasonable way to quickly check for any dirty data associated with the txg. I believe you're right, it won't account for anything dirtied in syncing context, but since this is best-effort I don't think that needs to hold this up.
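To illustrate the point about per-TXG dirty accounting, here is a minimal standalone sketch. It is not the OpenZFS code: `pool_model_t`, `MODEL_TXG_SIZE`, and the `pool_model_*` functions are hypothetical names, and the ring of four slots indexed by `txg & mask` is an assumption modeled loosely on how `dp_dirty_pertxg` is described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants: a small ring of per-TXG counters. */
#define	MODEL_TXG_SIZE	4
#define	MODEL_TXG_MASK	(MODEL_TXG_SIZE - 1)

/* Hypothetical stand-in for the pool; only the dirty counters are modeled. */
typedef struct pool_model {
	uint64_t dirty_pertxg[MODEL_TXG_SIZE];
} pool_model_t;

/* Open context: account for bytes dirtied in the given TXG. */
static void
pool_model_dirty_space(pool_model_t *p, uint64_t txg, uint64_t space)
{
	p->dirty_pertxg[txg & MODEL_TXG_MASK] += space;
}

/* Bytes that have been written out no longer count as dirty. */
static void
pool_model_undirty_space(pool_model_t *p, uint64_t txg, uint64_t space)
{
	assert(space <= p->dirty_pertxg[txg & MODEL_TXG_MASK]);
	p->dirty_pertxg[txg & MODEL_TXG_MASK] -= space;
}

/*
 * Cheap check: does this TXG have any accounted dirty data?
 * Changes that never pass through pool_model_dirty_space()
 * (for example, work done purely in syncing context) are invisible here.
 */
static bool
pool_model_txg_has_dirty_data(const pool_model_t *p, uint64_t txg)
{
	return (p->dirty_pertxg[txg & MODEL_TXG_MASK] != 0);
}
```

In this model, a change made only in syncing context never touches the counter, so the check reports the TXG as empty. That is the gap raised in the review, and it is tolerable only because the RRD is best-effort.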
I think a better way to do it would be to call
Agreed, that would be better. @oshogbo can you look at reworking this?
Actually, let me go ahead and merge this fix as is. It's been tested and resolves the core issue for now. We can refactor it as suggested by @amotin in a future PR to further improve things.
This change modifies the behavior of spa_sync_time_logger when flushing the RRD database.

Previously, once the sync interval elapsed, a flush would always be generated. On solid-state devices, especially when the pool was otherwise idle, this caused disks to wake up solely to write RRD data. Since RRD is best-effort telemetry, this behavior is unnecessary and wasteful.

With this change, spa_sync_time_logger delays flushing until a TXG that already contains data is being synced. The RRD update is appended to that TXG instead of forcing the creation of a new write-only TXG.

During pool export, flushing is forced regardless of whether the TXG contains user data. At that stage, data durability takes precedence and a write must be issued.

Sponsored by: [Wasabi Technology, Inc.; Klara, Inc.]
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Closes openzfs#18082
Closes openzfs#18138
Description
This change modifies the behavior of spa_sync_time_logger when flushing the RRD database.
Previously, once the sync interval elapsed, a flush would always be generated. On solid-state devices, especially when the pool was otherwise idle, this caused disks to wake up solely to write RRD data. Since RRD is best-effort telemetry, this behavior is unnecessary and wasteful.
With this change, spa_sync_time_logger delays flushing until a TXG that already contains data is being synced. The RRD update is appended to that TXG instead of forcing the creation of a new write-only TXG.
During pool export, flushing is forced regardless of whether the TXG contains user data. At that stage, data durability takes precedence and a write must be issued.
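The decision described above can be sketched as a small standalone model. This is an illustration under stated assumptions, not the actual `spa_sync_time_logger` implementation: `rrd_model_t`, `rrd_model_should_flush`, and the parameter names are hypothetical, and `txg_dirty_bytes` stands in for whatever per-TXG dirty indicator the real code consults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; these are not real spa_t fields. */
typedef struct rrd_model {
	uint64_t last_flush_sec;	/* time of the last RRD flush */
	uint64_t flush_interval_sec;	/* minimum time between flushes */
} rrd_model_t;

/*
 * Decide whether the RRD should be written with the TXG being synced.
 * txg_dirty_bytes: assumed dirty-byte count for that TXG.
 * exporting: true while the pool is being exported.
 */
static bool
rrd_model_should_flush(rrd_model_t *m, uint64_t now_sec,
    uint64_t txg_dirty_bytes, bool exporting)
{
	assert(now_sec >= m->last_flush_sec);

	/* Export: always flush, even into an otherwise empty TXG. */
	if (exporting) {
		m->last_flush_sec = now_sec;
		return (true);
	}

	/* Interval has not elapsed yet: nothing to do. */
	if (now_sec - m->last_flush_sec < m->flush_interval_sec)
		return (false);

	/* Interval elapsed but the TXG is empty: keep waiting. */
	if (txg_dirty_bytes == 0)
		return (false);

	/* Interval elapsed and the TXG carries data: piggyback on it. */
	m->last_flush_sec = now_sec;
	return (true);
}
```

With an arbitrary interval of 600 seconds, an idle pool at t=700 returns false and causes no write; the flush happens on the first later sync that has dirty data, or immediately on export.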
This fixes #18082
This change was inspired by @amotin's comments in #18120.
Sponsored by: [Wasabi Technology, Inc.; Klara, Inc.]
How Has This Been Tested?
I added logging to check when the database is flushed and how large the database is at that point.
Types of changes
Checklist:
Signed-off-by.