-
Notifications
You must be signed in to change notification settings - Fork 1.4k
verify: Support for verifying numberio in read-only verify phase
#2023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
If --io_size > --size, not only just rw workloads, but write-only workloads should also be wrapped around to the initial point. This patch removed `td_rw(td)` checking to also support write-only workloads. If wrap-around to the previously written area, we should consider the overlap risks, so put this condition to `fio_offset_overlap_risk()` also to log io pieces into the rb-tree rather than flist to keep the latest io pieces. Signed-off-by: Minwoo Im <[email protected]>
fio.h
Outdated
| td->o.ddir_seq_add || (td->o.ddir_seq_nr > 1) || | ||
| td->o.io_size > td->o.size) | ||
| td->o.io_size > td->o.size || | ||
| td->o.verify_write_sequence) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the right approach. Fio should reason about the workload to determine whether there is an overlap risk. Whether there is a risk or not should not be independently affected by the verify_write_sequence option that the user can set.
Please fully explicate the reasoning for this change (what type of workloads require this?) and then try to come up with a way to accommodate these workloads that does not involve verify_write_sequence determining overlap risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the right approach. Fio should reason about the workload to determine whether there is an overlap risk. Whether there is a risk or not should not be independently affected by the
verify_write_sequenceoption that the user can set.
Indeed, I think td->o.io_size > td->o.size is enough to detect the workload overlap since if so, same offset I/Os might be overlapped.
Please fully explicate the reasoning for this change (what type of workloads require this?) and then try to come up with a way to accommodate these workloads that does not involve
verify_write_sequencedetermining overlap risk.
(will add the following descriptions to the commit)
Let's say we have a workload writing offsets more than two times (overwrite same offsets). In this case, without numberio verification in the verify header, we can't simply verify the latest written data has been flushed to the media or not. If device returns the previous data even if the latest data has been written successfully, fio verification can't simply detect the data is the latest. To verify the time-considered data integrity in the --verify_only=1 workload, we should turn on the verify_write_sequence even in --verify_only=1 case which overrides the verify_write_sequence to 0.
The following write phases is the example:
write (offset=0, length=4096) -> numberio = 0
write (offset=4096, length=4096) -> numberio = 1
... X number of writes
write (offset=0, length=4096) -> numberio = (X+1) + 1
write (offset=4096, length=4096) -> numberio = (X+1) + 2
After this, verification phase should verify the offset 0 and 4096 as the latest data with numberio. In this case, --verify_only=1 workload can simply read a single lap with the latest data.
This patch is for this kind of workload verification.
|
Wow I happened to be looking at this via #1970 just yesterday. My concern is that this isn't going to work for jobs where the initial job was somehow extended past a single pass without using |
Have not seen #1970, but yes, it seems like the same issue which I would like to resolve (old vs. new)
|
Ah good point but I still think the jobs I posted will fail with your PR... |
@sitsofe , Do you mean verify failure? I ran your script and offset=0x0, 0x1000, 0x2000, 0x0 are written in the given time and read phase reads offset=0x0, 0x1000, 0x2000 with the latest numberio without any verify failure. Write phase Read phase Am I missing something here? [updated] I see. verify phase does not detect the overlap risks so that |
In read-only verify phase, `numberio` in the verify header could not be checked since read-only workload can't simply know the certain sequence of the write commands. But, due to Commit 466a2e6 ("backend: fix zero-numberio in dry-run"), dry-run, which is a pre-session before the actual verify phase to build up read @io_u instances, now builds @io_u's numberio properly so that verify phase simply compare the `numberio` in the read block's verify header. Signed-off-by: Minwoo Im <[email protected]>
cc5ceaf to
36428f2
Compare
|
@sitsofe , thw following patch made:
This is not a clean-state patch, definitely need to revisit: diff --git a/backend.c b/backend.c
index e09c210a..f205ea13 100644
--- a/backend.c
+++ b/backend.c
@@ -1686,10 +1686,14 @@ static int exec_string(struct thread_options *o, const char *string,
*/
static uint64_t do_dry_run(struct thread_data *td)
{
- td_set_runstate(td, TD_RUNNING);
+ struct thread_io_list *s = td->vstate;
+ uint64_t target_numberio = s->numberio;
+
+ td_set_runstate(td, TD_DRY_RUNNING);
while ((td->o.read_iolog_file && !flist_empty(&td->io_log_list)) ||
- (!flist_empty(&td->trim_list)) || !io_complete_bytes_exceeded(td)) {
+ (!flist_empty(&td->trim_list)) || !io_complete_bytes_exceeded(td) ||
+ td->io_issues[DDIR_WRITE] < target_numberio) {
struct io_u *io_u;
int ret;
@@ -1723,6 +1727,8 @@ static uint64_t do_dry_run(struct thread_data *td)
(void) ret;
}
+ td_set_runstate(td, TD_RUNNING);
+
return td->bytes_done[DDIR_WRITE] + td->bytes_done[DDIR_TRIM];
}
diff --git a/fio.h b/fio.h
index 1fda07f8..efa57860 100644
--- a/fio.h
+++ b/fio.h
@@ -713,6 +713,7 @@ enum {
TD_INITIALIZED,
TD_RAMP,
TD_SETTING_UP,
+ TD_DRY_RUNNING,
TD_RUNNING,
TD_PRE_READING,
TD_VERIFYING,
diff --git a/io_u.c b/io_u.c
index 2d5986ec..91786ccb 100644
--- a/io_u.c
+++ b/io_u.c
@@ -390,6 +390,7 @@ static int get_next_seq_offset(struct thread_data *td, struct fio_file *f,
}
}
+retry:
if (f->last_pos[ddir] < f->real_file_size) {
uint64_t pos;
@@ -430,6 +431,10 @@ static int get_next_seq_offset(struct thread_data *td, struct fio_file *f,
*offset = pos;
return 0;
+ } else if (td->runstate == TD_DRY_RUNNING) {
+ f->last_pos[ddir] = f->file_offset;
+ loop_cache_invalidate(td, f);
+ goto retry;
}
return 1;
diff --git a/iolog.c b/iolog.c
index dcf6083c..27e393a4 100644
--- a/iolog.c
+++ b/iolog.c
@@ -306,7 +306,7 @@ void log_io_piece(struct thread_data *td, struct io_u *io_u)
* the rb insert/lookup for handling. Sort writes if we have offset
* modifier which can also create duplicate blocks.
*/
- if (!fio_offset_overlap_risk(td)) {
+ if (!fio_offset_overlap_risk(td) && td->runstate != TD_DRY_RUNNING) {
INIT_FLIST_HEAD(&ipo->list);
flist_add_tail(&ipo->list, &td->io_hist_list);
ipo->flags |= IP_F_ONLIST;
diff --git a/libfio.c b/libfio.c
index 57f3f858..7570d695 100644
--- a/libfio.c
+++ b/libfio.c
@@ -209,7 +209,7 @@ static const char *td_runstates[] = {
const char *runstate_to_name(int runstate)
{
- compiletime_assert(TD_LAST == 12, "td runstate list");
+ compiletime_assert(TD_LAST == 13, "td runstate list");
if (runstate >= 0 && runstate < TD_LAST)
return td_runstates[runstate]; |
vincentkfu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first change to fio_offset_overlap_risk looks good to me.
The second change probably is ok as well. However, I'm not comfortable letting it in because get_next_seq_offset() has to handle many different situations and we have very little testing for multiple files, holed IO, backwards IO, etc.
|
For now I think it's possible to accomplish the aim of this PR by explicitly setting |
Add support for veryfing latest
numberioin the verify header read from written offsets.Previously, in the offline verification, read phase has not supported
numberiocompare sinceit's not that simple to figure out the latest
numberios to the offsets. This patch allowsverifying latest written verify header based on
numberio.Available test cases are the below: