stat: fix duplicate timestamps in iops/latency/bandwidth logs #982


Closed
pzoxiuv wants to merge 1 commit

Conversation

@pzoxiuv commented May 12, 2020

add_log_sample() checks if the current time is greater than the last log time
plus the sampling window duration ("log_avg_msec"). If so it records the log
entry and sets the "last log time" to the current "last log time" plus
log_avg_msec. This is incorrect in cases where more than log_avg_msec has
elapsed between calls to add_log_sample(). The effect is that the very next
time add_log_sample() is called a new log entry will be recorded, possibly with
the same timestamp as the previous log entry.

This fix sets the last log time to the current time, rounded down to the nearest
multiple of log_avg_msec (e.g., if the current time is 203 and log_avg_msec is 50,
the last log time will be set to 200, so the next log entry should be recorded at
time=250). This prevents duplicate log entries.

Fixes #971

@sitsofe self-assigned this May 13, 2020

@sitsofe (Collaborator) commented Jun 9, 2020

I don't know if this is sufficient... Don't we have to clamp all sample period starting points downwards including the first one? I'm also wondering when fio handles its own stats whether it accounts for periods where no sampling actually took place (although this isn't an issue for the log generation)...

@pzoxiuv (Author) commented Jun 29, 2020

Don't we have to clamp all sample period starting points downwards including the first one?

I'm not sure what you mean by this?

@sitsofe (Collaborator) commented Jun 30, 2020

I mean, shouldn't we align the very first sample downwards relative to the starting epoch? If the first second has nothing in it then that is somehow significant...

@ghost commented Nov 5, 2020

I believe this change, along with the changes I have proposed in #1114, has resolved the issues we've been seeing with the status interval creep. Could you also review PR #1114 and comment on the changes?

@sitsofe (Collaborator) commented Nov 29, 2020

@pzoxiuv Do you think this one is going to be solved by #1119 which essentially absorbs your fix?

@sitsofe (Collaborator) commented Dec 4, 2020

Closing because this has been superseded by #1119.

@sitsofe closed this Dec 4, 2020
Linked issue: Repeated timestamps in latency log files
3 participants