-
Notifications
You must be signed in to change notification settings - Fork 636
Various fixes related to makePromise() pointing behind EOF #4266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
If indices passed to makePromise() are invalid (e.g. point behind EOF) and this is detected during pushArea(), don't attempt to perform any parsing with a subparser and discard the promise completely.
When a position passed to mio_seek() is behind EOF, set the internal pointer to the position corresponding to EOF. This is how normal fseek() behaves on my machine (probably because it first has to read the necessary amount of bytes from the stream to get to the right position). This could help in situations when the return value of mio_seek() isn't checked and EOF position corresponds "more closely" to the (invalid) position passed as the parameter instead of staying at the same position. Note that when using negative offset and the result is before the beginning of the file, fseek() doesn't move its internal position and stays at the same position (probably because it already knows how many bytes have been read). So this case hasn't been updated in this patch (though it might be worth considering).
Note that the MIO patches are more or less a "bonus" and not really essential. I think, regardless of what the exact problem of the offset miscalculation is, |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4266 +/- ##
==========================================
- Coverage 85.95% 85.95% -0.01%
==========================================
Files 246 246
Lines 63434 63445 +11
==========================================
+ Hits 54525 54532 +7
- Misses 8909 8913 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
See also #4267 that adds a simpler version of the test case mentioned above. It seems that any CRLF that triggers a sub-parser will reveal the issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this change is not a good idea. I see a few issues, which basically stem from the C standard.
First, the C standard does not specify this behavior. Quoting C17 N2176:
Description
The
fseek
function sets the file position indicator for the stream pointed to bystream
. If a read or
write error occurs, the error indicator for the stream is set andfseek
fails.For a binary stream, the new position, measured in characters from the beginning of the file, is
obtained by addingoffset
to the position specified bywhence
. The specified position is the
beginning of the file ifwhence
isSEEK_SET
, the current value of the file position indicator if
SEEK_CUR
, or end-of-file ifSEEK_END
. A binary stream need not meaningfully supportfseek
calls
with awhence
value ofSEEK_END
.For a text stream, either
offset
shall be zero, oroffset
shall be a value returned by an earlier
successful call to theftell
function on a stream associated with the same file andwhence
shall be
SEEK_SET
.After determining the new position, a successful call to the
fseek
function undoes any effects of the
ungetc
function on the stream, clears the end-of-file indicator for the stream, and then establishes
the new position. After a successfulfseek
call, the next operation on an update stream may be
either input or output.
With After determining the new position, a successful call to the fseek
function […] then establishes
the new position, one could even read that it should not change the position if the call is not successful (and can it be if it's not actually honored?).
But at any rate, as there is nothing specifying this behavior, it can't be relied upon for FILE-based streams, and thus I don't see any value of changing the behavior for memory streams, it just adds a more or less random behavior for a specific case. If it's not standard, the caller shouldn't rely on it, or it's gonna be brittle.
Additionally, it's not something my libc does. I doesn't behave like you suggest (here I have glibc 2.36):
Using the very crude
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char **argv)
{
FILE *fp = fopen(argv[1], "rb");
if (! fp)
return 1;
long off = atol(argv[2]);
fprintf(stderr, "initial position: %ld\n", ftell(fp));
fprintf(stderr, "seeking to %ld: %d\n", off, fseek(fp, off, SEEK_SET));
fprintf(stderr, "ftell(): %ld\n", ftell(fp));
fprintf(stderr, "seeking to +32: %d\n", fseek(fp, 32, SEEK_CUR));
fprintf(stderr, "ftell(): %ld\n", ftell(fp));
fprintf(stderr, "seeking to end: %d\n", fseek(fp, 0, SEEK_END));
fprintf(stderr, "ftell(): %ld\n", ftell(fp));
fclose(fp);
return 0;
}
and running it:
$ ./seek.out seek.c 1000
initial position: 0
seeking to 1000: 0
ftell(): 1000
seeking to +32: 0
ftell(): 1032
seeking to end: 0
ftell(): 613
You can clearly see that it sought (hehe, seek()
ed) past the end: when seeking to SEEK_END
, I get 613, but if I seek to 1000 I get that 1000 (which is then past the end), and I can even further SEEK_CUR
happily.
So in the end I'd rather not have this change any internal on failure, and leave it to the caller. Originally MIO was designed to mimic as closely as possible the equivalent C library calls, not adding any fanciness anywhere it, and I don't think it should deviate to what can be expected from the C library.
At some point it had tests to compare against the C library itself, maybe we should re-introduce this in uctags -- not that it would catch anything here, as the test would have not to depend on unspecified behavior, but we could verify it's not diverging on specified behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it was just an idea inspired by the fact that ctags uses mio_seek() and mio_setpos() without any checks of the returned value and that I thought that in the error cases it would be better to be "closer" to the provided value. The right thing of course is to properly check the return values and recover accordingly.
Let's discard this patch.
Apart from the remark above, looks reasonable. I'll have to test it properly with a clean head though, it's 1:20am here and I'm not fresh anymore -- but what's left of my brain like the other changes 🙂 |
… behind EOF" This reverts commit 1816905.
I've added a few more checks inside pushArea() - I believe all the |
For more info, see the individual commit messages and the discussion in geany/geany#4303
To reproduce the crash, use geany/geany@79fb415 from geany/geany#4303. Note that to reproduce this issue, it has to use CRLF line endings (might get lost when copy/pasting from the github web interface).
The issue can be reproduced using
ctags --extras=+rg -o- wp-guest.php
.