Make RandomAccess.Read*|Write* methods work with non-seekable files #96711
Tagging subscribers to this area: @dotnet/area-system-io
Issue Details: fixes #58381
ex = await Assert.ThrowsAsync<TaskCanceledException>(() => RandomAccess.ReadAsync(writeHandle, GenerateVectors(1, 1), 0, token).AsTask());
Assert.Equal(token, ex.CancellationToken);
ex = await Assert.ThrowsAsync<TaskCanceledException>(() => RandomAccess.WriteAsync(writeHandle, GenerateReadOnlyVectors(1, 1), 0, token).AsTask());
Assert.Equal(token, ex.CancellationToken);
Can't you avoid the duplicated calls by doing something like this?
Task t = RandomAccess.WriteAsync(writeHandle, GenerateReadOnlyVectors(1, 1), 0, token).AsTask();
Assert.True(t.IsCanceled);
ex = await Assert.ThrowsAsync<TaskCanceledException>(() => t);
Assert.Equal(token, ex.CancellationToken);
int bytesRead = 0;
do
{
bytesRead += RandomAccess.Read(readHandle, buffer.Slice(bytesRead), fileOffset: 0); // fileOffset NOT set to bytesRead on purpose
Would it be good to throw or break out of the loop if RandomAccess.Read returns 0 to avoid a potential infinite loop?
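A minimal sketch of the guard suggested above, assuming a hypothetical `ReadHelpers.ReadExactly` helper and a hypothetical `ReadChunk` delegate (neither is taken from the PR). The read call is passed in as a delegate so the loop itself can be exercised without a real handle; the loop throws as soon as a read returns 0 instead of spinning forever.

```csharp
using System;
using System.IO;

// Hypothetical delegate: performs one read into `destination` and returns the byte count.
public delegate int ReadChunk(Span<byte> destination);

public static class ReadHelpers
{
    // Hypothetical helper: fills `buffer` completely or throws.
    public static void ReadExactly(ReadChunk read, Span<byte> buffer)
    {
        int bytesRead = 0;
        while (bytesRead < buffer.Length)
        {
            int n = read(buffer.Slice(bytesRead));
            if (n == 0)
            {
                // A zero-byte read means the other end is closed (or EOF was reached).
                // Without this check the loop would never terminate.
                throw new EndOfStreamException(
                    $"Expected {buffer.Length} bytes but the source ended after {bytesRead}.");
            }
            bytesRead += n;
        }
    }
}
```

In the test from the diff this could be called as `ReadHelpers.ReadExactly(span => RandomAccess.Read(readHandle, span, fileOffset: 0), buffer.AsSpan(0, content.Length));`, keeping `fileOffset` at 0 as the original loop does.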
ReadExactly(readHandle, buffer, content.Length); // what is required for the above write to succeed

Assert.Equal(content, buffer.AsSpan(0, content.Length).ToArray());
I thought we always used AssertExtensions.SequenceEqual for array comparisons.
byte[] buffer = new byte[content.Length * 2];
int readFromOffset456 = RandomAccess.Read(readHandle, buffer, fileOffset: 456);

Assert.InRange(readFromOffset456, 1, content.Length);
Is the InRange used because it is not right to assume that Read will fill the whole buffer?
{
byte[] content = RandomNumberGenerator.GetBytes(BufferSize);
Task writeToOffset123 = RandomAccess.WriteAsync(writeHandle, content, fileOffset: 123).AsTask();
byte[] buffer = new byte[content.Length * 2];
Why is read buffer's size double the size of content?
readOnlyVectors.SelectMany(vector => vector.ToArray()).Take(byteCount),
writableVectors.SelectMany(vector => vector.ToArray()).Take(byteCount));
}
}
I think you are missing tests for Get/SetLength.
{
// We need to fallback to the non-offset version for certain file types
// e.g: character devices (such as /dev/tty), pipes, and sockets.
Interop.ErrorInfo errorInfo = Interop.Sys.GetLastErrorInfo();
Would it be easy for a future refactor to introduce something that resets last error before this is called? Maybe last error should be passed in so it's visible.
I assume this PR is about bringing the capability to pass multiple buffers to a read/write with a non-seekable handle? I think it would make sense to add new methods that don't take an offset argument. Rather than trying to handle it for the user, I think we can expect the user not to use non-seekable handles with methods that take an offset. We don't need to try to detect and handle it. @adamsitnik what do you think?
The multiple buffers is a secondary goal, the main one is to allow users to read/write to/from non-seekable files by using only
Yes and no. The main blocker to me is... the name of the type: So we have 3 options right now:
@stephentoub @jozkee what are your preferences here?
I still prefer (1). We already found it valuable to be able to do that internally, which is why the single buffer methods already support this for internal use.
I think the The typical user of this API will have a handle that he means to use in a seekable way or not. I prefer (2) because if he does mean to use it in a seekable way, that is clear from the API he is using. What I dislike about (1) is that the API takes an offset parameter and if the handle turns out to be non-seekable, it is just ignored.
Could a middle-ground be to throw if offset is != 0 on non-seekable handles?
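The middle ground proposed above could be sketched roughly as follows. `OffsetValidation.ValidateOffset` and its `canSeek` parameter are hypothetical names used only for illustration; the choice of `NotSupportedException` is also an assumption, not something the PR specifies.

```csharp
using System;

public static class OffsetValidation
{
    // Hypothetical check: `canSeek` stands in for however the implementation
    // would determine whether the handle is seekable.
    public static void ValidateOffset(bool canSeek, long fileOffset)
    {
        if (fileOffset < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(fileOffset));
        }

        if (!canSeek && fileOffset != 0)
        {
            // Instead of silently ignoring the offset, reject it.
            throw new NotSupportedException(
                "A non-zero offset was passed for a non-seekable handle.");
        }
    }
}
```

Under such a rule, a read loop that advances the offset on each iteration would throw on its second call against a pipe, rather than having the offset ignored.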
await Task.WhenAll(server.WaitForConnectionAsync(), client.ConnectAsync());

bool isAsync = (PipeOptions & PipeOptions.Asynchronous) != 0;
return (GetFileHandle(server, isAsync), GetFileHandle(client, isAsync));
On Unix, the handles are set to non-blocking when they are used in an async operation.
Because these handles weren't used yet, the handles are still blocking when PipeOptions.Asynchronous is set in RandomAccess_NonSeekable_AsyncHandles.
If the handles were non-blocking, some of the reads/writes should fail with EAGAIN.
When making the handles non-blocking by adding:
if (isAsync)
{
await client.WriteAsync(new byte[1]);
await server.ReadAsync(new byte[1]);
}
6 of the RandomAccess_NonSeekable_AsyncHandles tests fail with:
Error Message:
System.IO.IOException : The process cannot access the file because it is being used by another process.
This is the EAGAIN that is surfacing.
This is the EAGAIN that is surfacing.
Great catch! In that case, we should most likely do a bigger refactor and use the epoll code path here.
We had some discussions in the past of making the epoll path more generic and usable elsewhere.
So far we've avoided the effort and used it as-is (like in #44647).
That's reasonable.
The problem is that sometimes you don't expect a given file to be non-seekable (you get a path and you don't know exactly what it is), and the following code could fail:
int index = 0;
int count = requestedSize;
byte[] bytes = new byte[count];
while (count > 0)
{
int n = RandomAccess.Read(sfh, bytes.AsSpan(index, count), index);
index += n;
count -= n;
}
Perhaps the situation is different on Windows, but in contrast to using |
From my perspective:
I didn't mean to suggest that
Yes, these APIs avoid a user having to create a class. My comment was meant to challenge if this is much of a problem.
Non-seekable handles are limited compared to seekable handles. I think it would be good if the API shows that it will work with non-seekable handles. We're blurring this distinction by making users pass an offset which they should expect to be ignored.
@adamsitnik, what are you planning to do with this PR? Do we want it in .NET 9?
I wanted to have it merged for 9, but as @tmds has pointed out in #96711 (comment), the current implementation is not going to work for async handles on Linux. Making it work is possible, but would require a lot of work (mostly refactoring the epoll-related code and re-using it here). On the other hand, we could just document this limitation as currently not supported and call it a day. @stephentoub what is your opinion?
Today the methods don't work with any non-seekable file descriptors. With this change, as far as I can tell we're making the existing methods work with more file descriptors and without any take backs. Even if there are still more beyond that we could support in the future, this seems to me like moving in the right direction. Up to you what you want to do with it.
fixes #58381