-
Notifications
You must be signed in to change notification settings - Fork 1.1k
enable getSamples and getRow in shrex #4288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
52d1c79 to
079288a
Compare
a04fc0f to
098e1f8
Compare
8f00af9 to
b91fa09
Compare
75cd1cc to
24a414f
Compare
031a56b to
1777240
Compare
11d56ef to
d7e3475
Compare
d7e3475 to
92d502f
Compare
92d502f to
5ef2686
Compare
0d2a5e1 to
59f3d3f
Compare
|
blocked on #4696 |
| _, err := getter.GetSamples(ctx, eh, coords) | ||
| require.Error(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to asset if partial response
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current implementation does not support both samples, error. I think, such response are very confusing for the users. Also, error means, that we've requested ALL(or almost all) peers from the peer manager and they have responded with the error.
Made a mistake, naming this test case. My original idea, was to cover the case when request contains both valid and invalid cords, so the getter returns error only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You retry all samples at once and if one failed the attempt will retry all again. Instead client should retry only one failed sample and if one sample is unavailable return those that are available so on next attempts they not longer requested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You retry all samples at once and if one failed the attempt will retry all again
if you are talking about the client, then NO bc execute request contains an infinite for loop that will try to request a sample from an available peer in PeerManager.
So, the request(of a single sample) fails only if context was canceled or deadline exceeded. Do you want me to return a partial response in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. need to return succesful samples similar to existing bitswap getter implementation
| return nil, fmt.Errorf("getting rngdata from accessor: %w", err) | ||
| } | ||
|
|
||
| // TODO(@vgonkivs): This is a temporary solution that will be reworked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworked to what? What makes it remporary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom reader implementation per each container instead of a buffer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rewrite this comment or remove it , because it does not provide any information for unaware reader
c73aa81 to
ae9632a
Compare
| coords := []shwap.SampleCoords{ | ||
| {Row: 0, Col: 10}, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: -check that requesting samples with ods_size+1 return error. It will help to capture edge cases with test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should limit users of requesting ods shares only. SampleID contructor was designed to verify against EDS, so I'd suggest to keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right. I mean eds_size+1 which is edge case
| _, err := getter.GetSamples(ctx, eh, coords) | ||
| require.Error(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You retry all samples at once and if one failed the attempt will retry all again. Instead client should retry only one failed sample and if one sample is unavailable return those that are available so on next attempts they not longer requested
| return nil, fmt.Errorf("getting rngdata from accessor: %w", err) | ||
| } | ||
|
|
||
| // TODO(@vgonkivs): This is a temporary solution that will be reworked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rewrite this comment or remove it , because it does not provide any information for unaware reader
Co-authored-by: rene <[email protected]>
7efeaa5 to
6f336f8
Compare
| func (rngdata *RangeNamespaceData) WriteTo(writer io.Writer) (int64, error) { | ||
| pbrngData := rngdata.ToProto() | ||
|
|
||
| b, err := pbrngData.Marshal() | ||
| if err != nil { | ||
| return 0, err | ||
| } | ||
| l, err := writer.Write(b) | ||
| return int64(l), err | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From private discussion: Need to split into rows and sent them one by one similar to ND data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it requires changing in proto definition that will break the protocol. Current approach also works. Also NS data is a []RowNsData and RangeNsData contains a set of shares + 2 proofs(that can be empty)
|
|
||
| // AccessorGetter abstracts storage system that indexes and manages multiple eds.AccessorGetter by | ||
| // network height. | ||
| type AccessorGetter interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love this iface for one implementation but i'm also not gonna die on hill
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know but I've added it to be able to properly test some edge cases
What was done:
GetSamples,GetRow, andGetRangeNamespaceData(previously returnedErrOperationNotSupported)getters_shrex_attempts_per_requestmetric that usesrequest_typelabelsexecuteRequestmethod that handles retry logic, peer management, and verification for all request types.