feat: EgressTrackingService #246
Conversation
force-pushed from dc58aa6 to 943fbaf
force-pushed from 15a46e8 to 93d4761
After seeing this used in practice, here's an idea for the interface/design of the batch store, which I suggest we rename to "Journal":
```go
type Journal interface {
	// Append adds an entry to the journal
	Append(ctx context.Context, entry []byte) error
	// List all completed (rotated) segments
	ListSegments(ctx context.Context) ([]SegmentInfo, error)
	// Get a completed segment by its CID
	GetSegment(ctx context.Context, cid cid.Cid) (io.ReadCloser, error)
	// Delete a segment by its CID
	DeleteSegment(ctx context.Context, cid cid.Cid) error
}

type SegmentInfo struct {
	CID        cid.Cid
	Size       int64
	EntryCount int
	CreatedAt  time.Time
	FinishedAt time.Time
}

// elsewhere....
journal := NewFSJournal(path, Options{
	MaxSize: 100 * MB,
	OnRotation: func(ctx context.Context, segment SegmentInfo) error {
		// Queue for egress service
		return jobQueue.Enqueue(ctx, segment)
	},
})
```

Suggesting this with the goal of getting to a more generalized implementation.
Currently the `Append` operation forces callers to deal with rollover/notification events, which they may not always want to do.
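With the proposed interface, appending from the caller's side reduces to a single call (a minimal sketch; `receiptBytes` is a placeholder for the encoded receipt):

```go
// Callers only append; rotation and notification happen inside the journal.
if err := journal.Append(ctx, receiptBytes); err != nil {
	return fmt.Errorf("appending receipt: %w", err)
}
```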
`OnRotation` defines the callback that fires when rotation happens (like Alan's suggestion elsewhere). For the case at hand this means pushing to the queue, but for other use cases it could be something else.
Once the job is processed from the queue and the etracker becomes aware of it, it can fetch the segment with `GetSegment` as we do now. Later, we can call (and implement) `DeleteSegment` for GC. `ListSegments` becomes our window for debugging/monitoring.
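A hedged sketch of the consumer side once a job is dequeued (`Tracker` and `processSegment` are illustrative names, not existing APIs):

```go
// Tracker stands in for the egress tracking client (hypothetical).
type Tracker interface {
	Track(ctx context.Context, segment io.Reader) error
}

// processSegment handles one dequeued rotation job: fetch the segment,
// send it to the tracking service, then delete it for GC.
func processSegment(ctx context.Context, j Journal, t Tracker, seg SegmentInfo) error {
	r, err := j.GetSegment(ctx, seg.CID)
	if err != nil {
		return fmt.Errorf("getting segment %s: %w", seg.CID, err)
	}
	defer r.Close()

	if err := t.Track(ctx, r); err != nil {
		return fmt.Errorf("tracking segment %s: %w", seg.CID, err)
	}

	// Safe to delete once tracked; ListSegments stays available for
	// inspecting anything that failed before this point.
	return j.DeleteSegment(ctx, seg.CID)
}
```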
```go
// non-configurable defaults
maxRetries := uint(10)
maxWorkers := uint(runtime.NumCPU())
maxTimeout := 5 * time.Second
```
When it comes time to test, specifically around failure cases, we might want these to be configurable. See `ReplicatorConfig` for an example of how this has been done elsewhere.
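For example, a `ReplicatorConfig`-style struct could expose these knobs while keeping today's values as defaults (a sketch only; `EgressTrackerConfig` is a hypothetical name):

```go
// EgressTrackerConfig makes the currently hardcoded knobs overridable in tests.
type EgressTrackerConfig struct {
	MaxRetries uint
	MaxWorkers uint
	MaxTimeout time.Duration
}

// DefaultEgressTrackerConfig returns the values hardcoded today.
func DefaultEgressTrackerConfig() EgressTrackerConfig {
	return EgressTrackerConfig{
		MaxRetries: 10,
		MaxWorkers: uint(runtime.NumCPU()),
		MaxTimeout: 5 * time.Second,
	}
}
```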
That is true when you are testing the job queue implementation; I just mocked the queue out in the service's tests.
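Roughly like this (a sketch; the real queue interface may differ):

```go
// mockQueue records enqueued segments and can simulate failures.
type mockQueue struct {
	enqueued []SegmentInfo
	err      error
}

func (q *mockQueue) Enqueue(ctx context.Context, seg SegmentInfo) error {
	if q.err != nil {
		return q.err // exercise the failure path without a real queue
	}
	q.enqueued = append(q.enqueued, seg)
	return nil
}
```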
I actually had a first implementation replicating `ReplicatorConfig` (sorry) and then ditched it when I saw the values were hardcoded anyway. At least it is more explicit this way. It will be easy to expose these values in config when needed.
Thanks for the input @frrist. Again, I'm advocating against early generalization. I think getting this to work as soon as possible is more valuable at this stage of the project, and we can always iterate if needed without assuming extra risk/effort by doing it then vs. doing it now.
force-pushed from c0daac9 to 9ba0fa9
force-pushed from fc36591 to 3bdfe6e
Let's get this merged into #245 and I'll leave any comments there.
Ref: #174

Implement an `EgressBatchStore` that allows appending `space/content/retrieve` receipts to a batch. When the batch is above the max, it is flushed automatically.

Things that are missing and will be implemented in follow-up PRs:
- sending the batch to the egress tracking service in a `space/egress/track` invocation
- exposing an `http.FileSystem` and adding the corresponding `http.FileServer` endpoint

Once these are done, I'll hook it up in Alan's `space/content/retrieve` handler.

Co-authored-by: ash <alan@storacha.network>
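The auto-flush behavior amounts to something like the following (an assumed shape, not the actual `EgressBatchStore` code; `flush` is a placeholder for whatever persists the batch):

```go
// batch accumulates receipts and flushes once the max size is exceeded.
type batch struct {
	mu      sync.Mutex
	entries [][]byte
	size    int
	maxSize int
	flush   func(entries [][]byte) error
}

func (b *batch) Append(receipt []byte) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.entries = append(b.entries, receipt)
	b.size += len(receipt)
	if b.size < b.maxSize {
		return nil
	}
	// Over the max: hand off the full batch and start a new one.
	entries := b.entries
	b.entries, b.size = nil, 0
	return b.flush(entries)
}
```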
Ref: #174
I separated the logic for storing receipt batches from the logic for sending those batches to the egress tracking service. Now the store just stores the batches and notifies when they are ready. The new `EgressTrackingService` will invoke `space/egress/track` with new batches.
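A rough sketch of that split (the constructor names and callback shape are illustrative, not the actual API):

```go
// The store only stores batches and signals readiness via a callback;
// the tracking service owns the space/egress/track invocation.
svc := NewEgressTrackingService(client) // hypothetical constructor
store := NewEgressBatchStore(dir, func(ctx context.Context, batchCID cid.Cid) error {
	// Notification hook: invoke space/egress/track for the finished batch.
	return svc.Track(ctx, batchCID)
})
```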