Skip to content

Issue 7917: Fix/mfs/error pin fail #8525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

gautammenghani
Copy link

Fixes #7917
Approach:
If the status of a pin is "queued", a goroutine will run for each pin until the status of that pin becomes "Pinned" or "Failed"

@welcome
Copy link

welcome bot commented Oct 22, 2021

Thank you for submitting this PR!
A maintainer will be here shortly to review it.
We are super grateful, but we are also overloaded! Help us by making sure that:

  • The context for this PR is clear, with relevant discussion, decisions
    and stakeholders linked/mentioned.

  • Your contribution itself is clear (code comments, self-review for the
    rest) and in its best form. Follow the code contribution
    guidelines

    if they apply.

Getting other community members to do a review would be great help too on complex PRs (you can ask in the chats/forums). If you are unsure about something, just leave us a comment.
Next steps:

  • A maintainer will triage and assign priority to this PR, commenting on
    any missing things and potentially assigning a reviewer for high
    priority items.

  • The PR gets reviews, discussed and approvals as needed.

  • The PR is merged by maintainers when it has been approved and comments addressed.

We currently aim to provide initial feedback/triaging within two business days. Please keep an eye on any labelling actions, as these will indicate priorities and status of your contribution.
We are very grateful for your contribution!

@aschmahmann aschmahmann added the P3 Low: Not priority right now label Jan 14, 2022
@BigLep
Copy link
Contributor

BigLep commented Apr 22, 2022

@lidel : I know we have this marked as P3. I'm putting this on the board for now, but feel free to remove if you don't think you'll realistically get to it during PR Fridays.

@BigLep BigLep added this to the Best Effort Track milestone Apr 22, 2022
Copy link
Contributor

@gammazero gammazero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are too many issues with this PR, so converting it to draft.


func checkPinStatus(ctx context.Context, node pinMFSNode, svcName string, svcConfig config.RemotePinningService, ps pinclient.PinStatusGetter,
repinInterval time.Duration) {
result := make(chan int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reader for this channel, so the goroutine will block forever when trying to write to it.

result := make(chan int)
go func() {
for {
if ps.GetStatus() != pinclient.StatusFailed && ps.GetStatus() != pinclient.StatusPinned {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the condition of the for loop, not a separate if. Then the else becomes unnecessary.

go func() {
for {
if ps.GetStatus() != pinclient.StatusFailed && ps.GetStatus() != pinclient.StatusPinned {
time.Sleep(repinInterval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to use a time.Timer here, so that select can also wait for context cancelation as well as timeout.

}
if(ps.GetStatus() == pinclient.StatusFailed) {
mfslog.Errorf("pinning to %q failed", svcName)
result <- 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are these results used? If they are, a bool type is more appropriate and an int with a 0 or 1 value.

@@ -150,7 +149,7 @@ func pinAllMFS(ctx context.Context, node pinMFSNode, cfg *config.Config, rootCid
repinInterval = defaultRepinInterval
} else {
var err error
repinInterval, err = time.ParseDuration(svcConfig.Policies.MFS.RepinInterval)
repinInterval, err = time.ParseDuration(svcConfig.Policies.MFS.RepinInterval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run go fmt to remove trailing whitespace, etc.

//Run a goroutine to monitor status of queued pins
if ps.GetPin().GetCid() == cid && ps.GetStatus() == pinclient.StatusQueued {
checkPinStatus(ctx, node, svcName, svcConfig, ps, repinInterval)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we want to start a new goroutine for each pin. I think it would be better to communicate pins to watch to a single goroutine.

@@ -227,6 +225,10 @@ func pinMFS(
pinTime = ps.GetCreated().UTC()
break
}
//Run a goroutine to monitor status of queued pins
if ps.GetPin().GetCid() == cid && ps.GetStatus() == pinclient.StatusQueued {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code above already does if ps.GetPin().GetCid() == cid so this and the above should be under a that single if condition.

@@ -216,6 +213,7 @@ func pinMFS(
pinTime := time.Now().UTC()
pinStatusMsg := "pinning to %q: received pre-existing %q status for %q (requestid=%q)"
for ps := range lsPinCh {
log.Errorf("PS Typ eis is : %T", ps)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not log error here.

@gammazero gammazero marked this pull request as draft December 13, 2024 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 Low: Not priority right now
Projects
No open projects
Status: 🔎 In Review
Development

Successfully merging this pull request may close these issues.

MFS remote pinning policy: log ERROR if pin failed
6 participants