Returns structured errors from FundPsbt #5436

Closed
wants to merge 4 commits into from

Conversation

Contributor

@arshbot arshbot commented Jun 25, 2021

Converts the errors most likely to be returned by FundPsbt from string-based
errors into error types that carry gRPC error codes. Addresses #5411

Pull Request Checklist

  • If this is your first time contributing, we recommend you read the Code
    Contribution Guidelines
  • All changes are Go version 1.12 compliant
  • The code being submitted is commented according to Code Documentation and Commenting
  • For new code: Code is accompanied by tests which exercise both
    the positive and negative (error paths) conditions (if applicable)
  • For bug fixes: Code is accompanied by new tests which trigger
    the bug being fixed to prevent regressions
  • Any new logging statements use an appropriate subsystem and
    logging level
  • Code has been formatted with go fmt
  • Protobuf files (lnrpc/**/*.proto) have been formatted with
    make rpc-format and compiled with make rpc
  • New configuration flags have been added to sample-lnd.conf
  • For code and documentation: lines are wrapped at 80 characters
    (the tab character should be counted as 8 characters, not 4, as some IDEs do
    per default)
  • Running make check does not fail any tests
  • Running go vet does not report any issues
  • Running make lint does not report any new issues that did not
    already exist
  • All commits build properly and pass tests. Only in exceptional
    cases it can be justifiable to violate this condition. In that case, the
    reason should be stated in the commit message.
  • Commits have a logical structure according to Ideal Git Commit Structure

@arshbot arshbot force-pushed the structure-fundpsbt branch from 4370424 to ea550e1 Compare June 25, 2021 21:29
Converts errors likely to be thrown from string based errors to error
classes with returned grpc error codes
@arshbot arshbot force-pushed the structure-fundpsbt branch from ea550e1 to 4a7356f Compare June 25, 2021 21:44
@@ -36,8 +38,7 @@ func verifyInputsUnspent(inputs []*wire.TxIn, utxos []*lnwallet.Utxo) error {
 		}

 		if !found {
-			return fmt.Errorf("input %d not found in list of non-"+
-				"locked UTXO", idx)
+			return status.Error(codes.NotFound, ErrUnspentInputNotFound(idx).Error())
Contributor

I don't think this will return a structured error to the client and they still need to parse the text message to find out the index?

Member

@Roasbeef Roasbeef Jun 28, 2021

I guess it depends what you mean by a structured error. On the client side, they can introspect the error, extract the error code used and act on it, possibly doing string parsing if they need any additional context.

Unless you mean return a response that has fields to enumerate the different types of errors?

Member

Did some digging in the docs, and looks like it's possible for us to actually get the best of both worlds here: https://pkg.go.dev/google.golang.org/grpc/internal/status?utm_source=godoc#Status.WithDetails

This API lets you make a status code to return using status.Error, but then also attach an arbitrary proto message that can be used to let the client optionally get more structured information if it needs to. This blog post has a good overview of how things would work end to end: https://jbrandhorst.com/post/grpc-errors/

Contributor

Yes, WithDetails is indeed what I used before for that. I think it can be useful in various places, because not everything maps cleanly to a gRPC code. And in this case, the index needs to be stored somewhere. If a UTXO is locked, a client likely needs to know exactly which one it is in order to make another funding attempt.

Contributor Author

Should the response provide key value information per error type (if index not found, return missing index via key) or should we go with more generic error codes with arbitrary information? The issue of having systems respond appropriately to failures is addressed with the generic error codes.

Contributor

I think the requirement here is that systems should be able to respond appropriately to a specific UTXO being unavailable. A generic error code alone wouldn't suffice for that.
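A minimal sketch of the client behavior this requirement implies, assuming the failing input's index reaches the client in machine-readable form. `InputUnavailableError`, `FundWithRetry` and the `fund` callback are hypothetical names used for illustration only; they are not part of lnd or of this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// InputUnavailableError is a hypothetical client-side error that carries the
// index of the input that could not be used.
type InputUnavailableError struct {
	Index int
}

func (e *InputUnavailableError) Error() string {
	return fmt.Sprintf("input %d not found in list of non-locked UTXO", e.Index)
}

// FundWithRetry calls fund and, whenever fund reports one specific unavailable
// input, drops that input and tries again. Any other error is returned as is.
func FundWithRetry(inputs []string, fund func([]string) error) ([]string, error) {
	// Work on a copy so the caller's slice is left untouched.
	remaining := append([]string(nil), inputs...)

	for len(remaining) > 0 {
		err := fund(remaining)
		if err == nil {
			return remaining, nil
		}

		var unavailable *InputUnavailableError
		if !errors.As(err, &unavailable) ||
			unavailable.Index < 0 || unavailable.Index >= len(remaining) {
			return nil, err
		}

		i := unavailable.Index
		remaining = append(remaining[:i], remaining[i+1:]...)
	}

	return nil, errors.New("no usable inputs left")
}

func main() {
	// Stand-in for the RPC call: the outpoint "b:1" is treated as locked.
	fund := func(in []string) error {
		for i, op := range in {
			if op == "b:1" {
				return &InputUnavailableError{Index: i}
			}
		}
		return nil
	}

	used, err := FundWithRetry([]string{"a:0", "b:1", "c:2"}, fund)
	fmt.Println(used, err)
}
```

With only a generic code, the client could tell that some input was unavailable but would have no index to drop before retrying.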

Collaborator

I didn't know about the WithDetails, could be useful indeed. Though the fact that it needs to be a proto message would mean we'd need to add new messages to our main proto file just for structured errors? Might get bloated quite quickly.

Also, I wonder how such an error message (with details) would look on the command line, if the error is just printed?

Contributor Author

A problem to consider would also be the scope of structured errors, and the lack of unity between different error cases. FundPSBT has many unique error cases, while only 2 in particular are covered by this ticket.

@guggero the error should be printed in json to the client imo, as the purpose is for machine consumption.

Contributor

There may be many unique error cases, but all the 'unexpected' ones - the internal errors - don't need to be returned in a structured way I'd say. I am not sure if that many structured errors remain.

Using status.WithDetails, we attach more metadata which is then printed
as json in certain cases for machine consumption.

Added the errdetails package to provide more robust error coding
@arshbot arshbot force-pushed the structure-fundpsbt branch from 92c8c0d to a842fcd Compare July 3, 2021 16:10
This commit compiles the grpc protos since additional output is added to
the FundPSBT command
Adds a test to ensure errors are thrown when utxos with incorrect
indices are provided to FundPSBT.
@Roasbeef Roasbeef added this to the v0.14.0 milestone Aug 31, 2021
@Roasbeef Roasbeef requested a review from guggero August 31, 2021 00:57
@Roasbeef Roasbeef added the P3 might get fixed, nice to have label Aug 31, 2021
@guggero
Collaborator

guggero commented Aug 31, 2021

I think the idea of starting to return more useful information with errors is a really good one!

There are a few things I personally don't like about the current approach:

  • We need a new error type/struct for each specific case.
  • We rely on the googleapis messages which might not fit our use cases very well.
  • The client needs to do a type switch to interpret the error details.
  • We use a custom error message in the detail that just annotates another reason string. This could be done through a custom error code.

But I think the ideas in this PR are good and should be taken into account.
So here is my counter proposal:

  1. We unify this PR with rpcperms: add gRPC codes to errors #5633 which proposes to use the gRPC status code to encode more information about where an error comes from, similar to HTTP status codes.
    • I think we should use 3 digits to encode the DOMAIN, for example:
      • 100: General server error
      • 101: Wallet error
      • 102: Validation error
      • 103: Funding error
      • 104: PSBT error
      • xxx: Define as needed, up to 999 domains possible.
    • Then we can use two more digits to encode the more specific CODE, which can be defined per domain. For example the PSBT domain could have the codes:
      • 00: Unspent input not found
      • 01: Funding error
      • yy: Define as needed, up to 99 codes per domain possible.
    • With this, the two errors described in this PR would get the full codes 10400 and 10401 respectively.
  2. To transport additional details in a structured manner, we add a simple lnrpc.ErrDetails message which is a map<string, string> details = 1; gRPC type. With that we could encode the index of the first error as details["index"] = fmt.Sprintf("%d", idx). That would still require the client to match the string name of the field, but that name should remain much more stable.
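The numbering scheme and the details map proposed above can be sketched as follows. This is only an illustration under stated assumptions: the constant names, `FullCode`, `SplitCode`, `InputIndex` and the Go `ErrDetails` map type are hypothetical, and the map stands in for the proposed `lnrpc.ErrDetails` message so that the sketch runs without generated protobuf code.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical domain and code constants following the proposed numbering.
// None of these names exist in lnd.
const (
	DomainPSBT = 104

	CodeUnspentInputNotFound = 0
	CodeFundingError         = 1
)

// ErrDetails mirrors the proposed map<string, string> details field as a
// plain Go map.
type ErrDetails map[string]string

// FullCode joins a three-digit domain and a two-digit code, so that domain
// 104 and code 1 become 10401.
func FullCode(domain, code int) int {
	return domain*100 + code
}

// SplitCode is the inverse of FullCode.
func SplitCode(full int) (domain, code int) {
	return full / 100, full % 100
}

// InputIndex converts the string-valued "index" detail back into an int.
func InputIndex(d ErrDetails) (int, error) {
	raw, ok := d["index"]
	if !ok {
		return 0, fmt.Errorf("detail %q is missing", "index")
	}
	return strconv.Atoi(raw)
}

func main() {
	full := FullCode(DomainPSBT, CodeUnspentInputNotFound)
	details := ErrDetails{"index": fmt.Sprintf("%d", 3)}

	domain, code := SplitCode(full)
	idx, err := InputIndex(details)
	fmt.Println(full, domain, code, idx, err)
}
```

The client still has to know the key name `"index"` and convert the value itself, which is the trade-off raised in the replies that follow.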

What do you think, @arshbot, @joostjager ?

@guggero guggero mentioned this pull request Aug 31, 2021
@alexbosworth
Contributor

> 2. To transport additional details in a structured manner, we add a simple lnrpc.ErrDetails message which is a map<string, string> details = 1; gRPC type. With that we could encode the index of the first error as details["index"] = fmt.Sprintf("%d", idx). That would still require the client to match the string name of the field, but that name should remain much more stable.

If error details are a string, is that structured error details?

@guggero
Collaborator

guggero commented Sep 1, 2021

> If error details are a string, is that structured error details?

I would argue yes. You get key/value pairs with known and stable keys and you don't have to parse a string to get to them. Yes, you might need to convert the string value into a native data type. But at least this would be very generic and could still be shown as a human readable error string to the user.

But this is just my idea for making things more generic, I'm open to suggestions if you feel there's another way.

@joostjager
Contributor

I think that ideally you don't have a generic error code, not even one per domain. With a generic code, you still need to go through the lnd source code to see exactly which error codes need to be handled for a specific call, or rely on documentation that can get out of sync with the code relatively easily.

Using a specific proto error object gets you the strictest contract. The message attached with WithDetails doesn't need to be one of the predefined Google messages. It could also be good enough to have a single object per call that is a union of all error attributes for that call. Then the type switch isn't needed to distinguish between the various error cases.

I do see the point of ease of use. Just checking the top-level grpc code is very convenient.
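A rough sketch of the "single object per call" idea, written as a plain Go struct so that it runs without gRPC. In practice this would presumably be a proto message attached through WithDetails. The names `FundPsbtError`, `FundPsbtErrorReason` and `Describe` are hypothetical and do not exist in lnd.

```go
package main

import "fmt"

// FundPsbtErrorReason enumerates the expected failure cases of one call.
type FundPsbtErrorReason int

const (
	ReasonUnknown FundPsbtErrorReason = iota
	ReasonUnspentInputNotFound
	ReasonFundingFailed
)

// FundPsbtError is a hypothetical per-call error object that holds the union
// of all attributes any expected failure of the call may need.
type FundPsbtError struct {
	Reason FundPsbtErrorReason

	// InputIndex is only meaningful for ReasonUnspentInputNotFound.
	InputIndex uint32
}

// Describe branches on a single field of one known type, so the client does
// not need a type switch over different detail messages.
func Describe(e FundPsbtError) string {
	switch e.Reason {
	case ReasonUnspentInputNotFound:
		return fmt.Sprintf("input %d is not available", e.InputIndex)
	case ReasonFundingFailed:
		return "funding failed"
	default:
		return "unexpected error"
	}
}

func main() {
	fmt.Println(Describe(FundPsbtError{
		Reason:     ReasonUnspentInputNotFound,
		InputIndex: 2,
	}))
}
```

The cost of this design is that every attribute lives in one message, so fields such as `InputIndex` are unused for most reasons.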

@alexbosworth
Contributor

> But this is just my idea for making things more generic, I'm open to suggestions if you feel there's another way.

I think the existing model has its strength in surfacing the documentation of known common failure states to watch out for, but I agree the weakness is that unknown or uncommon errors return a result that is hard to rely on. It's also kind of unreasonable to expect every possible error to be enumerated in the gRPC API.

For common failures I like the proto-structured failure responses. For uncommon errors I like the current model of strings explaining what happened, but I don't think the proposed numbering scheme would help me much with knowing what to do in response to an error. The main weakness of the structured responses is that sometimes the real underlying failure doesn't match what is reported in the structure, but that could be resolved by adding more enumerations or by not being so strict about returning a structured error in unexpected cases.

In HTTP, the classes of error codes also come with prescriptions of what to do: for a 4xx you generally need to fix your own problem, for a 5xx you need to retry or wait for the server to fix its problem, and within those classes specific codes prescribe specific behavior. That pattern is hard to replicate here, so I'm not sure it can be copied.

I'd definitely like to have a 'unique identifier' for an error though which is basically just replacing the string that I'm currently matching for with some value that the RPC says won't change and then the RPC would be more free to change the strings that I'm matching against for typos or adding context etc. It's especially difficult to do string matching when the error string is adding in contextual details, then I have to do regex matching.
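To illustrate the difference, here is a small sketch that contrasts regex matching on the message text with matching on a stable identifier. The message format is taken from the original `fmt.Errorf` call in the diff. `RPCError`, `IDUnspentInputNotFound` and the helper functions are hypothetical and only show the shape such an identifier could take.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// inputNotFoundRe matches the wording of the original error string. It stops
// matching as soon as that wording changes.
var inputNotFoundRe = regexp.MustCompile(
	`input (\d+) not found in list of non-locked UTXO`,
)

// IndexFromMessage extracts the input index by regex matching, which is what
// a client is left with when only the string is available.
func IndexFromMessage(msg string) (int, bool) {
	m := inputNotFoundRe.FindStringSubmatch(msg)
	if m == nil {
		return 0, false
	}
	idx, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, false
	}
	return idx, true
}

// IDUnspentInputNotFound is a hypothetical identifier that the RPC would
// promise never to change.
const IDUnspentInputNotFound = "UNSPENT_INPUT_NOT_FOUND"

// RPCError is a hypothetical error shape with a stable identifier next to a
// human-readable message that is free to change.
type RPCError struct {
	ID      string
	Message string
}

// IsUnspentInputNotFound matches on the identifier and ignores the message.
func IsUnspentInputNotFound(e RPCError) bool {
	return e.ID == IDUnspentInputNotFound
}

func main() {
	original := "input 3 not found in list of non-locked UTXO"
	reworded := "input 3 is locked or unknown"

	fmt.Println(IndexFromMessage(original))
	fmt.Println(IndexFromMessage(reworded))
	fmt.Println(IsUnspentInputNotFound(RPCError{
		ID:      IDUnspentInputNotFound,
		Message: reworded,
	}))
}
```

The regex breaks on the reworded message, while the identifier check keeps working regardless of how the text is edited.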

@guggero
Collaborator

guggero commented Nov 29, 2021

@arshbot any thoughts on the discussion above? Should we try to pick this up again for 0.15?
I'm going to remove my request for review until we know how we want to proceed with this.

@guggero guggero removed their request for review November 29, 2021 12:29
@Roasbeef Roasbeef removed this from the v0.15.0 milestone Feb 2, 2022
@Roasbeef Roasbeef added the up for grabs PRs which have been abandoned by their original authors and can be taken up by someone else label Feb 2, 2022
@ziggie1984
Collaborator

I would take this issue and finish it, is this ok? I saw it's up for grabs but asking anyway before starting.

@guggero
Collaborator

guggero commented Feb 21, 2023

> I would take this issue and finish it, is this ok? I saw it's up for grabs but asking anyway before starting.

Yes, feel free to start working on this. But just a heads up, I don't think we actually came to a conclusion on how exactly we'd like to structure the error codes (see discussion above), so it's possible there might be quite a bit of back and forth during the review.
But given the large design space, it's probably easiest to just show a concrete example in code and take the discussion from there.

@ziggie1984
Collaborator

not working on this currently, have found another more urgent issue for now, will come back in the future!

@lightninglabs-deploy

@arshbot, remember to re-request review from reviewers when ready

@lightninglabs-deploy

Closing due to inactivity

Labels
P3 might get fixed, nice to have up for grabs PRs which have been abandoned by their original authors and can be taken up by someone else
7 participants