cli: deprecate cli.StatusError direct field usage #5666

Open
wants to merge 2 commits into
base: master
from

Conversation

Benehiko
Member

@Benehiko Benehiko commented Dec 3, 2024

The exported cli.StatusError type may be used directly by some within their own projects, making it difficult to update the struct's fields.

This patch converts the exported cli.StatusError to an interface instead, so that code wrapping the CLI would still be able to match the error and get the status code without exposing the fields.

This is a breaking change for those relying on creating a cli.StatusError{} and accessing the error's fields. Those using errors.As(err, &statusError) and err.(cli.StatusError) will be able to continue doing so without breakage.

Users accessing the cli.StatusError{}.StatusCode field would need to use the new GetStatusCode() method.

- What I did

- How I did it

- How to verify it

- Description for the changelog

External code importing and using `cli.StatusError` directly (e.g. creating an instance of it) is deprecated.

- A picture of a cute animal (not mandatory but encouraged)

@codecov-commenter

codecov-commenter commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 43.93939% with 37 lines in your changes missing coverage. Please review.

Project coverage is 59.47%. Comparing base (41c2786) to head (b1f899c).
Report is 10 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5666      +/-   ##
==========================================
+ Coverage   59.42%   59.47%   +0.04%     
==========================================
  Files         347      348       +1     
  Lines       29402    29416      +14     
==========================================
+ Hits        17472    17495      +23     
+ Misses      10958    10948      -10     
- Partials      972      973       +1     

@Benehiko
Member Author

Benehiko commented Jan 3, 2025

@thaJeztah @vvoland do you have any concerns that would prevent this PR from getting merged?

@krissetto
Contributor

krissetto commented Jan 7, 2025

I'm also not sure if we should be changing the StatusError. Rather than leaving the whole PR sitting while we try to decide, I think we can split that out and at least merge the ctx cancellation bits, which are good to have and improve the UX. WDYT?

@Benehiko
Member Author

Benehiko commented Jan 7, 2025

I'm also not sure if we should be changing the StatusError. Rather than leaving the whole PR sitting while we try to decide, I think we can split that out and at least merge the ctx cancellation bits, which are good to have and improve the UX. WDYT?

We can separate it (it is already split between 2 commits), but the UI will still print the error "user terminated the process". The StatusError struct only accepts a string, not the actual error and its children, so it cannot be used with errors.Is: there is nothing to Unwrap that would expose the context cancellation error.

For example:

⋊> ~/G/cli-3 on 5f221783c  ./build/docker run postgres:latest                                                                                                                              12:49:47
Unable to find image 'postgres:latest' locally
latest: Pulling from library/postgres
fd674058ff8f: Pulling fs layer
1eab12a50bdf: Pulling fs layer
5a81b4aedb94: Pulling fs layer
502eeeb4a17b: Waiting
e9e19177b318: Waiting
2068838cf5fa: Waiting
45a271dbb114: Waiting
8f9ac4ec849d: Waiting
9d8b60e88ddb: Waiting
3ec4ef471804: Waiting
16d755b48cd4: Waiting
3d5d11fb541c: Waiting
d8ab5fe30360: Waiting
d19370fe7a12: Waiting
^Cdocker: user terminated the process

Run 'docker run --help' for more information
⋊> ~/G/cli-3 on 5f221783c

The only option to not have StatusError changed and not have the output string is to just not use StatusError at all and return a different error...

You can try it yourself: git checkout 5f221783c2a5791ec34e4070353a3125fd0847c9, make sure you don't have postgres:latest (or the image you are testing) on your machine, then run docker run postgres:latest. Cancel the pull midway with Ctrl+C and you will see the error message shown above.

@laurazard
Collaborator

laurazard commented Jan 7, 2025

There are other options that aren't perfect, but work in the interim, such as @thaJeztah's solution here. Would require a pass over the codebase to look at all of the other places, but that's probably fine and something that can be done iteratively.

There's always string matching too 😅
instead of

	...
	if err != nil && !errdefs.IsCancelled(err) && !errors.Is(err, errCtxUserTerminated) {
	...

do

	...
	if err != nil {
		// FIXME: replace this with errdefs.IsCancelled after changing StatusErr
		if err.Error() != "context cancelled" {
		...

@Benehiko
Member Author

Benehiko commented Jan 7, 2025

There are other options that aren't perfect, but work in the interim, such as @thaJeztah's solution here. Would require a pass over the codebase to look at all of the other places, but that's probably fine and something that can be done iteratively.

There's always string matching too 😅 instead of

	...
	if err != nil && !errdefs.IsCancelled(err) && !errors.Is(err, errCtxUserTerminated) {
	...

do

	...
	if err != nil {
		// FIXME: replace this with errdefs.IsCancelled after changing StatusErr
		if err.Error() != "context cancelled" {
		...

I also wrestled with this a bit and I still came to the conclusion that having it match with errors.Is is more robust, and we get this feature throughout the whole CLI instead of iteratively changing things in specific circumstances, which could lead some code paths to output the error while others do not.

The "best" solution I can come up with at this point in time is to replace StatusError in the places we use it with an internal version of it, then we don't break anything to external consumers of the CLI code and the behavior of the CLI improves without the risk of some commands being "left behind".

@Benehiko Benehiko force-pushed the ctx-cancel-status branch 3 times, most recently from f9c3ab4 to bed4155 Compare January 7, 2025 13:15
@Benehiko
Member Author

Benehiko commented Jan 7, 2025

I've updated the code to use a new error called internal.StatusError instead of relying on cli.StatusError. This keeps the original cli.StatusError intact.

Now the question is:

  • If the user is calling code inside the CLI where previously cli.StatusError was returned and then comparing the error using errors.As(err, &cli.StatusError), it will silently fail (it won't match at runtime, but will compile fine).
  • If the user is using cli.StatusError inside their own code it will compile and behave the same.
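
The silent failure in the first bullet can be sketched with two unrelated struct types. The type names and fields below are stand-ins for cli.StatusError and internal.StatusError, not their real definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// publicStatusError stands in for the exported cli.StatusError struct.
type publicStatusError struct {
	Status     string
	StatusCode int
}

func (e publicStatusError) Error() string { return e.Status }

// internalStatusError stands in for internal.StatusError (fields assumed).
type internalStatusError struct {
	Cause      error
	StatusCode int
}

func (e internalStatusError) Error() string { return e.Cause.Error() }

// cliCall stands in for CLI code that now returns the internal type.
func cliCall() error {
	return internalStatusError{Cause: errors.New("exit status 3"), StatusCode: 3}
}

// matchesPublic is what an external caller written against the old type does.
func matchesPublic(err error) bool {
	var target publicStatusError
	return errors.As(err, &target)
}

func main() {
	// Compiles fine, but no longer matches at runtime.
	fmt.Println(matchesPublic(cliCall())) // false
}
```

errors.As matches on the concrete type of the target, so two structs with similar fields are still different types as far as the check is concerned.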

@laurazard
Collaborator

laurazard commented Jan 7, 2025

If the user is calling code inside the CLI where previously cli.StatusError was returned and then comparing the error using errors.As(err, &cli.StatusError), it will silently fail (it won't match at runtime, but will compile fine).

Yeah, I see a few instances (~250, but could be more if importing under other names) – including @ndeloof in swarmctl – of people doing this around GitHub, but that's a lot smaller so I'm less concerned. I'll leave it up to @thaJeztah and folks as to how acceptable that is.

Perfect is the enemy of good, and I think in projects such as these compromises need to be made sometimes to make the current situation better without breaking other things, which is why I don't think string matching as a stop-gap (that can be reverted on the next major when StatusError is changed) is that bad. As it is, internal.StatusError is similarly a stop-gap measure that will get removed when no longer needed, except it introduces some breakage that string matching doesn't.

As long as StatusError gets changed in the next major, no commands can be "left behind", since we'd be changing the StatusError struct itself, which forces any code referencing it to change accordingly in order to compile. In actuality, the current measure also leaves that risk – if a command is forgotten about/not updated to use the internal version of StatusError, it'll get "left behind".

@Benehiko
Member Author

Benehiko commented Jan 7, 2025

Yeah I mean, there are only three options here (I'm okay choosing any of these):

  1. We print the error (no breakage to cli.StatusError), but the error now reads "user terminated the process" instead of "context cancelled"
  2. We stick with breaking the cli.StatusError by updating the properties (compile-time error for those setting their own errors inside cli.StatusError{})
  3. We introduce internal.StatusError with the updated changes and the code inside the CLI always references the new error type, but it silently breaks comparisons cli.StatusError != internal.StatusError.

Currently the last commit introduces the third option, but we can drop it and just go with option 1.

@laurazard
Collaborator

FYI @Benehiko, that area/context label is usually used for contexts as in docker context, not Go context.Contexts.

@laurazard
Collaborator

Yeah I mean, there are only three options here (I'm okay choosing any of these):

  1. We print the error (no breakage to cli.StatusError), but the error now reads "user terminated the process" instead of "context cancelled"
  2. We stick with breaking the cli.StatusError by updating the properties (compile-time error for those setting their own errors inside cli.StatusError{})
  3. We introduce internal.StatusError with the updated changes and the code inside the CLI always references the new error type, but it silently breaks comparisons cli.StatusError != internal.StatusError.

To be clear, the best solution is clearly changing cli.StatusError, and we want to do that on the next major (v28). All we're discussing is how to make things better until then.

We could simply add a field to the cli.StatusError struct without breaking compatibility, as long as we don't remove the current field. Something like

// StatusError reports an unsuccessful exit by a command.
type StatusError struct {
	Cause      error
	// Deprecated: use Cause instead.
	Status     string
	StatusCode int
}

And use that in the CLI codebase from now on, and then remove the Status field on the next major.

@Benehiko Benehiko changed the title cli/command: ctx cancel should not print or produce a non zero exit code cli/command: ctx cancel should contain a more specific cause Jan 13, 2025
@Benehiko Benehiko changed the title cli/command: ctx cancel should contain a more specific cause cmd/docker: add cause to user-terminated context.Context Jan 13, 2025
@Benehiko
Member Author

I split the PR. This PR has been updated to only include the "cause" of the context.Context cancellation. The other PR focuses on the cli.StatusError deprecation/changes #5738

@Benehiko Benehiko requested a review from a team January 16, 2025 17:03
@Benehiko Benehiko changed the title cmd/docker: add cause to user-terminated context.Context cli: deprecate cli.StatusError direct field usage Jan 20, 2025
@Benehiko Benehiko added this to the 28.0.0 milestone Jan 20, 2025
The exported `cli.StatusError` type may be used
by some directly within their own projects, making
it difficult to update the struct's fields.

This patch converts the exported `cli.StatusError` to
an interface instead, so that code wrapping the CLI
would still be able to match the error and get the
status code without exposing the fields.

This is a breaking change for those relying on creating
a `cli.StatusError{}` and accessing the error's fields.
Those using `errors.As(err, &statusError)` and
`err.(cli.StatusError)` will be able to continue doing
so without breakage.

Users accessing the fields of `cli.StatusError{}.StatusCode`
would need to use the new `GetStatusCode()` method.

Signed-off-by: Alano Terblanche <[email protected]>
Contributor

@ndeloof ndeloof left a comment

This will prevent CLI plugins from managing the exit code of the docker CLI, which is required by compose --exit-code-from.
Apart from preferring interfaces over types in the public API, which is indeed a better approach, what's the issue you're trying to fix with this breaking change?

if errors.As(err, &stErr) {
_, _ = fmt.Fprintln(dockerCli.Err(), stErr)
Contributor

If you make StatusError internal, a CLI plugin won't be able to create such an error and control the exit code.

Member Author

Hmm, as an interface it would force the plugin to implement its own error type and not rely on the struct. 🤔 I'm wondering, though, if this is what we should do, since plugins are essentially being forced to use the struct directly. To me that seems bad, as it makes it more difficult for the CLI to make any changes in the longer term, and even more difficult to know what contracts we actually have that we should respect.

The main case was to change the cli.StatusError{}.Status field to be of type error instead of type string. The reason being we want to be able to use errors.Is to match children of this error which is impossible with a string type for Status.

The second reason is to convert this to an interface so that third parties don't depend on this error type directly in their code. We are going to break third parties by changing cli.StatusError{}.Status -> cli.StatusError{}.Cause in any case so I took the opportunity to try and do it in one go instead of having to go through multiple such breaking changes in the future.

package main

import (
	"errors"
	"fmt"
)

type StatusError struct {
	Status error
	Code   int
}

func (s StatusError) Error() string {
	return s.Status.Error()
}

// NEED this to get the child
func (s StatusError) Unwrap() error {
	return s.Status
}

type barError struct{}

func (b barError) Error() string {
	return "I am bar error"
}

func bar() error {
	return barError{}
}

func foo() error {
	err := bar()
	return StatusError{
		Status: fmt.Errorf("I am wrapping this error: %w", err),
		Code:   125,
	}
}

func main() {
	err := foo()
	if errors.Is(err, barError{}) {
		fmt.Printf("Is bar error: %s\n", err)
	}
}

https://go.dev/play/p/eQi-908JZhR

Contributor

Why not just add a new Cause attribute to StatusError and implement Unwrap() error? That way you can use errors.Is as long as Cause has been set, which would have lower impact, as this change requires major changes anyway.

Member Author

Yes this option can be pursued which would make things easier in the near term.

My goal was to try to create a clear boundary for third parties to use instead - whilst we get to do a major release (v28). Since we didn't clearly define what "internal" code is in the past, we now have this situation where changing the CLI code can come with a lot of unknown pitfalls.

Member Author

I guess you could say I'm trying to kill two birds with one stone.

  1. match child error types
  2. reduce technical debt

Contributor

understood, but impact is huge for our ecosystem.
keeping code clean is easier when you have no users.

Member Author

Okay, I'll update the code. Perhaps we could do a slow transition to an exported interface?

Member Author

@ndeloof I pushed some changes in a separate commit, let me know what you think.

Contributor

My preference is #5778, as the lower-impact and simplest solution for the need.

This patch ensures that plugins can still use `cli.StatusError`
and won't need to modify anything in the short term.

In the longer term we should deprecate `cli.StatusError{}.Status`
and instead only have `cli.StatusError{}.Cause`.

A common interface is introduced for both `cli.StatusError` and
`internal.StatusError` so that we can still use `errors.As`
no matter which error is returned.

Signed-off-by: Alano Terblanche <[email protected]>