-
Notifications
You must be signed in to change notification settings - Fork 204
[ipc/listener][windows] Implement retry mechanism for listener creation #10202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
|
This pull request does not have a backport label. Could you fix it @VihasMakwana? 🙏
|
|
I don't think this will fix https://github.com/elastic/ingest-dev/issues/6133 That error is: That error is coming from filebeatreceiver, not elastic-agent. I think we need to do the retry in the libbeat api server: |
Oh! |
Thank you for letting me know, Lee! |
|
💚 Build Succeeded
History
|
|
|
||
| // CreateListener creates net listener from address string | ||
| // Shared for control and beats comms sockets | ||
| func CreateListener(log *logger.Logger, address string) (net.Listener, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this change this is really going to have to take a context. Which I know is not great because changing the function signature means it needs to be updated everywhere. But if this is going to take up to 5 seconds, context cancelling will need to be handled. The user might Ctrl-C the elastic-agent it should not be stuck in this section and wait up to 5 seconds before it returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@blakerouse what if we create a signals.Notify(..) here and exit on sigint or sigterm, instead of passing context all the way from command?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/search?q=repo%3Aelastic%2Felastic-agent%20CreateListener&type=code
It looks to me like there are only 3 calls to this, two of which have a context already. The one that doesn't could just be given context.TODO() without making anything worse. Listening to signals in more than one place would be way messier IMO and possibly complicate the problems we are already having with the Windows service lifetime being managed properly.
elastic-agent/pkg/component/runtime/conn_info_server.go
Lines 31 to 38 in 278440a
| func newConnInfoServer(log *logger.Logger, comm Communicator, address string) (*connInfoServer, error) { | |
| var ( | |
| listener net.Listener | |
| err error | |
| ) | |
| if ipc.IsLocal(address) { | |
| listener, err = ipc.CreateListener(log, address) |




What does this PR do?
Adds a retry mechanism when creating a named pipe listener on Windows.
Instead of failing immediately if listener creation fails, it now retries for up to 5 seconds with a short delay between attempts.
Why is it important?
Sometimes the named pipe might not be immediately available. This was observed when using monitoring with beat receivers.
Checklist
./changelog/fragmentsusing the changelog tool