ssh -f seems to create race condition #3

Open
@ulidtko

Description

Hi Eddie!

One of my teammates has picked your orb to run integration testing against an internal DB instance, using tunneling through a bastion host. Thanks for publishing BTW, it's pretty useful to trim down the mess that our 500-line .circleci/config.yml is!

However, there's a recurring issue where the step right after dmz/open_tunnel fails with Connection Refused on the tunneled port. Most of the time it works fine, of course, and my usual interpretation of such a symptom is that there's a race in opening the connection.

Meaning that ssh -f (I guess!) starts listening on the local socket very soon, but only after forking. That leaves a small window of time in which ssh -f has already returned control to bash but the socket isn't listening yet. Occasionally, the OS scheduler will starve ssh and resume the bash script instead (which naturally assumes that the tunnel port is already open), and the script hits ECONNREFUSED.

I see that in the usage examples, you always add a curl localhost to maybe keep track of the problem. It's also easy to work around such issues by adding explicit polling of the port (effectively synchronizing away the race between ssh and bash).

I'd appreciate any comments you have on this. Maybe something as simple as adding:

while curl -s -o /dev/null localhost:1234; test $? -eq 7; do sleep 0.1; done

after the ssh -Nf call in the orb source. What do you think?
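For a slightly more defensive variant of the same polling idea, here is a minimal sketch with a timeout, so a tunnel that never comes up fails the step instead of hanging it. The function name wait_for_port and the 10-second default are my own placeholders, not anything from the orb. It assumes bash with /dev/tcp redirection support and a sleep that accepts fractional seconds.

```shell
#!/usr/bin/env bash
# wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper (not part of the orb): polls until a TCP connect to
# HOST:PORT succeeds, or gives up after TIMEOUT_SECONDS (default 10).
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-10}
  local deadline=$(( SECONDS + timeout ))
  # The subshell tries to open fd 3 on bash's /dev/tcp pseudo-device.
  # A refused connection makes the subshell exit non-zero; on success
  # the fd is closed again when the subshell exits.
  until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    if (( SECONDS >= deadline )); then
      echo "timed out waiting for ${host}:${port}" >&2
      return 1
    fi
    sleep 0.1
  done
}
```

It would be called right after the ssh -Nf line, e.g. `wait_for_port localhost 1234 || exit 1`, with 1234 being the same example port as above. Unlike the curl loop, this only checks that the TCP connect succeeds, so it doesn't care which protocol is spoken on the other end of the tunnel.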
