add the wsl connection plugin #9795

Merged

Contributor

@rgl rgl commented Feb 22, 2025

SUMMARY

This adds the community.general.wsl connection plugin.

This allows Ansible to remotely manage a WSL distribution by using SSH to connect to a Windows machine and then using wsl.exe to execute commands inside the WSL distribution.
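
As a rough illustration of that flow, here is a minimal sketch of how a command line for wsl.exe could be assembled before being sent over the SSH channel. The helper name `build_wsl_argv` is hypothetical and is not taken from the plugin; quoting for the Windows-side shell is deliberately left out.

```python
from typing import List


def build_wsl_argv(distribution: str, user: str, command: str) -> List[str]:
    """Return an argument vector that asks wsl.exe to run `command`.

    Hypothetical helper for illustration only; the real plugin may build
    and quote its command line differently.
    """
    return [
        "wsl.exe",
        "--distribution", distribution,  # which WSL distribution to enter
        "--user", user,                  # which Linux user runs the command
        "--",                            # everything after this is passed through
        "/bin/sh", "-c", command,        # run the command through a POSIX shell
    ]


if __name__ == "__main__":
    print(build_wsl_argv("Ubuntu-24.04", "ubuntu", "id"))
```

The values used in the `__main__` block mirror the wsl_distribution and wsl_user variables from the example inventory in this description.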

This is derived from the existing community.general.proxmox_pct_remote connection plugin.

ISSUE TYPE
  • New Plugin Pull Request
COMPONENT NAME

community.general.wsl

ADDITIONAL INFORMATION

Here's a simple example of how to use this. The full example is in the wsl branch at https://github.com/rgl/terraform-libvirt-ansible-windows-example/tree/wsl.

Example inventory:

all:
  children:
    wsl:
      hosts:
        example-wsl-ubuntu:
          ansible_host: 10.17.3.2 # the windows host machine.
          wsl_distribution: Ubuntu-24.04
          wsl_user: ubuntu # the user inside the wsl distribution which will run the command.
      vars:
        ansible_connection: community.general.wsl
        ansible_user: vagrant # the user on the windows host machine.
        ansible_password: vagrant # the password for the user on the windows host machine.

Example playbook:

- name: WSL Example
  hosts: wsl
  gather_facts: true
  become: true
  tasks:
    - name: Ping
      ansible.builtin.ping:
    - name: Id (with become false)
      become: false
      changed_when: false
      args:
        executable: /bin/bash
      ansible.builtin.shell: |
        exec 2>&1
        set -x
        echo "$0"
        pwd
        id
    - name: Id (with become true)
      changed_when: false
      args:
        executable: /bin/bash
      ansible.builtin.shell: |
        exec 2>&1
        set -x
        echo "$0"
        pwd
        id
    - name: Install vim
      ansible.builtin.apt:
        name: vim
        install_recommends: false
    - name: Reboot
      ansible.builtin.reboot:
        boot_time_command: systemctl show -p ActiveEnterTimestamp init.scope

@ansibullbot ansibullbot added connection connection plugin new_contributor Help guide this first time contributor plugins plugin (any type) labels Feb 22, 2025
@ansibullbot ansibullbot added ci_verified Push fixes to PR branch to re-run CI needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR labels Feb 22, 2025
@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from 69e5503 to 5ea8dfa Compare February 22, 2025 16:47
@ansibullbot ansibullbot removed the ci_verified Push fixes to PR branch to re-run CI label Feb 22, 2025
@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from 5ea8dfa to 02c2e8a Compare February 22, 2025 16:51
@ansibullbot ansibullbot removed the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR label Feb 22, 2025
Collaborator

@russoz russoz left a comment

hi @rgl thanks for your contribution!!

I have a couple of comments below. :-)

stderr = b''.join(chan.makefile_stderr('rb', bufsize))
returncode = chan.recv_exit_status()

# NB the full english error message is:

Collaborator

I wonder if that is not subject to localization settings. If the remote user, or the remote system, sets the language to be something else, like fr_FR or pt_BR, would the error message still be in English? I know this is something we usually need to consider in Linux, and given that WSL runs an Ubuntu (or other distro) on top of itself, it might be something to consider as well.

Contributor Author

@rgl rgl Mar 5, 2025

It probably is subject to localization settings, though I'm not sure how to handle it. Please advise.

Collaborator

You should set the env vars LANGUAGE and LC_ALL to C. Just watch out if the command outputs "funny" chars (as in emojis or non-English letters): adapting this to C.utf-8 or some other codepage might be a bit of a challenge, and some systems will complain about that.
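
A minimal sketch of that suggestion, assuming the command is run by a POSIX shell inside the distribution. The helper name `with_c_locale` is hypothetical, and whether these variables influence messages printed by wsl.exe itself on the Windows side is not something this sketch can answer.

```python
import shlex


def with_c_locale(command: str, locale: str = "C") -> str:
    """Wrap a shell command so that it runs with LANGUAGE and LC_ALL forced.

    Hypothetical helper: `env` sets the variables for the whole wrapped
    command, including compound commands such as `a && b`.
    """
    return f"env LANGUAGE={locale} LC_ALL={locale} /bin/sh -c {shlex.quote(command)}"


if __name__ == "__main__":
    print(with_c_locale("ls /nonexistent && id"))
```

Passing `locale="C.UTF-8"` would be the variant to try when the output contains non-ASCII characters, on systems where that locale exists.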

Collaborator

Is this actually possible over SSH to Windows?

Collaborator

Good question - I have no good answer to that one other than: we should test it.

Comment on lines +712 to +770
with FileLock().lock_file(lockfile, dirname, self.get_option('lock_file_timeout')):
    # just in case any were added recently

    self.ssh.load_system_host_keys()
    self.ssh._host_keys.update(self.ssh._system_host_keys)

    # gather information about the current key file, so
    # we can ensure the new file has the correct mode/owner

    key_dir = os.path.dirname(self.keyfile)
    if os.path.exists(self.keyfile):
        key_stat = os.stat(self.keyfile)
        mode = key_stat.st_mode & 0o777
        uid = key_stat.st_uid
        gid = key_stat.st_gid
    else:
        mode = 0o644
        uid = os.getuid()
        gid = os.getgid()

    # Save the new keys to a temporary file and move it into place
    # rather than rewriting the file. We set delete=False because
    # the file will be moved into place rather than cleaned up.

    with tempfile.NamedTemporaryFile(dir=key_dir, delete=False) as tmp_keyfile:
        tmp_keyfile_name = tmp_keyfile.name
        os.chmod(tmp_keyfile_name, mode)
        os.chown(tmp_keyfile_name, uid, gid)
        self._save_ssh_host_keys(tmp_keyfile_name)

    os.rename(tmp_keyfile_name, self.keyfile)

Collaborator

My concurrency-fu is a little bit rusty, but I think you might benefit from a finer granularity in the lock here.

I might be wrong, but it doesn't look like load_system_host_keys() touches the path referred to by keyfile. In that case, you could perhaps hold the lock just for the rename operation instead of the entire process.

And now that I wrote that, I just remembered that Ansible has something called atomic_move. It is not a standalone utility function, though, but a method of the AnsibleModule class :( Anyway, it might be worth taking a look to see if you can get some ideas out of it. :-)
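
For reference, the write-to-a-temporary-file-then-rename idea can be sketched with the standard library alone. This is not AnsibleModule.atomic_move and not the plugin's code; `atomic_write` is a hypothetical name, and the sketch does no locking, so a read-modify-write cycle would still need a lock around it.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes, default_mode: int = 0o644) -> None:
    """Replace `path` with `data` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Preserve the permissions of an existing file, otherwise use a default.
    mode = os.stat(path).st_mode & 0o777 if os.path.exists(path) else default_mode
    # The temporary file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    with tempfile.NamedTemporaryFile(dir=directory, delete=False) as tmp:
        tmp.write(data)
        tmp_name = tmp.name
    try:
        os.chmod(tmp_name, mode)
        os.replace(tmp_name, path)  # atomic on POSIX when on one filesystem
    except BaseException:
        os.unlink(tmp_name)  # do not leave the temporary file behind
        raise


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "known_hosts")
    atomic_write(target, b"example-host ssh-ed25519 AAAA...\n")
    print(open(target, "rb").read())
```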

Contributor Author

It seems this loads the entire known hosts file, adds the inventory hosts, then saves it to disk. That needs to be done in a single transaction; that is, the lock is there to prevent multiple concurrent executions of this section of the code, which seems fine. Though I have no idea whether the same procedure is followed by other apps (e.g. by the ssh command), so it might not protect against those modifying the known hosts file at the same time.

Indeed, at first glance, the last part seems to implement something similar to AnsibleModule.atomic_move.

This was introduced by @mietzen in #8424.

@mietzen, maybe you can tell us more about this?

Contributor

It seems this loads the entire known hosts file, adds the inventory hosts, then saves it to disk. That needs to be done in a single transaction; that is, the lock is there to prevent multiple concurrent executions of this section of the code, which seems fine.

Yes, this is what it should do.

Though I have no idea whether the same procedure is followed by other apps (e.g. by the ssh command), so it might not protect against those modifying the known hosts file at the same time.

Essentially, I just refactored the close method of paramiko_ssh.

Indeed, at first glance, the last part seems to implement something similar to AnsibleModule.atomic_move.

I didn't know about AnsibleModule.atomic_move at the time I wrote that code.

@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from 02c2e8a to d3bb34c Compare February 23, 2025 12:05
@ansibullbot ansibullbot added ci_verified Push fixes to PR branch to re-run CI needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR labels Feb 23, 2025
@rgl
Contributor Author

rgl commented Feb 23, 2025

@russoz thank you for the review! I've applied the suggested changes to the code that is wsl specific. The other ones, I think, should first be applied to the existing community.general.proxmox_pct_remote connection plugin from which this plugin is derived. I'm also not sure how to share code among the two and the original paramiko ssh plugin (this one seems to have changes that were not propagated to proxmox_pct_remote, and vice-versa).

what do you think?

Collaborator

@felixfontein felixfontein left a comment

Thanks for your contribution! I'm wondering a bit whether this plugin would be more appropriate to be added to community.windows, mostly because community.windows does have test infrastructure with Windows remotes (community.general does not and does not plan to add any). (I don't know whether their Windows remotes do have WSL installed, though.)

CC @jborean93 @briantist since they know community.windows a lot better.

@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from d3bb34c to 923da85 Compare February 23, 2025 12:48
@ansibullbot
Collaborator

The test ansible-test sanity --test yamllint [explain] failed with 1 error:

plugins/connection/wsl.py:267:1: unparsable-with-libyaml: expected a single document in the stream - but found another document

click here for bot help

@jborean93
Contributor

jborean93 commented Feb 23, 2025

Thanks for your contribution! I'm wondering a bit whether this plugin would be more appropriate to be added to community.windows

This would not be accepted there for a few reasons:

  • We are trying to cut back on the community.windows collection and are not accepting any new contributions.
  • The collection is designed to target Windows hosts. While this is Windows related, the ultimate target is the WSL distribution and Windows is just the middle man.
  • Testing would be a nightmare: WSL would have to be set up on the CI hosts and the distributions configured.

Maybe it would be better to look into proxy command support to proxy the initial Windows SSH connection to the WSL instance and take advantage of the builtin ssh connection plugin.

Personally, I'm not sure it should even be in this collection; it may be better suited to living in your own collection, but I'll leave that for the maintainers here to decide.

@russoz
Collaborator

russoz commented Feb 23, 2025

I am far from being knowledgeable in Windows, so forgive me if I'm saying something silly here. But thinking about this again, I cannot help noticing: it looks like the plugin connects to an sshd that is NOT running in WSL, hence it needs to connect, start WSL, and then run whatever needs running in there.

Wouldn't it be easier to start the ssh service within the WSL (which is an Ubuntu or some other distro) directly? Then it would be a plain old Linux to Linux ssh connection. A quick googling found some posts indicating that is possible but not entirely straightforward, so YMMV.

@felixfontein
Collaborator

Wouldn't it be easier to start the ssh service within the WSL (which is an Ubuntu or some other distro) directly? Then it would be a plain old Linux to Linux ssh connection. A quick googling found some posts indicating that is possible but not entirely straightforward, so YMMV.

It would be even simpler to install Linux right away instead of using WSL in Windows. But in most cases there are some external requirements not under your control that force you to use a specific setup, and I guess there will be some folks who need SSH to run under Windows itself, outside WSL, and who'd like Ansible to connect to WSL inside their Windows. They will have to jump through this extra hoop.

From that POV I'm ok with merging this here, since this seems like something that is useful for parts of the community.

@rgl
Contributor Author

rgl commented Feb 24, 2025

I also tried installing the OpenSSH daemon inside the WSL distribution, but that has more moving parts and has a major caveat for me: the WSL distribution shuts down after a while (although you can configure that, which is one more thing to look after), and once it is shut down, you cannot reach the sshd to manage it. The only thing that is always up is the Windows host itself, hence this PR.

If the CI host has all the Windows features that WSL requires (no idea if that works in a GitHub-hosted runner; maybe not, as it requires nested virtualization), creating an Ubuntu distribution is quite straightforward (though it requires downloading a large rootfs tarball when it is not cached somewhere).

@felixfontein
Collaborator

Regarding the failing sanity checks: you need to add

plugins/connection/wsl.py yamllint:unparsable-with-libyaml

to tests/sanity/ignore-2.15.txt and tests/sanity/ignore-2.16.txt (not to the others!). Please add it in the lexicographically correct position.
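
One way to find the lexicographically correct position is sketched below, assuming the existing entries are already sorted by plain string comparison. The helper name `insert_sorted` and the neighbouring entries in the demo are hypothetical.

```python
import bisect
from typing import List


def insert_sorted(lines: List[str], entry: str) -> List[str]:
    """Return a copy of `lines` with `entry` added at its sorted position."""
    result = list(lines)
    if entry not in result:  # avoid adding a duplicate ignore entry
        bisect.insort(result, entry)
    return result


if __name__ == "__main__":
    # The two existing entries are made up for the demonstration.
    existing = [
        "plugins/connection/aaa.py some-test:some-code",
        "plugins/modules/zzz.py some-test:some-code",
    ]
    entry = "plugins/connection/wsl.py yamllint:unparsable-with-libyaml"
    for line in insert_sorted(existing, entry):
        print(line)
```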

@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from 923da85 to 76acdd0 Compare February 24, 2025 22:28
@ansibullbot ansibullbot added tests tests and removed ci_verified Push fixes to PR branch to re-run CI needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR labels Feb 24, 2025
@russoz
Collaborator

russoz commented Mar 1, 2025

Hi @rgl, thanks for the updates made so far. Please note that there are still a number of comments in the code, and there should be some testing associated with new plugins. Writing integration tests for connection plugins is certainly harder, so I would suggest you take a look at the existing tests for other connection plugins and try to mimic/adapt their logic for your plugin.

@ansibullbot ansibullbot added the module module label Apr 14, 2025
@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from 3100463 to a3634bd Compare April 14, 2025 20:29
@felixfontein
Collaborator

For some reason the diff for this PR now contains two changes that are already part of main. Maybe something went wrong during rebasing?

@ansibullbot ansibullbot removed the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR label Apr 14, 2025
@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from a3634bd to 07e24ed Compare April 14, 2025 20:49
@rgl
Contributor Author

rgl commented Apr 14, 2025

I've rebased this branch over the current main branch.

@ansibullbot ansibullbot removed the module module label Apr 14, 2025
@felixfontein
Collaborator

I think this looks good enough for a first version. If nobody objects, I'll merge this on the upcoming weekend.

Collaborator

@russoz russoz left a comment

hi @rgl

A couple of things to tend to, but we're getting there. :-)

@rgl rgl force-pushed the rgl-add-wsl-connection-plugin branch from 07e24ed to 92b6219 Compare April 17, 2025 17:32
Collaborator

@russoz russoz left a comment

LGTM

@felixfontein felixfontein added backport-10 Automatically create a backport for the stable-10 branch and removed check-before-release PR will be looked at again shortly before release and merged if possible. labels Apr 19, 2025
@felixfontein felixfontein merged commit 96b4930 into ansible-collections:main Apr 19, 2025
141 checks passed

patchback bot commented Apr 19, 2025

Backport to stable-10: 💚 backport PR created

✅ Backport PR branch: patchback/backports/stable-10/96b493002ceb892cf05a04a99bac84b2ad947432/pr-9795

Backported as #10019

🤖 @patchback
I'm built with octomachinery and
my source is open — https://github.com/sanitizers/patchback-github-app.

patchback bot pushed a commit that referenced this pull request Apr 19, 2025
* add the wsl connection plugin

* move the banner_timeout required paramiko version to its own line

* document the proxy_command required paramiko version

* document the timeout required paramiko version

* simplify the sending of the become_pass value

* add Connection.__init__ type hints

* add MyAddPolicy.missing_host_key type hints

* normalize the Connection._parse_proxy_command replacers dict values to the str type

* add the user_known_hosts_file option

* modify the private_key_file option type to path

(cherry picked from commit 96b4930)
@felixfontein
Collaborator

@rgl thanks for your contribution!
@russoz @mietzen @jborean93 thanks for your comments and reviews!

felixfontein pushed a commit that referenced this pull request Apr 19, 2025
…#10019)

add the wsl connection plugin (#9795)

Co-authored-by: Rui Lopes <[email protected]>
Labels
backport-10 Automatically create a backport for the stable-10 branch connection connection plugin integration tests/integration new_contributor Help guide this first time contributor new_plugin New plugin plugins plugin (any type) tests tests unit tests/unit
6 participants