[v0.1.8] Issue running the command mila init #176

@MatthewWiens101

Description

Steps to reproduce

conda create -y -n mila python=3.9
conda activate mila
pip install milatools --upgrade

What command did you run?

mila init

Describe the bug

Checking ssh config
Fixed the permissions on ssh directory at <omit>\.ssh to 700
Fixing permissions on <omit>\.ssh\config to 600
? Do you also have an account on the ComputeCanada/DRAC clusters? No
Did not change ssh config
Checking passwordless authentication
Checking if passwordless SSH access is setup for the mila cluster.
identity_file ~/.ssh/id_rsa
Traceback (most recent call last):
  File "<omit>\Anaconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 89, in main
    mila()
  File "<omit>\Anaconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 145, in mila
    return function(**args_dict)
  File "<omit>\Anaconda3\envs\mila\lib\site-packages\milatools\cli\commands.py", line 522, in init
    success = setup_passwordless_ssh_access(ssh_config=ssh_config)
  File "<omit>\Anaconda3\envs\mila\lib\site-packages\milatools\cli\init_command.py", line 267, in setup_passwordless_ssh_access
    success = setup_passwordless_ssh_access_to_cluster("mila")
  File "<omit>\Anaconda3\envs\mila\lib\site-packages\milatools\cli\init_command.py", line 326, in setup_passwordless_ssh_access_to_cluster
    assert ssh_public_key_path.exists()
AssertionError

An error occurred during the execution of the command `init`. Please try updating milatools by running
  pip install milatools --upgrade
in the terminal. If the issue persists, consider filling a bug report at
  https://github.com/mila-iqia/milatools/issues/new?labels=init%2C0.1.8&template=bug_report.md&title=%5Bv0.1.8%5D+Issue+running+the+command+%60mila+init%60
Please provide the error traceback with the report (the red text above).

Desktop (please complete the following information):

  • OS: Windows 11
  • Shell: Powershell

Additional context

Appears to be identical to #108, #140, #155, #167, #169. A quick look at the code shows that the steps to find the ssh key make several assumptions:

    config = SSHConfig.from_path(str(SSH_CONFIG_FILE))
    identity_file = config.lookup(cluster).get("identityfile", "~/.ssh/id_rsa")
    # Seems to be a list for some reason?
    if isinstance(identity_file, list):
        assert identity_file
        identity_file = identity_file[0]
    ssh_private_key_path = Path(identity_file).expanduser()
    ssh_public_key_path = ssh_private_key_path.with_suffix(".pub")
    assert ssh_public_key_path.exists()

First, it looks for an identityfile entry in the ssh config file, even though I don't believe any prior step sets one up, so this is only a check. Second, if no entry is found, it assumes that a file named ~/.ssh/id_rsa not only already exists, but is also the correct key for mila. In my case, both of these assumptions were incorrect.
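To see which file the assertion is actually checking, here is a minimal sketch that repeats the path derivation from the snippet above using only pathlib. The helper names derive_public_key_path and missing_key_files are hypothetical and not part of milatools. The default argument mirrors the ~/.ssh/id_rsa fallback in the quoted code.

```python
from pathlib import Path


def derive_public_key_path(identity_file: str) -> Path:
    # Same derivation as the quoted snippet: expand "~", then set the suffix to ".pub".
    return Path(identity_file).expanduser().with_suffix(".pub")


def missing_key_files(identity_file: str = "~/.ssh/id_rsa") -> list[str]:
    # Hypothetical helper: list whichever of the two key files is absent.
    private_key = Path(identity_file).expanduser()
    public_key = derive_public_key_path(identity_file)
    return [str(path) for path in (private_key, public_key) if not path.exists()]


if __name__ == "__main__":
    # With no argument this checks the same fallback path the quoted code uses.
    for path in missing_key_files():
        print(f"missing: {path}")
```

Running this on the affected machine should show whether the fallback private key, its .pub counterpart, or both are absent, which the bare AssertionError does not reveal.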

Proposed solution

I recommend adding a section to setup_ssh_config which prompts the user for the path to their private key file, checks that both the private and public keys exist, and then adds this information to the ssh config file for the relevant hosts. Alternatively, since an incorrect key could be provided and written to the ssh config file, it may make more sense to have the user specify the key location in setup_passwordless_ssh_access_to_cluster, confirm that the key exists and works properly, and only then add it to the ssh config file.
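The validation part of this proposal could look roughly like the sketch below. It is not milatools code: check_key_pair and identity_file_line are hypothetical names, the prompt itself is left out, and it only checks that the files exist, not that the key is accepted by the cluster. It also assumes the public key is named by appending ".pub" to the full private key file name (the ssh-keygen default), which differs from the with_suffix call in the current code for key names that contain a dot.

```python
from pathlib import Path


def check_key_pair(private_key_text: str) -> Path:
    # Hypothetical helper: fail with a readable message instead of a bare assert.
    private_key = Path(private_key_text).expanduser()
    # Assumption: the public key sits next to the private key as "<name>.pub".
    public_key = private_key.with_name(private_key.name + ".pub")
    if not private_key.is_file():
        raise FileNotFoundError(f"Private key not found: {private_key}")
    if not public_key.is_file():
        raise FileNotFoundError(f"Public key not found: {public_key}")
    return private_key


def identity_file_line(private_key: Path) -> str:
    # Format the entry for a Host block; quote the path if it contains spaces.
    text = str(private_key)
    if " " in text:
        text = f'"{text}"'
    return f"    IdentityFile {text}"
```

A caller would pass the user's answer to check_key_pair, report any FileNotFoundError back to the user, and write the line from identity_file_line into the ssh config only after the check (and ideally a test connection) succeeds.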

Temporary Solution

For users encountering this error, I was able to bypass it by adding a line to my .ssh/config file under the mila host entry generated by milatools using the following format:

Host mila
    ...
    IdentityFile <path-to-private-key-here>
