Failure to attach a new public IP address with deploy --check after stop #1010

Open
@amemni

Description

Steps to reproduce:
Starting from a deployment with a basic network specification like this:

{
 ...
}:
{
  machine = { config, pkgs, lib, ... }:
  {
    deployment.targetEnv = "gce";
    deployment.gce.project = "xxx";
    deployment.gce.serviceAccount = "[email protected]";
    deployment.gce.accessKey = "/home/deploy-ops/global_creds/xxx/gce.pem";
    deployment.gce.network = "xxx";
    deployment.gce.region = "us-east1-b";
    deployment.gce.subnet = "xxx-us-east1";
    deployment.gce.tags = [ "xxx" ];
    deployment.gce.instanceType = "n1-standard-1";
    deployment.gce.rootDiskSize = 100;
    deployment.gce.scheduling.preemptible = true;
    deployment.gce.scheduling.automaticRestart = false;
    deployment.gce.scheduling.onHostMaintenance = "TERMINATE";
  };
}

Then run a stop followed by a deploy --check:

$ nixops stop -d gce-testing-2
machine..> stopping GCE machine... stopped

The deploy --check then results in this behavior:

$ nixops deploy -d gce-testing-2 --check
bootstrap> warning: GCE image 'n-86682f85be6c11e886880a29f7962a18-bootstrap' description has changed to ''; expected it to be 'None'; cannot fix this automatically
machine..> warning: GCE machine 'n-86682f85be6c11e886880a29f7962a18-machine' preemptible has changed to 'True'; expected it to be 'None'; cannot fix this automatically
machine..> warning: GCE machine 'n-86682f85be6c11e886880a29f7962a18-machine' public IP address has changed to 'None'; expected it to be '35.231.68.238'
machine..> attaching public IP address [Ephemeral]
error: {'domain': 'global', 'message': 'At most one access config currently supported.', 'reason': 'badRequest'}

This failure seems to be caused by the previous access config not being detached from the instance first. Manually detaching it:

amemni:~$ gcloud compute instances delete-access-config n-86682f85be6c11e886880a29f7962a18-machine --access-config-name "External NAT" --project xxx
Updated [https://www.googleapis.com/compute/v1/projects/xxx/zones/us-east1-b/instances/n-86682f85be6c11e886880a29f7962a18-machine].

Running deploy --check again then results in this:

$  nixops deploy -d gce-testing-2 --check
bootstrap> warning: GCE image 'n-86682f85be6c11e886880a29f7962a18-bootstrap' description has changed to ''; expected it to be 'None'; cannot fix this automatically
machine..> warning: GCE machine 'n-86682f85be6c11e886880a29f7962a18-machine' preemptible has changed to 'True'; expected it to be 'None'; cannot fix this automatically
machine..> attaching public IP address [Ephemeral]
machine..> got public IP: None
error: unsupported operand type(s) for +: 'NoneType' and 'str'

Environment details:
NixOps version: 1.6.1
Nix version: 2.1
NixOS/Nixpkgs version: 17.09

I believe the discrepancy arises because self.public_ipv4 is being nulled, possibly because self.connect().create_node(..) is not executed again (recreate = False), so this block is not executed:

self.log("detaching old public IP address {0}".format(self.public_ipv4))
self.connect().connection.async_request(
    "/zones/{0}/instances/{1}/deleteAccessConfig?accessConfig=External+NAT&networkInterface=nic0"
    .format(self.region, self.machine_name), method = 'POST')
self.public_ipv4 = None
self.ipAddress = None
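
To illustrate the ordering I would expect, here is a minimal sketch that detaches any leftover access config before attaching a new one. It does not use the NixOps or libcloud code: FakeInstance, AccessConfigError and attach_public_ip are hypothetical names, and the in-memory model only imitates the "At most one access config" constraint seen in the error above.

```python
class AccessConfigError(Exception):
    """Stands in for the 'badRequest' error returned by the GCE API."""


class FakeInstance:
    """Hypothetical in-memory model of a single network interface (nic0)."""

    def __init__(self, access_config=None):
        # Name of the attached access config, or None when nothing is attached.
        self.access_config = access_config

    def add_access_config(self, name):
        # Imitates the constraint from the error message in this report.
        if self.access_config is not None:
            raise AccessConfigError(
                "At most one access config currently supported.")
        self.access_config = name

    def delete_access_config(self):
        self.access_config = None


def attach_public_ip(instance, name="External NAT"):
    """Detach any leftover access config, then attach a fresh one."""
    if instance.access_config is not None:
        instance.delete_access_config()
    instance.add_access_config(name)
    return instance.access_config


# A stopped machine that still carries its old access config.
stale = FakeInstance(access_config="External NAT")
attach_public_ip(stale)  # succeeds because the old config is removed first
```

Calling add_access_config directly on the stale instance raises the same kind of error as the first deploy --check run, which is why the detach step would need to run regardless of whether the node was recreated.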

Also, having self.public_ipv4 set to None breaks the known_hosts update here:

self.log("got public IP: {0}".format(self.public_ipv4))
known_hosts.add(self.public_ipv4, self.public_host_key)
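
As a rough sketch of how the second error can arise and be guarded against: I am assuming that known_hosts.add concatenates the IP with a string, which would match the "'NoneType' and 'str'" message. The functions add_known_host and add_known_host_checked below are hypothetical stand-ins, not the NixOps implementation, and "<host key>" is a placeholder.

```python
def add_known_host(ip, host_key):
    """Hypothetical stand-in for known_hosts.add: builds a known_hosts line."""
    # Assumption: the real code concatenates the IP with a string like this.
    return ip + " " + host_key


def add_known_host_checked(ip, host_key):
    """Same as above, but fails with a readable message when no IP is set."""
    if ip is None:
        raise ValueError(
            "no public IP address assigned; cannot update known_hosts")
    return add_known_host(ip, host_key)


# Reproduce the failure mode from the second deploy --check run.
try:
    add_known_host(None, "<host key>")
except TypeError as exc:
    unguarded_error = str(exc)
```

With a check like this, the deploy would stop with a message that points at the missing IP instead of a TypeError from deeper in the call.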
