Description
Describe the bug
MountTmpfsAtTemp=false doesn't always seem to take effect. A large proportion of instances come up with tmpfs still mounted at /tmp, and then get terminated.
This causes a lot of instance churn on scale-up.
I suspect there is something else running on startup that writes to the /tmp dir at around the same time that bk-mount-instance-storage.sh is run, blocking the unmount operation and causing it to fail. Is there anything in the AMI that might do that?
Note: it's entirely possible that something in our custom AMI, which we build on top of the Buildkite-provided one, is responsible for this. If you can't reproduce it, that's still useful information, as it tells us we need to audit our startup processes.
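A minimal sketch (not from the original report) of one way to audit what touches /tmp early in boot, assuming auditd is present on the image and using an arbitrary rule key:

# Watch /tmp for writes and attribute changes; "tmp-writes" is just a label.
auditctl -w /tmp -p wa -k tmp-writes
# After the failure reproduces, list the processes that triggered the rule:
ausearch -k tmp-writes -i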
Steps To Reproduce
Steps to reproduce the behavior:
- Create an elastic-ci stack on version v6.21.0
- Set MountTmpfsAtTemp to false
- Wait for the Auto Scaling group to come online
- Observe some instances transition to an InService state, but get terminated after ~1 minute (one way to watch for this is sketched after this list)
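One way (not part of the original report) to watch for this churn is to tail the Auto Scaling activity history; the group name below is a placeholder for the one created by the stack:

# List recent scaling activities, which include launch/terminate reasons.
aws autoscaling describe-scaling-activities \
  --auto-scaling-group-name <elastic-ci-asg-name> \
  --max-items 20 \
  --region us-east-1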
Expected behavior
When MountTmpfsAtTemp is set to false, the instance runs systemctl mask --now tmp.mount on startup, which correctly unmounts tmpfs from the /tmp directory. We observe this on a small number of the instances that come up:
[yuchuanyuan@ip-10-0-102-149 ~]$ sudo cat /var/log/elastic-stack.log
Starting /usr/local/bin/bk-mount-instance-storage.sh...
Disabling automatic mount of tmpfs at /tmp
Created symlink /etc/systemd/system/tmp.mount → /dev/null.
Mounting instance storage...
No NVMe drives to mount.
Please check that your instance type supports instance storage.
<truncated>
[yuchuanyuan@ip-10-0-102-149 ~]$ systemctl status tmp.mount
○ tmp.mount
Loaded: masked (Reason: Unit tmp.mount is masked.)
Active: inactive (dead) since Tue 2024-06-11 20:02:24 UTC; 1h 18min ago
Duration: 8.556s
CPU: 7ms
Jun 11 20:02:24 ip-10-0-102-149.ec2.internal systemd[1]: Unmounting tmp.mount - /tmp...
Jun 11 20:02:24 ip-10-0-102-149.ec2.internal systemd[1]: tmp.mount: Deactivated successfully.
Jun 11 20:02:24 ip-10-0-102-149.ec2.internal systemd[1]: Unmounted tmp.mount - /tmp.
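For reference, a quick check of the healthy state (a sketch, not output captured in this report):

findmnt /tmp                     # prints nothing once tmpfs is unmounted
systemctl is-enabled tmp.mount   # should report "masked"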
Actual behaviour
On a large number of instances, we instead see the following:
[yuchuanyuan@ip-10-0-114-131 ~]$ sudo cat /var/log/elastic-stack.log
Starting /usr/local/bin/bk-mount-instance-storage.sh...
Disabling automatic mount of tmpfs at /tmp
Created symlink /etc/systemd/system/tmp.mount → /dev/null.
Job failed. See "journalctl -xe" for details.
/usr/local/bin/bk-mount-instance-storage.sh errored with exit code 1 on line 33.
Starting /usr/local/bin/bk-configure-docker.sh...
Sourcing /usr/local/lib/bk-configure-docker.sh...
<truncated>
[yuchuanyuan@ip-10-0-114-131 ~]$ systemctl status tmp.mount
● tmp.mount - /tmp
Loaded: masked (Reason: Unit tmp.mount is masked.)
Active: active (mounted) (Result: exit-code) since Tue 2024-06-11 20:35:25 UTC; 24s ago
Where: /tmp
What: tmpfs
Tasks: 0 (limit: 9247)
Memory: 44.0K
CPU: 6ms
CGroup: /system.slice/tmp.mount
Jun 11 20:35:25 ip-10-0-114-131.ec2.internal systemd[1]: Unmounting tmp.mount - /tmp...
Jun 11 20:35:25 ip-10-0-114-131.ec2.internal umount[1923]: umount: /tmp: target is busy.
Jun 11 20:35:25 ip-10-0-114-131.ec2.internal systemd[1]: tmp.mount: Mount process exited, code=exited, status=32/n/a
Jun 11 20:35:25 ip-10-0-114-131.ec2.internal systemd[1]: Failed unmounting tmp.mount - /tmp.
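The "target is busy" error means some process still holds a file, working directory, or mount under /tmp at the moment the unmount runs. A diagnostic sketch for catching the culprit on an affected instance, assuming the psmisc/lsof tools are installed:

# Either command should name the processes keeping /tmp busy.
fuser -vm /tmp
lsof +D /tmp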
Stack parameters:
- AWS Region: us-east-1
- Version: v6.21.0
- AMI: built atop ami-04ca34320055d861c
- Instance types: m5.large, m5a.large, m5a.xlarge, m6a.xlarge
Activity
DrJosh9000 commented on Jun 12, 2024
Hey @yyc, I haven't reproduced yet, and I'm not sure what, if anything, we would be running that is using /tmp concurrently with stack setup. However, I have an idea for a workaround in #1327. WDYT?
yyc commented on Jun 17, 2024
@DrJosh9000 Yes I've tested that and it looks like it works! Thanks for the quick fix :)