Symptom
Every 12h, WordPress sends a "Background Update Failed" email like:
```
FAILED: WordPress failed to update to WordPress 7.0-RC2-62287
Updating to WordPress 7.0-RC2-62287
...
Copying the required files...
Could not copy file. twentytwentyfive/package-lock.json
Installation failed. twentytwentyfive/package-lock.json
Error: [copy_failed_copy_dir_themes] Could not copy file.
twentytwentyfive/package-lock.json
```
On extrachill.com (running on the wp-coding-agents harness), this has been a recurring problem, and it was not the known root-ownership symptom from #93: file ownership was correct (`opencode:www-data`, with `opencode` in the `www-data` supplementary group). Yet the auto-update path, running as `www-data`, couldn't overwrite the file.
Root cause
The harness sets up the cooperative-write convention (opencode in www-data group, `chmod -R g+w` at setup) but does not enforce a group-friendly umask on the long-running services that write into `$SITE_PATH`.
`wp-config.php` declares the standard:
```php
define('FS_CHMOD_DIR', (0775 & ~ umask()));
define('FS_CHMOD_FILE', (0664 & ~ umask()));
```
But `umask()` is read from the running process. On a stock Debian/Ubuntu install, systemd services run with umask `0022`, so:
- `FS_CHMOD_FILE` becomes `0664 & ~022 = 0644` (group write stripped)
- `FS_CHMOD_DIR` becomes `0775 & ~022 = 0755` (group write stripped)
Net effect: every file PHP-FPM (or a coding-agent process) creates lands at mode 0644, silently undoing the `g+w` the harness applied at setup. Once a file is 0644 and owned by `opencode`, the auto-update path (running as `www-data` via `wp-cron.php` over HTTPS) is denied write access (`EACCES`) even though `www-data` is the file's group, because the group write bit is not set.
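The masking arithmetic above can be checked directly in a shell. This is a minimal sketch; `masked_mode` is a hypothetical helper written for this issue, not something the harness ships.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the "BASE & ~umask()" expression from wp-config.php.
# masked_mode is a hypothetical helper, not part of the harness.

# Print BASE & ~UMASK as a four-digit octal mode.
# Both arguments must be octal literals with a leading zero (e.g. 0664).
masked_mode() {
  printf '%04o\n' "$(( $1 & ~$2 ))"
}

masked_mode 0664 0022   # FS_CHMOD_FILE under umask 0022 -> 0644
masked_mode 0775 0022   # FS_CHMOD_DIR  under umask 0022 -> 0755
masked_mode 0664 0002   # FS_CHMOD_FILE under umask 0002 -> 0664
masked_mode 0775 0002   # FS_CHMOD_DIR  under umask 0002 -> 0775
```

With umask `0002` both constants keep their group write bit, which is the behavior the cooperative-write convention relies on.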
WordPress auto-update is just the canary. Anything that flips between www-data and opencode as the writer (cron, wp-cli from the kimaki agent, plugin-installer, etc.) eventually hits this.
Diagnostic on extrachill.com
```
$ ls -la /var/www/extrachill.com/wp-content/themes/twentytwentyfive/package-lock.json
-rw-r--r-- 1 opencode www-data 54180 May 1 19:05 package-lock.json
^^^ owner can write, group CANNOT, even though group is www-data and we want it to
$ id www-data
uid=33(www-data) gid=33(www-data) groups=33(www-data)
$ id opencode
uid=1000(opencode) gid=1000(opencode) groups=1000(opencode),4(adm),33(www-data)
$ getent group www-data
www-data:x:33:opencode
$ cat /proc/$(pgrep -f "php-fpm: pool" | head -1)/status | grep -i umask
Umask: 0022
$ find /var/www/extrachill.com/wp-content/themes -type f ! -perm -g+w | wc -l
1508
$ find /var/www/extrachill.com/wp-content/plugins -type f ! -perm -g+w | wc -l
534
```
So at least 2042 files (1508 + 534) lack the group write bit in directories that the harness explicitly made group-writable at setup.
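The two `find` counts above could be wrapped in a small reusable check. The sketch below assumes nothing about the harness; `count_missing_group_write` is a hypothetical name, and the demo runs against a scratch directory rather than a real site.

```shell
#!/usr/bin/env bash
# count_missing_group_write is a hypothetical helper for illustration.

# Print the number of regular files under DIR that lack the group write bit.
count_missing_group_write() {
  find "$1" -type f ! -perm -g+w | wc -l | tr -d ' '
}

# Demo on a scratch directory so the sketch is safe to run anywhere.
demo_dir=$(mktemp -d)
touch "$demo_dir/masked.txt" "$demo_dir/shared.txt"
chmod 0644 "$demo_dir/masked.txt"   # what a umask-0022 writer leaves behind
chmod 0664 "$demo_dir/shared.txt"   # what the harness convention expects
count_missing_group_write "$demo_dir"   # prints 1
rm -rf "$demo_dir"
```

On a real install the argument would be something like `"$SITE_PATH/wp-content"`; a non-zero count after the fix lands would indicate files that still need the repair pass.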
What's missing in the harness
`lib/infrastructure.sh` does:
```
useradd -m -s /bin/bash -G www-data "$SERVICE_USER"
chmod -R g+w "$SITE_PATH"
chown -R www-data:www-data "$SITE_PATH"
```
…but never:
1. Sets `UMask=0002` on the php-fpm systemd unit (so files PHP-FPM creates inherit `0664` / `0775`).
2. Sets `UMask=0002` on the kimaki systemd unit (so any agent-spawned write, whether from wp-cli, MCP, or file abilities, also inherits `0664` / `0775`).
Without (1), WP auto-updates and any plugin that writes during a web request silently produce 0644 files.
Without (2), every wp-cli or filesystem write from a coding-agent session does the same.
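To confirm which umask a given service is actually running with, the `Umask:` field of `/proc/PID/status` can be read the same way the diagnostic above does. This is a sketch: `umask_from_status` is a hypothetical helper, and it assumes the kernel exposes that field (older kernels do not).

```shell
#!/usr/bin/env bash
# umask_from_status is a hypothetical helper for illustration.
# Assumption: the status file contains a "Umask:" line.

# Print the umask recorded in a /proc/PID/status style file.
umask_from_status() {
  awk '$1 == "Umask:" { print $2 }' "$1"
}

# Example: report the umask of the current shell, if /proc exposes it.
if [ -r "/proc/$$/status" ]; then
  umask_from_status "/proc/$$/status"
fi

# Against a service, the path would be built from its PID, for example:
#   umask_from_status "/proc/$(pgrep -f 'php-fpm: pool' | head -1)/status"
```

Running it against both the php-fpm pool and the kimaki process would show whether (1), (2), or both are still missing on a given install.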
Proposed fix
For VPS installs (the scope where the harness manages systemd):
- `bridges/kimaki.sh` (kimaki systemd unit template): add `UMask=0002` to `[Service]`.
- `lib/infrastructure.sh` (PHP-FPM provisioning): drop a `/etc/systemd/system/php${PHP_VERSION}-fpm.service.d/umask.conf` with:
  ```
  [Service]
  UMask=0002
  ```
  Then `systemctl daemon-reload && systemctl restart php${PHP_VERSION}-fpm`.
- After applying the drop-in, the harness should run a one-time `find "$SITE_PATH" -type f ! -perm -g+w -exec chmod g+w {} +` and the equivalent for dirs, to repair any files already stamped at 0644 by the previous umask-0022 runtime. Otherwise auto-updates keep failing on those files even after the umask change takes effect.
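The one-time repair pass could look roughly like the following. This is a sketch under assumptions: `repair_group_write` is a hypothetical name, and on a real site it would need to run as root (or as the owner of every affected file), since only the owner or root may `chmod`.

```shell
#!/usr/bin/env bash
# repair_group_write is a hypothetical helper, not existing harness code.

# Add the group write bit to every file and directory under DIR that lacks it.
# Entries that already have the bit are skipped, so a second run is a no-op.
repair_group_write() {
  find "$1" \( -type f -o -type d \) ! -perm -g+w -exec chmod g+w {} +
}

# Intended call on an install (not executed here):
#   repair_group_write "$SITE_PATH"
```

Handling files and directories in one `find` avoids a second tree walk; `chmod g+w` only adds a bit, so other permission bits set by the site owner are left alone.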
`upgrade.sh` should idempotently apply the drop-in and the repair pass via `_smart_update_systemd_unit`-style logic so existing installs pick up the fix without re-running setup.
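One possible shape for the idempotent drop-in step is sketched below. `write_umask_dropin` is a hypothetical helper, not the harness's `_smart_update_systemd_unit`, whose real interface is not shown here; the drop-in directory is passed in so the function can be tried against a scratch directory first.

```shell
#!/usr/bin/env bash
# write_umask_dropin is a hypothetical helper for illustration only.

# Ensure DROPIN_DIR/umask.conf contains the UMask=0002 override.
# Returns 0 if the file was written, 1 if it was already up to date,
# so the caller can skip the service restart when nothing changed.
write_umask_dropin() {
  local dropin_dir=$1
  local target="$dropin_dir/umask.conf"
  local wanted
  wanted=$(printf '[Service]\nUMask=0002')

  if [ -f "$target" ] && [ "$(cat "$target")" = "$wanted" ]; then
    return 1
  fi

  mkdir -p "$dropin_dir"
  printf '%s\n' "$wanted" > "$target"
  return 0
}

# Intended call from upgrade.sh (not executed here; requires root):
#   if write_umask_dropin "/etc/systemd/system/php${PHP_VERSION}-fpm.service.d"; then
#     systemctl daemon-reload && systemctl restart "php${PHP_VERSION}-fpm"
#   fi
```

Restarting only when the file actually changed keeps repeated `upgrade.sh` runs from bouncing PHP-FPM unnecessarily.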
Why this is harness-level, not wp-config / per-site
- `wp-config.php` already declares the canonical `0664/0775` constants. The harness owns the runtime where those constants get masked, so the harness owns the umask.
- Asking site owners to chmod manually after every auto-update is the workaround we keep falling back on, and it is exactly what RULES.md says not to do.
Related
Acceptance criteria
- Fresh `setup.sh` on a clean VPS produces a php-fpm service with `Umask: 0002` (verifiable via `/proc/$(pgrep php-fpm | head -1)/status`).
- A file written by PHP-FPM into `$SITE_PATH` lands at mode 0664 with group www-data, group-writable.
- WordPress auto-update from one nightly to the next succeeds on a site whose theme files were last touched by `opencode`.
- `upgrade.sh` on an existing install applies the drop-in without re-running full setup, and runs the one-time repair pass.