syncdb: uncompressed transfer ~5.5x slower than necessary + dangerous cache:rebuild in db-post-import hook #46

@tormi

Summary

ddev syncdb transfers database dumps uncompressed over SSH, making it ~5.5x slower than necessary. For a ~629MB database, the current approach takes ~20 minutes vs ~3.5 minutes with gzip compression.

Additionally, db-post-import.sh has a dangerous cache:rebuild placement that can cause fatal errors during breaking schema upgrades.

Benchmark results

| Step | Current (uncompressed) | With --gzip |
| --- | --- | --- |
| SSH dump + transfer | ~18 min (629MB raw) | ~1 min 53s (65MB gz) |
| ddev import-db | ~1 min 12s | ~1 min 18s |
| Post-import hooks | ~45s | ~45s |
| Total | ~20 min | ~3 min 38s |

Raw timing:

# Current
ddev syncdb main --keep-dump  7.91s user 7.03s system 1% cpu 20:00.31 total

# Gzipped dump transfer only
ssh ... "drush sql-dump --gzip ..." > dump.sql.gz  0.62s user 0.29s system 0% cpu 1:53.05 total

# Import compressed dump
ddev import-db --file=dump.sql.gz  1.60s user 1.21s system 2% cpu 1:45.03 total

Tested on DDEV v1.25.1, macOS arm64, Docker Desktop.
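
The ~5.5x figure can be reproduced from the raw timings above. A small sketch, assuming the gzip total is the compressed transfer plus the compressed import (the helper name to_seconds is illustrative, and fractional seconds are dropped):

```shell
# Convert an "M:SS" wall-clock string to whole seconds.
to_seconds() {
  min=${1%%:*}
  sec=${1##*:}
  # Strip one leading zero so values like "08" are not read as octal.
  min=${min#0}
  sec=${sec#0}
  echo $(( ${min:-0} * 60 + ${sec:-0} ))
}

current=$(to_seconds "20:00")                                # full uncompressed run
gzipped=$(( $(to_seconds "1:53") + $(to_seconds "1:45") ))   # gzip transfer + import
# Integer math only: scale by 10 to keep one decimal place.
ratio_x10=$(( current * 10 / gzipped ))
echo "speedup: $(( ratio_x10 / 10 )).$(( ratio_x10 % 10 ))x (${current}s vs ${gzipped}s)"
```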

Improvements

1. Use --gzip for compressed transfer (critical — ~5.5x speedup)

The dump command (line 111 of syncdb.sh) should use gzip:

# Current
"${ssh_command[@]}" "drush sql-dump --structure-tables-list=cache,cache_*,..." > "$sql_file"

# Proposed
sql_file="$DUMPS_DIR/${ALIAS_KEY}-syncdb-$(date +'%Y-%m-%d').sql.gz"
"${ssh_command[@]}" "drush sql-dump --gzip --structure-tables-list=cache,cache_*,..." > "$sql_file"
ddev import-db --file="$sql_file"

ddev import-db natively handles .gz files.
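
Because the dump is redirected straight from SSH into a file, an interrupted connection could leave a truncated archive behind. As an optional safeguard that is not part of the current script, the file could be checked before import. A minimal sketch (verify_gzip_dump is a hypothetical helper name):

```shell
# Returns 0 only if the file exists, is non-empty and passes gzip's integrity test.
verify_gzip_dump() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "Dump is missing or empty: $file" >&2
    return 1
  fi
  if ! gzip -t "$file" 2>/dev/null; then
    echo "Dump is not a valid gzip archive: $file" >&2
    return 1
  fi
  return 0
}

# Possible usage in syncdb.sh:
#   verify_gzip_dump "$sql_file" && ddev import-db --file="$sql_file"
```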

2. Reduce ddev yq overhead (~4.8s → ~1.6s)

The script calls ddev yq three separate times to parse alias details. Each call takes ~1.6s (container startup overhead), totaling ~4.8s. Use a single ddev yq call to extract all fields at once:

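# @sh shell-quotes each value so that SSH options containing spaces survive eval
eval "$(ddev yq '."@self.'"$ALIAS_KEY"'" | "user=" + (.user | @sh) + " host=" + (.host | @sh) + " options=" + (.ssh.options // "" | @sh)' <<< "$alias_details_clean")"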
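
If eval is undesirable, one alternative is to have the single yq call print one field per line and read the lines back. This is a sketch under assumptions: read_alias_fields is a hypothetical helper, and the yq expression in the usage comment (a comma-separated union of the three paths) has not been tested against the add-on.

```shell
# Reads three lines from stdin into the globals user, host and options.
read_alias_fields() {
  IFS= read -r user || return 1
  IFS= read -r host || return 1
  # The last line may lack a trailing newline, so tolerate a non-zero read.
  IFS= read -r options || true
  # yq prints the literal "null" for a missing key; treat that as empty.
  if [ "$options" = "null" ]; then options=""; fi
  # Succeed only when both mandatory fields are present.
  [ -n "$user" ] && [ "$user" != "null" ] && [ -n "$host" ] && [ "$host" != "null" ]
}

# Possible usage (bash), still a single container round trip:
#   read_alias_fields < <(ddev yq '."@self.'"$ALIAS_KEY"'" | (.user, .host, .ssh.options)' <<< "$alias_details_clean")
```

Reading whole lines keeps option strings such as "-p 2222" intact without any shell re-parsing.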

3. Validate drush/sites/self.site.yml and alias upfront

The script relies on this file but only discovers it's missing after a failed ddev drush sa call (~3-5s wasted). A quick upfront check would give immediate, actionable errors:

SITE_YML="$PROJECT_ROOT/drush/sites/self.site.yml"
if [[ ! -f "$SITE_YML" ]]; then
  display_error_message "Missing drush/sites/self.site.yml"
  display_warning_message "This file defines SSH aliases for remote environments."
  display_warning_message "See https://www.drush.org/13.x/site-aliases/ for format."
  exit 1
fi

if ! grep -q "^${ALIAS_KEY}:" "$SITE_YML"; then
  display_error_message "Alias '${ALIAS_KEY}' not found in drush/sites/self.site.yml"
  display_warning_message "Available aliases:"
  grep -E '^[a-zA-Z]' "$SITE_YML" | sed 's/:$//' | while read -r alias; do
    display_warning_message "  - $alias"
  done
  exit 1
fi

This matters most for projects that are still onboarding to the add-on.

4. Add --backup flag

The previous project-level syncdb implementations supported --backup to create a local database backup before overwriting:

if [[ "$BACKUP" == "true" ]]; then
  ddev export-db --file="$DUMPS_DIR/backup-$(date +'%Y-%m-%d-%H%M%S').sql.gz" --gzip
fi

5. Add --force flag and confirmation prompt

The script immediately overwrites the local database without confirmation. A prompt (skippable with --force) would prevent accidental overwrites.
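
A minimal sketch of how the two flags and the prompt could fit together. The function names parse_flags and confirm_overwrite are hypothetical, as is the prompt wording; the BACKUP variable matches the snippet in item 4.

```shell
# Sets BACKUP and FORCE from the command-line arguments; other arguments are ignored here.
parse_flags() {
  BACKUP=false
  FORCE=false
  for arg in "$@"; do
    case "$arg" in
      --backup) BACKUP=true ;;
      --force)  FORCE=true ;;
    esac
  done
}

# Returns 0 if the overwrite may proceed: either --force was given or the user typed y/yes.
confirm_overwrite() {
  force=$1
  [ "$force" = "true" ] && return 0
  printf 'This will overwrite the local database. Continue? [y/N] ' >&2
  read -r answer || return 1   # EOF (non-interactive run) counts as "no"
  case "$answer" in
    [yY]|[yY][eE][sS]) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage in syncdb.sh:
#   parse_flags "$@"
#   confirm_overwrite "$FORCE" || { echo "Aborted." >&2; exit 1; }
```

Defaulting to "no" on empty input or EOF means a non-interactive run without --force never overwrites the database.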

6. Use SSH -C as fallback

If --gzip is not available on the remote Drush, enable SSH compression as a fallback.
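
One way the fallback could be sketched, assuming the remote Drush lists --gzip in its sql-dump help output when the option is supported. The helper names are hypothetical, and the real script would pass its full sql-dump arguments instead of the shortened commands used here.

```shell
# Returns 0 if the remote Drush advertises --gzip in its sql-dump help text.
# "$@" is the full ssh invocation (program, options, host).
remote_supports_gzip() {
  "$@" "drush sql-dump --help" 2>/dev/null | grep -q -e '--gzip'
}

# fetch_dump OUT_BASE SSH_PROGRAM [SSH_ARGS...]
# Writes the dump next to OUT_BASE and prints the resulting file name.
fetch_dump() {
  out_base=$1; shift
  ssh_prog=$1; shift
  if remote_supports_gzip "$ssh_prog" "$@"; then
    "$ssh_prog" "$@" "drush sql-dump --gzip" > "$out_base.sql.gz" || return 1
    echo "$out_base.sql.gz"
  else
    # Fallback: plain SQL on the remote side, compressed in transit by ssh -C.
    "$ssh_prog" -C "$@" "drush sql-dump" > "$out_base.sql" || return 1
    echo "$out_base.sql"
  fi
}

# Possible usage:
#   sql_file=$(fetch_dump "$DUMPS_DIR/${ALIAS_KEY}-syncdb" ssh "$user@$host")
```

With the fallback the local file stays uncompressed, so only the transfer benefits, not the disk footprint.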

7. db-post-import.sh: dangerous cache:rebuild placement + redundancy

The current db-post-import.sh hook runs:

drush updatedb --no-cache-clear -y
drush sqlsan -y
drush cache:rebuild          # 1st rebuild — BEFORE config:import
drush config:import -y
drush cache:rebuild          # 2nd rebuild (redundant)
drush deploy:hook

The cache:rebuild between updatedb and config:import is actively dangerous when the imported DB has significant schema differences from the current codebase. drush cache:rebuild bootstraps Drupal and tries to access tables/columns that may not exist yet (or have been removed), causing fatal errors.

This is the exact problem fixed in wunderio/charts#514: the Silta helm charts had to remove drush cache-rebuild after reference database import because it broke deployments with breaking schema changes.

The correct sequence (as drush deploy implements) is:

  1. updatedb - apply pending database updates
  2. config:import - synchronize configuration
  3. cache:rebuild - only after schema and config are current
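
The three steps above can be sketched as a single function that stops at the first failure. The function name post_import_sequence is hypothetical, and the sqlsan and deploy:hook calls from the current hook are left out to keep the ordering in focus.

```shell
# Runs the three steps in the order listed above and aborts on the first error.
post_import_sequence() {
  drush updatedb --no-cache-clear -y || return 1   # 1. apply pending database updates
  drush config:import -y             || return 1   # 2. synchronize configuration
  drush cache:rebuild                || return 1   # 3. rebuild once schema and config are current
}
```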

Additionally, the hook fires on every ddev import-db (not just ddev syncdb) — importing local backups, restoring snapshots, debugging specific DB states all trigger the full deploy pipeline unnecessarily.

Suggestion: Strip the hook to sanitization only (drush sqlsan -y), letting developers run ddev drush deploy as a deliberate separate step. This:

  • Keeps ddev import-db fast for all use cases
  • Preserves drush deploy's correct ordering
  • Eliminates the dangerous pre-config cache:rebuild
  • Eliminates the redundant double cache:rebuild (~15-20s)
  • Gives developers control over when deploy runs

Impact summary

| # | Improvement | Effort | Impact |
| --- | --- | --- | --- |
| 1 | Add --gzip to dump | Trivial | ~5.5x speedup |
| 2 | Single yq call | Low | ~3.2s saved |
| 3 | Validate self.site.yml + alias upfront | Low | Better DX |
| 4 | --backup flag | Low | Safety |
| 5 | --force / confirmation | Low | Safety |
| 6 | SSH -C fallback | Trivial | Minor fallback |
| 7 | Fix db-post-import.sh hook | Low | Breaking: devs must run ddev drush deploy manually after import. ~45s saved, fixes breaking upgrades. Removes cache:rebuild before config:import that causes fatal errors on schema changes (see wunderio/charts#514) |
