This document describes the architecture of the awesome-database-backup project, including command structure, execution flow, storage service clients, Docker configuration, and CI/CD setup.
Each command (backup, restore, list, prune) is implemented by inheriting from common base classes (a simplified sketch of the layering follows the list):
- `StorageServiceClientCommand`: Base class for all commands
  - Provides common functionality related to storage services (S3, GCS)
  - Creates and manages storage service clients
- Command-specific base classes:
  - `BackupCommand`: Base class for backup commands
  - `RestoreCommand`: Base class for restore commands
  - `ListCommand`: Base class for list commands
  - `PruneCommand`: Base class for prune commands
- Database-specific implementation classes:
  - `MongoDBBackupCommand`: Backup command for MongoDB
  - `PostgreSQLBackupCommand`: Backup command for PostgreSQL
  - `MariaDBBackupCommand`: Backup command for MariaDB
  - etc.
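A minimal sketch of how these layers relate, assuming simplified signatures (the real classes take richer option types and expose more methods):

```typescript
// Simplified sketch of the class layering described above.
// Signatures are assumptions for illustration only.
interface IStorageServiceClient {
  copyFile(source: string, destination: string): Promise<void>;
}

abstract class StorageServiceClientCommand {
  // Shared by backup/restore/list/prune commands: the client for the target bucket.
  protected storageServiceClient?: IStorageServiceClient;
}

abstract class BackupCommand extends StorageServiceClientCommand {
  // Each database flavor supplies its own dump implementation.
  protected abstract dumpDB(options: { targetBucketUrl: URL }): Promise<{ dbDumpFilePath: string }>;

  async backupOnce(options: { targetBucketUrl: URL }): Promise<void> {
    const { dbDumpFilePath } = await this.dumpDB(options);
    await this.storageServiceClient?.copyFile(dbDumpFilePath, options.targetBucketUrl.toString());
  }
}

class MongoDBBackupCommand extends BackupCommand {
  protected async dumpDB(): Promise<{ dbDumpFilePath: string }> {
    // In the real command this shells out to `mongodump`.
    return { dbDumpFilePath: '/tmp/dump.gz' };
  }
}
```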
Backup commands have the following execution modes (mode selection is sketched after the list):
- Normal mode (using temporary files)
  - `backupOnce()`: Executes backup once
  - Saves the database dump to a temporary file, then uploads it to the storage service
  - Implementation example:

    ```typescript
    async backupOnce(options: IBackupCommandOption): Promise<void> {
      // Dump database to a temporary file
      const { dbDumpFilePath } = await this.dumpDB(options);
      // Upload the temporary file to the storage service
      await this.storageServiceClient.copyFile(dbDumpFilePath, options.targetBucketUrl.toString());
    }
    ```
- Streaming mode (without using temporary files)
  - `backupOnceWithStream()`: Executes backup in streaming mode
  - Uploads the database dump directly to the storage service as a stream
  - Reduces disk I/O and is efficient for transferring large data
  - Implementation example:

    ```typescript
    async backupOnceWithStream(options: IBackupCommandOption): Promise<void> {
      // Get database dump as a stream
      const stream = await this.dumpDBAsStream(options);
      // Upload the stream directly to the storage service
      await this.storageServiceClient.uploadStream(
        stream,
        `${options.backupfilePrefix}-${format(Date.now(), 'yyyyMMddHHmmss')}.gz`,
        options.targetBucketUrl.toString(),
      );
    }
    ```
- Cron mode
  - `backupCronMode()`: Runs the backup periodically based on the specified cron expression
  - Implementation example:

    ```typescript
    async backupCronMode(options: IBackupCommandOption): Promise<void> {
      await schedule.scheduleJob(
        options.cronmode,
        async() => { await this.backupOnce(options); },
      );
    }
    ```
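How a command chooses between these modes depends on its CLI options. The sketch below is illustrative only: `cronmode` appears in the excerpt above, while `useStream` is an assumed flag, not the project's actual CLI surface.

```typescript
// Illustrative dispatch between the three execution modes.
// `useStream` is an assumption for this sketch; `cronmode` is taken from the excerpt above.
interface IBackupCommandOption {
  targetBucketUrl: URL;
  backupfilePrefix: string;
  cronmode?: string;   // cron expression, e.g. '0 * * * *'
  useStream?: boolean; // stream the dump instead of writing a temp file
}

async function runBackup(
  command: {
    backupOnce(o: IBackupCommandOption): Promise<void>;
    backupOnceWithStream(o: IBackupCommandOption): Promise<void>;
    backupCronMode(o: IBackupCommandOption): Promise<void>;
  },
  options: IBackupCommandOption,
): Promise<void> {
  if (options.cronmode) {
    // Periodic execution driven by the cron expression.
    return command.backupCronMode(options);
  }
  if (options.useStream) {
    // No temporary file: pipe the dump straight to the storage service.
    return command.backupOnceWithStream(options);
  }
  // Default: dump to a temporary file, then upload it.
  return command.backupOnce(options);
}
```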
The restore command restores the database using the following steps; a combined sketch follows the list:
- Download backup file
  - Download the backup file from cloud storage
  - Save it to a temporary directory
- Extract backup file
  - Extract the file according to its compression format (.gz, .bz2, .tar, etc.)
  - Implementation example:

    ```typescript
    async processBackupFile(backupFilePath: string): Promise<string> {
      const processors: Record<string, Transform> = {
        '.gz': createGunzip(),
        '.bz2': bz2(),
        '.tar': new tar.Unpack({ cwd: dirname(backupFilePath) }),
      };

      let newBackupFilePath = backupFilePath;
      const streams: (Transform|ReadStream|WriteStream)[] = [];
      streams.push(createReadStream(backupFilePath));
      while (extname(newBackupFilePath) !== '') {
        const ext = extname(newBackupFilePath);
        if (processors[ext] == null) throw new Error(`Extension ${ext} is not supported`);
        streams.push(processors[ext]);
        newBackupFilePath = newBackupFilePath.slice(0, -ext.length);
      }
      // If last stream is not of '.tar', add file writing stream.
      if (streams.at(-1) !== processors['.tar']) {
        streams.push(createWriteStream(newBackupFilePath));
      }
      return StreamPromises
        .pipeline(streams)
        .then(() => newBackupFilePath);
    }
    ```
- Restore to database
  - Restore the database using the extracted file
  - Execute the database-specific restore command
  - Implementation example (MongoDB):

    ```typescript
    async restoreDB(sourcePath: string, userSpecifiedOption?: string): Promise<{ stdout: string; stderr: string; }> {
      return exec(`mongorestore ${sourcePath} ${userSpecifiedOption}`);
    }
    ```
- Clean up temporary files
  - Delete temporary files after the restore is complete
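Put together, a restore run roughly follows the shape below. This is a sketch under assumptions: only `processBackupFile` and `restoreDB` appear in the excerpts above, the download is shown through the client's `copyFile` for brevity, and the temporary-directory handling is illustrative.

```typescript
import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Rough shape of a full restore run combining the steps above (illustrative only).
async function restoreOnce(
  client: { copyFile(src: string, dest: string): Promise<void> },
  command: {
    processBackupFile(path: string): Promise<string>;
    restoreDB(path: string, opts?: string): Promise<{ stdout: string; stderr: string }>;
  },
  backupFileUri: string,
): Promise<void> {
  // 1. Download the backup file into a temporary directory.
  const workDir = await mkdtemp(join(tmpdir(), 'restore-'));
  const downloadedPath = join(workDir, 'backup.gz');
  await client.copyFile(backupFileUri, downloadedPath);

  try {
    // 2. Extract it according to its extension (.gz, .bz2, .tar, ...).
    const extractedPath = await command.processBackupFile(downloadedPath);
    // 3. Run the database-specific restore tool (e.g. mongorestore).
    await command.restoreDB(extractedPath);
  }
  finally {
    // 4. Clean up the temporary files.
    await rm(workDir, { recursive: true, force: true });
  }
}
```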
Storage service clients provide operations for cloud storage services such as S3 and GCS; a condensed sketch of the interface and factory follows the list:
- `IStorageServiceClient`: Interface for storage service clients
  - `exists()`: Check if a file exists
  - `listFiles()`: Get file list
  - `deleteFile()`: Delete a file
  - `copyFile()`: Copy a file
  - `uploadStream()`: Upload from a stream
- Concrete implementations:
  - `S3StorageServiceClient`: Client for S3
    - Uses the AWS SDK for JavaScript
    - Supports S3-compatible services (e.g., DigitalOcean Spaces)
    - Custom endpoint URL configuration is possible
    - Supports multiple authentication methods:
      - Explicit credentials (access key and secret key)
      - AWS configuration files (~/.aws/config and ~/.aws/credentials)
      - Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
      - AWS STS with WebIdentity tokens (for Kubernetes service accounts)
      - EC2 instance profiles
      - ECS container credentials
  - `GCSStorageServiceClient`: Client for GCS
    - Uses the Google Cloud Storage Node.js client library
    - Supports service account authentication
    - Supports Application Default Credentials
- Factory pattern:
  - `storageServiceClientFactory`: Creates the appropriate client based on the URL scheme
  - `getStorageServiceClientType`: Determines the client type from the URL scheme

    ```typescript
    // s3://bucket-name/path -> 'S3'
    // gs://bucket-name/path -> 'GCS'
    ```
- URI parsing:
  - S3 URI: `s3://bucket-name/path` -> `{ bucket: 'bucket-name', key: 'path' }`
  - GCS URI: `gs://bucket-name/path` -> `{ bucket: 'bucket-name', filepath: 'path' }`
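A condensed sketch of the interface and factory described above; the exact signatures in the package may differ, and the types here are simplified for illustration:

```typescript
import { Readable } from 'node:stream';

// Simplified view of the client interface listed above.
interface IStorageServiceClient {
  exists(url: string): Promise<boolean>;
  listFiles(url: string): Promise<string[]>;
  deleteFile(url: string): Promise<void>;
  copyFile(source: string, destination: string): Promise<void>;
  uploadStream(stream: Readable, fileName: string, destinationUrl: string): Promise<void>;
}

type StorageServiceClientType = 'S3' | 'GCS';

// Determines the client type from the URL scheme, as described above.
function getStorageServiceClientType(url: URL): StorageServiceClientType {
  const map: Record<string, StorageServiceClientType> = { 's3:': 'S3', 'gs:': 'GCS' };
  const type = map[url.protocol];
  if (type == null) throw new Error(`URL scheme "${url.protocol}" is not supported`);
  return type;
}

// Usage: pick a client implementation for the target bucket URL.
// getStorageServiceClientType(new URL('s3://my-bucket/backups/')) // -> 'S3'
```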
Storage service clients support the following types of file operations:
- Local file → Cloud storage (Upload)

  ```typescript
  // For S3
  async uploadFile(sourceFilePath: string, destinationS3Uri: S3URI): Promise<void> {
    const params = {
      Bucket: destinationS3Uri.bucket,
      Key: destinationS3Uri.key,
      Body: readFileSync(sourceFilePath),
    };
    const command = new PutObjectCommand(params);
    await this.client.send(command);
  }
  ```
- Cloud storage → Local file (Download)

  ```typescript
  // For S3
  async downloadFile(sourceS3Uri: S3URI, destinationFilePath: string): Promise<void> {
    const params = {
      Bucket: sourceS3Uri.bucket,
      Key: sourceS3Uri.key,
    };
    const command = new GetObjectCommand(params);
    const response = await this.client.send(command);
    await StreamPromises.pipeline(
      response.Body as Readable,
      createWriteStream(destinationFilePath),
    );
  }
  ```
- Copy within cloud storage

  ```typescript
  // For S3
  async copyFileOnRemote(sourceS3Uri: S3URI, destinationS3Uri: S3URI): Promise<void> {
    const params = {
      CopySource: [sourceS3Uri.bucket, sourceS3Uri.key].join('/'),
      Bucket: destinationS3Uri.bucket,
      Key: destinationS3Uri.key,
    };
    const command = new CopyObjectCommand(params);
    await this.client.send(command);
  }
  ```
- Upload from stream

  ```typescript
  // For S3
  async uploadStream(stream: Readable, fileName: string, destinationUri: string): Promise<void> {
    const destinationS3Uri = this._parseFilePath(destinationUri);
    const chunks: Buffer[] = [];
    for await (const chunk of stream) {
      chunks.push(Buffer.from(chunk));
    }
    const buffer = Buffer.concat(chunks);
    const params = {
      Bucket: destinationS3Uri.bucket,
      Key: [destinationS3Uri.key, fileName].join('/'),
      Body: buffer,
    };
    const command = new PutObjectCommand(params);
    await this.client.send(command);
  }
  ```
Docker images are created using multi-stage builds:
- Base image: `node` with an appropriate distribution
  - Note: The OS distribution of the base image should match the distribution used in the development container to ensure compatibility
- Prune stage: Extract only the necessary packages from the monorepo

  ```dockerfile
  FROM base AS pruned-package
  ARG packageFilter
  COPY . .
  RUN npx turbo@2 prune --docker "${packageFilter}"
  ```
- Dependency resolution stage: Install dependencies for the required packages

  ```dockerfile
  FROM base AS deps-resolver
  COPY --from=pruned-package --chown=node:node ${optDir}/out/json/ .
  COPY --from=pruned-package --chown=node:node ${optDir}/out/yarn.lock ./yarn.lock
  RUN yarn --frozen-lockfile
  ```
- Build stage: Build the application

  ```dockerfile
  FROM base AS builder
  ARG packageFilter
  ENV NODE_ENV=production
  COPY --from=deps-resolver --chown=node:node ${optDir}/ ${optDir}/
  COPY --from=pruned-package --chown=node:node ${optDir}/out/full/ ${optDir}/
  RUN yarn run turbo run build --filter="${packageFilter}..."
  ```
- Tool stage: Install database-specific tools

  ```dockerfile
  FROM base AS tool-common
  RUN apt-get update && apt-get install -y bzip2 curl ca-certificates gnupg

  FROM tool-common AS mongodb-tools
  ARG dbToolVersion
  RUN . /tmp/install-functions.sh && install_mongo_tools "${dbToolVersion}"

  FROM tool-common AS postgresql-tools
  ARG dbToolVersion
  RUN . /tmp/install-functions.sh && install_postgresql_tools "${dbToolVersion}"

  FROM tool-common AS mariadb-tools
  ARG dbToolVersion
  RUN . /tmp/install-functions.sh && install_mariadb_tools "${dbToolVersion}"
  ```
- Release stage: Create the final image

  ```dockerfile
  FROM ${dbType:-file}-tools AS release
  ARG packagePath
  ENV NODE_ENV=production
  COPY --from=builder --chown=node:node ${optDir}/ ${appDir}/
  COPY ./docker/entrypoint.sh ${appDir}/
  ENTRYPOINT ["/app/entrypoint.sh"]
  CMD ["backup", "list", "prune"]
  ```
Different images can be built for each database type:
```bash
# Build image for MongoDB backup
docker build -t awesome-mongodb-backup \
  --build-arg dbType=mongodb \
  --build-arg dbToolVersion=100.10.0 \
  --build-arg packageFilter=awesome-mongodb-backup \
  --build-arg packagePath=apps/awesome-mongodb-backup \
  -f docker/Dockerfile .

# Build image for PostgreSQL backup
docker build -t awesome-postgresql-backup \
  --build-arg dbType=postgresql \
  --build-arg dbToolVersion=17 \
  --build-arg packageFilter=awesome-postgresql-backup \
  --build-arg packagePath=apps/awesome-postgresql-backup \
  -f docker/Dockerfile .

# Build image for MariaDB backup
docker build -t awesome-mariadb-backup \
  --build-arg dbType=mariadb \
  --build-arg dbToolVersion=11.7.2 \
  --build-arg packageFilter=awesome-mariadb-backup \
  --build-arg packagePath=apps/awesome-mariadb-backup \
  -f docker/Dockerfile .
```
The resulting images can be run as follows:

```bash
# Run MongoDB backup
docker run --rm \
  -e TARGET_BUCKET_URL=s3://my-bucket/backups/ \
  -e AWS_REGION=us-east-1 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  -e BACKUP_TOOL_OPTIONS="--uri mongodb://mongo:27017/mydb" \
  awesome-mongodb-backup backup

# List backup files
docker run --rm \
  -e TARGET_BUCKET_URL=s3://my-bucket/backups/ \
  -e AWS_REGION=us-east-1 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  awesome-mongodb-backup list

# Delete old backup files
docker run --rm \
  -e TARGET_BUCKET_URL=s3://my-bucket/backups/ \
  -e AWS_REGION=us-east-1 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  awesome-mongodb-backup prune
```
GitHub Actions is used to run the following workflows:
Defined in `.github/workflows/app-test.yaml`:
```yaml
name: Application - Test

on:
  push:
    branches-ignore: [stable]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  app-test:
    runs-on: ubuntu-latest
    steps:
      - name: Get UID/GID
        id: ids
        run: |
          echo "uid=$(id -u)" >> $GITHUB_OUTPUT
          echo "gid=$(id -g)" >> $GITHUB_OUTPUT
      - uses: actions/checkout@v4
      # The "docker-container" driver is used because of using cache
      - uses: docker/setup-buildx-action@v3
      # Store the App container image in the local and Github Action cache, respectively
      - name: Build app container to caching (No push)
        uses: docker/build-push-action@v6
        with:
          context: .devcontainer
          build-args: |
            USER_UID=${{ steps.ids.outputs.uid }}
            USER_GID=${{ steps.ids.outputs.gid }}
          load: true
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Start all DBs and middle
        run: |
          docker compose -f .devcontainer/compose.yml build --build-arg USER_UID=${{ steps.ids.outputs.uid }} --build-arg USER_GID=${{ steps.ids.outputs.gid }}
          docker compose -f .devcontainer/compose.yml up -d
      - name: Run test
        run:
          docker compose -f .devcontainer/compose.yml exec -e NODE_OPTIONS -e CI -T -- node bash -c 'yarn install && yarn test'
      - name: Show test report to result of action
        if: success() || failure()
        uses: ctrf-io/github-test-reporter@v1
        with:
          report-path: '**/ctrf/*.json'
      - name: Show coverage report follow the settings .octocov.yml
        if: success() || failure()
        uses: k1LoW/octocov-action@v1
```
Defined in `.github/workflows/container-test.yaml`:
```yaml
name: Container - Test

on:
  push:
    branches-ignore: [stable]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  container-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        db-type: [mongodb, postgresql, mariadb, file]
    steps:
      - uses: actions/checkout@v4
      # The "docker-container" driver is used because of using cache
      - uses: docker/setup-buildx-action@v3
      # Build container image
      - name: Build container image
        uses: docker/build-push-action@v6
        with:
          context: .
          file: docker/Dockerfile
          build-args: |
            dbType=${{ matrix.db-type }}
            dbToolVersion=100.10.0
            packageFilter=awesome-${{ matrix.db-type }}-backup
            packagePath=apps/awesome-${{ matrix.db-type }}-backup
          load: true
          tags: awesome-${{ matrix.db-type }}-backup:test
          cache-from: type=gha
          cache-to: type=gha,mode=max
      # Test container image
      - name: Test container image
        run: |
          docker run --rm awesome-${{ matrix.db-type }}-backup:test --help
```
Test results are output in the following formats:
- Jest Test Report:
  - When the CI environment variable is set (CI environments), test reports are created in `**/ctrf/*.json` format
  - When the CI environment variable is not set (local development), only test summaries are displayed in the console
  - The ctrf-io/github-test-reporter action is used to display these reports in GitHub Actions
- Coverage Report: Generated according to the `.octocov.yml` settings
  - Displayed using k1LoW/octocov-action
The workflows share the following configuration features:

- Concurrency control

  ```yaml
  concurrency:
    group: ${{ github.workflow }}-${{ github.ref }}
    cancel-in-progress: true
  ```

  - Prevents multiple workflows from running for the same branch
  - Cancels in-progress workflows when new commits are pushed
- Cache utilization

  ```yaml
  cache-from: type=gha
  cache-to: type=gha,mode=max
  ```

  - Uses the GitHub Actions cache to reduce build time
- Matrix builds

  ```yaml
  strategy:
    matrix:
      db-type: [mongodb, postgresql, mariadb, file]
  ```

  - Runs tests in parallel for multiple database types