Architecture

This document describes the architecture of the awesome-database-backup project, including command structure, execution flow, storage service clients, Docker configuration, and CI/CD setup.

Command Structure

Each command (backup, restore, list, prune) is implemented by inheriting from common base classes (a simplified sketch of the hierarchy follows this list):

  1. StorageServiceClientCommand: Base class for all commands

    • Provides common functionality related to storage services (S3, GCS)
    • Creates and manages storage service clients
  2. Command-specific base classes:

    • BackupCommand: Base class for backup commands
    • RestoreCommand: Base class for restore commands
    • ListCommand: Base class for list commands
    • PruneCommand: Base class for prune commands
  3. Database-specific implementation classes:

    • MongoDBBackupCommand: Backup command for MongoDB
    • PostgreSQLBackupCommand: Backup command for PostgreSQL
    • MariaDBBackupCommand: Backup command for MariaDB
    • etc.
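
These layers fit together as sketched below. The sketch is illustrative only: method bodies are elided, and any names or signatures not listed above are assumptions rather than the actual implementation.

abstract class StorageServiceClientCommand {
  // Shared by every command: the client is created from the target bucket URL
  protected storageServiceClient?: IStorageServiceClient;
}

abstract class BackupCommand extends StorageServiceClientCommand {
  // Each database flavor supplies its own dump logic
  abstract dumpDB(options: IBackupCommandOption): Promise<{ dbDumpFilePath: string }>;

  async backupOnce(options: IBackupCommandOption): Promise<void> {
    // Common flow shared by all databases: dump, then upload
    // (see "Backup Execution Modes" below)
  }
}

class MongoDBBackupCommand extends BackupCommand {
  async dumpDB(options: IBackupCommandOption): Promise<{ dbDumpFilePath: string }> {
    // e.g. run "mongodump" and return the path of the generated archive
    return { dbDumpFilePath: '/tmp/backup.gz' };
  }
}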

Backup and Restore Execution Flow

Backup Execution Modes

Backup commands support the following execution modes (a sketch of how a command might dispatch among them follows this list):

  1. Normal mode (using temporary files)

    • backupOnce(): Executes backup once
    • Saves database dump to a temporary file, then uploads it to the storage service
    • Implementation example:
      async backupOnce(options: IBackupCommandOption): Promise<void> {
        // Dump database to a temporary file
        const { dbDumpFilePath } = await this.dumpDB(options);
        // Upload the temporary file to the storage service
        await this.storageServiceClient.copyFile(dbDumpFilePath, options.targetBucketUrl.toString());
      }
  2. Streaming mode (without using temporary files)

    • backupOnceWithStream(): Executes backup in streaming mode
    • Uploads database dump directly to the storage service as a stream
    • Reduces disk I/O, making it efficient for transferring large dumps
    • Implementation example:
      async backupOnceWithStream(options: IBackupCommandOption): Promise<void> {
        // Get database dump as a stream
        const stream = await this.dumpDBAsStream(options);
        // Upload the stream directly to the storage service
        await this.storageServiceClient.uploadStream(
          stream,
          `${options.backupfilePrefix}-${format(Date.now(), 'yyyyMMddHHmmss')}.gz`,
          options.targetBucketUrl.toString(),
        );
      }
  3. Cron mode

    • backupCronMode(): Executes backup on a schedule
    • Runs backupOnce() repeatedly according to the specified cron expression
    • Implementation example:
      async backupCronMode(options: IBackupCommandOption): Promise<void> {
        await schedule.scheduleJob(
          options.cronmode,
          async() => {
            await this.backupOnce(options);
          },
        );
      }
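
How a command chooses among these modes is a dispatch over its options. A minimal sketch, assuming an execute() entry point and a stream flag (both names are assumptions; only options.cronmode appears in the examples above):

async execute(options: IBackupCommandOption): Promise<void> {
  // Cron mode takes precedence when a cron expression is configured
  if (options.cronmode != null) {
    return this.backupCronMode(options);
  }
  // "stream" is a hypothetical flag selecting the streaming mode
  if (options.stream) {
    return this.backupOnceWithStream(options);
  }
  return this.backupOnce(options);
}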

Restore Execution Flow

The restore command restores the database using the following steps (a simplified end-to-end sketch follows this list):

  1. Download backup file

    • Download the backup file from cloud storage
    • Save it to a temporary directory
  2. Extract backup file

    • Extract the file according to its compression or archive format (.gz, .bz2, .tar, etc.)
    • Implementation example:
      async processBackupFile(backupFilePath: string): Promise<string> {
        const processors: Record<string, Transform> = {
          '.gz': createGunzip(),
          '.bz2': bz2(),
          '.tar': new tar.Unpack({ cwd: dirname(backupFilePath) }),
        };
      
        let newBackupFilePath = backupFilePath;
        const streams: (Transform|ReadStream|WriteStream)[] = [];
      
        streams.push(createReadStream(backupFilePath));
        while (extname(newBackupFilePath) !== '') {
          const ext = extname(newBackupFilePath);
          if (processors[ext] == null) throw new Error(`Extension ${ext} is not supported`);
      
          streams.push(processors[ext]);
          newBackupFilePath = newBackupFilePath.slice(0, -ext.length);
        }
        // If the last stream is not the '.tar' unpacker, add a file-writing stream.
        if (streams.at(-1) !== processors['.tar']) {
          streams.push(createWriteStream(newBackupFilePath));
        }
      
        return StreamPromises
          .pipeline(streams)
          .then(() => newBackupFilePath);
      }
  3. Restore to database

    • Restore database using the extracted file
    • Execute database-specific restore command
    • Implementation example (MongoDB):
      async restoreDB(sourcePath: string, userSpecifiedOption?: string): Promise<{ stdout: string; stderr: string; }> {
        return exec(`mongorestore ${sourcePath} ${userSpecifiedOption}`);
      }
  4. Clean up temporary files

    • Delete temporary files after restore is complete
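
Put together, the whole restore flow might look like the following simplified sketch (the method name restoreOnce, the temporary-directory handling, and the restoreToolOptions field are assumptions for illustration):

async restoreOnce(options: IRestoreCommandOption): Promise<void> {
  // Work in a fresh temporary directory (fsPromises, tmpdir, join, basename
  // from node:fs/promises, node:os, and node:path are assumed to be imported)
  const tmpdirPath = await fsPromises.mkdtemp(join(tmpdir(), 'restore-'));
  try {
    // 1. Download the backup file from cloud storage
    const backupFilePath = join(tmpdirPath, basename(options.targetBucketUrl.pathname));
    await this.storageServiceClient.copyFile(options.targetBucketUrl.toString(), backupFilePath);
    // 2. Extract it according to its extension(s)
    const extractedPath = await this.processBackupFile(backupFilePath);
    // 3. Run the database-specific restore tool (e.g. mongorestore)
    await this.restoreDB(extractedPath, options.restoreToolOptions);
  }
  finally {
    // 4. Clean up temporary files
    await fsPromises.rm(tmpdirPath, { recursive: true, force: true });
  }
}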

Storage Service Clients

Storage service clients provide operations for cloud storage services such as S3 and GCS:

  1. IStorageServiceClient: Interface for storage service clients

    • exists(): Check if a file exists
    • listFiles(): Get file list
    • deleteFile(): Delete a file
    • copyFile(): Copy a file
    • uploadStream(): Upload from a stream
  2. Concrete implementations:

    • S3StorageServiceClient: Client for S3
      • Uses AWS SDK for JavaScript
      • Supports S3-compatible services (e.g., DigitalOcean Spaces)
      • Supports custom endpoint URLs
      • Supports multiple authentication methods:
        • Explicit credentials (access key and secret key)
        • AWS configuration files (~/.aws/config and ~/.aws/credentials)
        • Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
        • AWS STS with WebIdentity tokens (for Kubernetes service accounts)
        • EC2 instance profiles
        • ECS container credentials
    • GCSStorageServiceClient: Client for GCS
      • Uses Google Cloud Storage Node.js client library
      • Supports service account authentication
      • Supports Application Default Credentials
  3. Factory pattern:

    • storageServiceClientFactory: Creates the appropriate client based on the URL scheme (see the sketch after this list)
    • getStorageServiceClientType: Determines client type from URL scheme
    // s3://bucket-name/path -> 'S3'
    // gs://bucket-name/path -> 'GCS'
  4. URI parsing:

    • S3 URI: s3://bucket-name/path -> { bucket: 'bucket-name', key: 'path' }
    • GCS URI: gs://bucket-name/path -> { bucket: 'bucket-name', filepath: 'path' }
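
A minimal sketch of the type detection and factory described in item 3 (the shape of the options parameter and the undefined fallback are assumptions):

type StorageServiceClientType = 'S3' | 'GCS';

function getStorageServiceClientType(url: URL): StorageServiceClientType | undefined {
  // 's3:' -> 'S3', 'gs:' -> 'GCS', per the scheme mapping above
  const schemeMap: Record<string, StorageServiceClientType> = { 's3:': 'S3', 'gs:': 'GCS' };
  return schemeMap[url.protocol];
}

function storageServiceClientFactory(
    type: StorageServiceClientType,
    options: Record<string, unknown>, // credential/endpoint options (assumed shape)
): IStorageServiceClient {
  switch (type) {
    case 'S3': return new S3StorageServiceClient(options);
    case 'GCS': return new GCSStorageServiceClient(options);
  }
}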

File Operation Types

Storage service clients support the following types of file operations (a sketch of how copyFile() dispatches among them follows this list):

  1. Local file → Cloud storage (Upload)

    // For S3
    async uploadFile(sourceFilePath: string, destinationS3Uri: S3URI): Promise<void> {
      const params = {
        Bucket: destinationS3Uri.bucket,
        Key: destinationS3Uri.key,
        Body: readFileSync(sourceFilePath),
      };
      const command = new PutObjectCommand(params);
      await this.client.send(command);
    }
  2. Cloud storage → Local file (Download)

    // For S3
    async downloadFile(sourceS3Uri: S3URI, destinationFilePath: string): Promise<void> {
      const params = {
        Bucket: sourceS3Uri.bucket,
        Key: sourceS3Uri.key,
      };
      const command = new GetObjectCommand(params);
      const response = await this.client.send(command);
      await StreamPromises.pipeline(
        response.Body as Readable,
        createWriteStream(destinationFilePath),
      );
    }
  3. Copy within cloud storage

    // For S3
    async copyFileOnRemote(sourceS3Uri: S3URI, destinationS3Uri: S3URI): Promise<void> {
      const params = {
        CopySource: [sourceS3Uri.bucket, sourceS3Uri.key].join('/'),
        Bucket: destinationS3Uri.bucket,
        Key: destinationS3Uri.key,
      };
      const command = new CopyObjectCommand(params);
      await this.client.send(command);
    }
  4. Upload from stream

    // For S3
    async uploadStream(stream: Readable, fileName: string, destinationUri: string): Promise<void> {
      const destinationS3Uri = this._parseFilePath(destinationUri);
      const chunks: Buffer[] = [];
      for await (const chunk of stream) {
        chunks.push(Buffer.from(chunk));
      }
      const buffer = Buffer.concat(chunks);
      const params = {
        Bucket: destinationS3Uri.bucket,
        Key: [destinationS3Uri.key, fileName].join('/'),
        Body: buffer,
      };
      const command = new PutObjectCommand(params);
      await this.client.send(command);
    }
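
The copyFile() method on the interface can be understood as a dispatcher over the first three operations above, based on which side of the copy is a cloud URI. A minimal sketch (this dispatch logic is an assumption inferred from the method names, not a verbatim excerpt; _parseFilePath() is assumed to return null for non-S3 paths):

async copyFile(source: string, destination: string): Promise<void> {
  const sourceS3Uri = this._parseFilePath(source);
  const destinationS3Uri = this._parseFilePath(destination);

  if (sourceS3Uri == null && destinationS3Uri != null) {
    // 1. Local file → cloud storage (upload)
    return this.uploadFile(source, destinationS3Uri);
  }
  if (sourceS3Uri != null && destinationS3Uri == null) {
    // 2. Cloud storage → local file (download)
    return this.downloadFile(sourceS3Uri, destination);
  }
  if (sourceS3Uri != null && destinationS3Uri != null) {
    // 3. Copy within cloud storage
    return this.copyFileOnRemote(sourceS3Uri, destinationS3Uri);
  }
  throw new Error('At least one of source and destination must be an S3 URI');
}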

Docker

Docker images are created using multi-stage builds:

Multi-stage Build Configuration

  1. Base image: node with appropriate distribution

    • Note: The OS distribution of the base image should match the distribution used in the development container to ensure compatibility
  2. Prune stage: Extract only necessary packages from the monorepo

    FROM base AS pruned-package
    ARG packageFilter
    COPY . .
    RUN npx turbo@2 prune --docker "${packageFilter}"
  3. Dependency resolution stage: Install dependencies for required packages

    FROM base AS deps-resolver
    COPY --from=pruned-package --chown=node:node ${optDir}/out/json/ .
    COPY --from=pruned-package --chown=node:node ${optDir}/out/yarn.lock ./yarn.lock
    RUN yarn --frozen-lockfile
  4. Build stage: Build the application

    FROM base AS builder
    ARG packageFilter
    ENV NODE_ENV=production
    COPY --from=deps-resolver --chown=node:node ${optDir}/ ${optDir}/
    COPY --from=pruned-package --chown=node:node ${optDir}/out/full/ ${optDir}/
    RUN yarn run turbo run build --filter="${packageFilter}..."
  5. Tool stage: Install database-specific tools

    FROM base AS tool-common
    RUN apt-get update && apt-get install -y bzip2 curl ca-certificates gnupg
    
    FROM tool-common AS mongodb-tools
    ARG dbToolVersion
    RUN . /tmp/install-functions.sh && install_mongo_tools "${dbToolVersion}"
    
    FROM tool-common AS postgresql-tools
    ARG dbToolVersion
    RUN . /tmp/install-functions.sh && install_postgresql_tools "${dbToolVersion}"
    
    FROM tool-common AS mariadb-tools
    ARG dbToolVersion
    RUN . /tmp/install-functions.sh && install_mariadb_tools "${dbToolVersion}"
  6. Release stage: Create the final image

    FROM ${dbType:-file}-tools AS release
    ARG packagePath
    ENV NODE_ENV=production
    COPY --from=builder --chown=node:node ${optDir}/ ${appDir}/
    COPY ./docker/entrypoint.sh ${appDir}/
    ENTRYPOINT ["/app/entrypoint.sh"]
    CMD ["backup", "list", "prune"]

Building Docker Images

Different images can be built for each database type:

# Build image for MongoDB backup
docker build -t awesome-mongodb-backup \
  --build-arg dbType=mongodb \
  --build-arg dbToolVersion=100.10.0 \
  --build-arg packageFilter=awesome-mongodb-backup \
  --build-arg packagePath=apps/awesome-mongodb-backup \
  -f docker/Dockerfile .

# Build image for PostgreSQL backup
docker build -t awesome-postgresql-backup \
  --build-arg dbType=postgresql \
  --build-arg dbToolVersion=17 \
  --build-arg packageFilter=awesome-postgresql-backup \
  --build-arg packagePath=apps/awesome-postgresql-backup \
  -f docker/Dockerfile .

# Build image for MariaDB backup
docker build -t awesome-mariadb-backup \
  --build-arg dbType=mariadb \
  --build-arg dbToolVersion=11.7.2 \
  --build-arg packageFilter=awesome-mariadb-backup \
  --build-arg packagePath=apps/awesome-mariadb-backup \
  -f docker/Dockerfile .

Docker Image Usage Examples

# Run MongoDB backup
docker run --rm \
  -e TARGET_BUCKET_URL=s3://my-bucket/backups/ \
  -e AWS_REGION=us-east-1 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  -e BACKUP_TOOL_OPTIONS="--uri mongodb://mongo:27017/mydb" \
  awesome-mongodb-backup backup

# List backup files
docker run --rm \
  -e TARGET_BUCKET_URL=s3://my-bucket/backups/ \
  -e AWS_REGION=us-east-1 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  awesome-mongodb-backup list

# Delete old backup files
docker run --rm \
  -e TARGET_BUCKET_URL=s3://my-bucket/backups/ \
  -e AWS_REGION=us-east-1 \
  -e AWS_ACCESS_KEY_ID=your-access-key \
  -e AWS_SECRET_ACCESS_KEY=your-secret-key \
  awesome-mongodb-backup prune

CI/CD

GitHub Actions is used to run the following workflows:

Application Test Workflow

Defined in .github/workflows/app-test.yaml:

name: Application - Test

on:
  push:
    branches-ignore: [stable]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  app-test:
    runs-on: ubuntu-latest
    steps:
    - name: Get UID/GID
      id: ids
      run: |
        echo "uid=$(id -u)" >> $GITHUB_OUTPUT
        echo "gid=$(id -g)" >> $GITHUB_OUTPUT
    - uses: actions/checkout@v4
    # The "docker-container" driver is required to use the build cache
    - uses: docker/setup-buildx-action@v3
    # Store the app container image in both the local Docker store and the GitHub Actions cache
    - name: Build app container to caching (No push)
      uses: docker/build-push-action@v6
      with:
        context: .devcontainer
        build-args: |
          USER_UID=${{ steps.ids.outputs.uid }}
          USER_GID=${{ steps.ids.outputs.gid }}
        load: true
        cache-from: type=gha
        cache-to: type=gha,mode=max
    - name: Start all DBs and middleware
      run: |
        docker compose -f .devcontainer/compose.yml build --build-arg USER_UID=${{ steps.ids.outputs.uid }} --build-arg USER_GID=${{ steps.ids.outputs.gid }}
        docker compose -f .devcontainer/compose.yml up -d
    - name: Run test
      run:
        docker compose -f .devcontainer/compose.yml exec -e NODE_OPTIONS -e CI -T -- node bash -c 'yarn install && yarn test'
    - name: Show test report in the action result
      if: success() || failure()
      uses: ctrf-io/github-test-reporter@v1
      with:
        report-path: '**/ctrf/*.json'
    - name: Show coverage report following the settings in .octocov.yml
      if: success() || failure()
      uses: k1LoW/octocov-action@v1

Container Test Workflow

Defined in .github/workflows/container-test.yaml:

name: Container - Test

on:
  push:
    branches-ignore: [stable]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  container-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        db-type: [mongodb, postgresql, mariadb, file]
    steps:
    - uses: actions/checkout@v4
    # The "docker-container" driver is required to use the build cache
    - uses: docker/setup-buildx-action@v3
    # Build container image
    - name: Build container image
      uses: docker/build-push-action@v6
      with:
        context: .
        file: docker/Dockerfile
        build-args: |
          dbType=${{ matrix.db-type }}
          dbToolVersion=100.10.0
          packageFilter=awesome-${{ matrix.db-type }}-backup
          packagePath=apps/awesome-${{ matrix.db-type }}-backup
        load: true
        tags: awesome-${{ matrix.db-type }}-backup:test
        cache-from: type=gha
        cache-to: type=gha,mode=max
    # Test container image
    - name: Test container image
      run: |
        docker run --rm awesome-${{ matrix.db-type }}-backup:test --help

Test Reports

Test results are output in the following formats (a sketch of the reporter switching follows this list):

  1. Jest Test Report:

    • When the CI environment variable is set (as it is in CI environments), test reports are written to **/ctrf/*.json
    • When the CI environment variable is not set (local development), only test summaries are displayed in the console
    • The ctrf-io/github-test-reporter action is used to display these reports in GitHub Actions
  2. Coverage Report: Generated according to .octocov.yml settings

    • Displayed using k1LoW/octocov-action
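
A hedged sketch of how the Jest configuration might switch reporters based on the CI variable (the jest-ctrf-json-reporter package and this exact setup are assumptions, not a verbatim excerpt from the repository):

import type { Config } from 'jest';

const config: Config = {
  // On CI, additionally emit a CTRF JSON report (which, by default, lands
  // under a ctrf/ directory and therefore matches the **/ctrf/*.json glob
  // above); locally, keep only the default console summary.
  reporters: process.env.CI
    ? ['default', ['jest-ctrf-json-reporter', {}]]
    : ['default'],
};

export default config;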

Workflow Features

  1. Concurrency control

    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: true
    • Prevents concurrent runs of the same workflow on the same branch
    • Cancels in-progress workflows when new commits are pushed
  2. Cache utilization

    cache-from: type=gha
    cache-to: type=gha,mode=max
    • Uses GitHub Actions cache to reduce build time
  3. Matrix builds

    strategy:
      matrix:
        db-type: [mongodb, postgresql, mariadb, file]
    • Runs tests in parallel for multiple database types