Contentful Batch Migrator

License: MIT Node Version

A robust solution for migrating large Contentful spaces without hitting rate limits.

Migrate thousands of assets and entries between Contentful spaces by intelligently splitting them into manageable batches. Perfect for moving content between regions, environments, or organizations.

🏆 Battle-Tested Workflow

New to this tool? Start with the Proven Migration Workflow - our battle-tested, step-by-step guide that has successfully migrated 10,000+ assets (12.6GB) and 25,000+ entries with 4,000+ circular dependencies.

What's inside:

  • ✅ Complete workflow from export to validation restore
  • ✅ Two proven import methods (Official CLI vs. Custom Script)
  • ✅ Draft cleanup (2,000+ drafts) and validation stripping (50+ content types)
  • ✅ Assets-first import strategy
  • ✅ Brute force publishing for massive circular dependencies
  • ✅ Real production examples with timelines (12-14 hours total)

📖 Read the Proven Workflow Guide →

🚀 Features

  • Batch Processing: Automatically split large exports into configurable batch sizes
  • Client-Side Rate Limiting: Token bucket algorithm enforces API rate limits (10 req/sec, 36K req/hour)
  • Draft Cleanup: Identify and remove invalid/orphan draft entries before migration
  • Smart Publishing: Multiple publishing strategies for handling circular dependencies
  • Tag Management: Tag and filter draft entries for selective publishing
  • Brute Force Publishing: Repeatedly publish drafts until circular dependencies resolve
  • Smart Relationships: Maintains asset-entry relationships across batches
  • Resume Support: Automatically resume failed or interrupted migrations
  • Progress Tracking: Detailed logs and state management
  • Validation: Post-migration validation to ensure data integrity
  • Retry Logic: Configurable retry attempts with exponential backoff

📊 Use Cases

  • ✅ Migrating 1,000+ assets and entries
  • ✅ Moving content between Contentful regions (US → EU)
  • ✅ Copying content between organizations
  • ✅ Environment cloning with large datasets
  • ✅ Avoiding "Too Many Requests" (429) errors

🎯 Problem & Solution

The Problem

Importing large Contentful exports (4,000+ assets, 10,000+ entries) directly causes:

  • Rate limiting errors (429 Too Many Requests)
  • Failed imports
  • Lost time and frustration

The Solution

This tool:

  1. Splits your export into batches (500-700 assets each)
  2. Maintains relationships between assets and entries
  3. Imports batches sequentially with delays
  4. Retries failed batches automatically
  5. Validates migration success

📦 Installation

Prerequisites

  • Node.js >= 18.0.0 (LTS recommended)
  • npm or yarn
  • Contentful Management Token (CMA)

Quick Start

# 1. Clone the repository
git clone https://github.com/faisalbasra/contentful-batch-migrator.git
cd contentful-batch-migrator

# 2. Install dependencies
npm install

# 3. Set up configuration
npm run setup   # Interactive setup wizard (or manual setup below)

# Manual setup (alternative to setup wizard):
cp config/batch-config.example.json config/batch-config.json
cp config/cascade-config.example.json config/cascade-config.json
# Edit config files with your Contentful credentials

# 4. Run the migration
npm run split    # Step 1: Split export into batches
npm run import   # Step 2: Import batches sequentially
npm run validate # Step 3: Validate migration success

# 5. Publish content (if needed)
npm run cascade-publish  # Smart dependency-aware publishing
# OR
npm run publish-all      # Brute force publishing (for circular dependencies)

First time user? Check out the Getting Started Guide for a detailed walkthrough.

See all available commands:

npm run help

📚 Available Commands

npm run help                          # Show all available commands

# Migration Commands
npm run cleanup-drafts                # Analyze and remove invalid drafts
npm run split                         # Split export into batches
npm run import                        # Import all batches
npm run import:cli                    # Import using CLI (recommended for assets)
npm run validate                      # Validate migration
npm run resume                        # Resume failed import
npm run resume:cli                    # Resume failed CLI import

# Publishing Commands
npm run publish-assets                # Publish all draft assets
npm run publish-assets:dry-run        # Preview asset publishing
npm run cascade-publish               # Smart publish with dependency resolution
npm run cascade-publish:dry-run       # Preview cascade publish
npm run publish-all                   # Brute force publish all drafts
npm run publish-all:dry-run           # Preview brute force publish
npm run tag-drafts <tag> [opts]       # Tag/untag draft entries

# Cleanup Commands
npm run clean                         # Remove batches directory
npm run clean:all                     # Remove batches and export
npm run clean-space [opts]            # Delete entries/content types/assets from space
npm run clean-space:dry-run           # Preview space cleanup

🔧 Configuration

Edit batch-config.json:

{
  "batchSize": 400,
  "sourceFile": "./contentful-export/exported-space.json",
  "sourceAssetsDir": "./contentful-export",
  "outputDir": "./batches",
  "targetSpace": {
    "spaceId": "YOUR_TARGET_SPACE_ID",
    "environmentId": "master",
    "managementToken": "YOUR_CMA_TOKEN",
    "host": "api.contentful.com"
  },
  "importOptions": {
    "uploadAssets": true,
    "skipContentPublishing": false,
    "delayBetweenBatches": 180000,
    "maxRetries": 3,
    "retryDelay": 5000
  },
  "rateLimits": {
    "enabled": true,
    "requestsPerSecond": 10,
    "requestsPerHour": 36000,
    "verbose": true
  }
}

Configuration Options

| Option | Description | Default | Recommended |
|---|---|---|---|
| batchSize | Assets per batch | 400 | 400-700 |
| delayBetweenBatches | Wait time between batches (ms) | 180000 | 180000-300000 |
| maxRetries | Retry attempts per batch | 3 | 3-5 |
| retryDelay | Initial retry delay (ms) | 5000 | 5000-10000 |
| rateLimits.enabled | Enable client-side rate limiting | true | true |
| rateLimits.requestsPerSecond | Max requests per second | 10 | 10 |
| rateLimits.requestsPerHour | Max requests per hour | 36000 | 36000 |
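To make the interplay of maxRetries and retryDelay concrete, the sketch below lists the wait before each retry under exponential backoff. The doubling factor and the helper name `backoffDelays` are assumptions for illustration; the tool's actual growth factor is not documented here.

```javascript
// Hypothetical helper: delay (ms) before each retry, growing by `factor` each time.
function backoffDelays(maxRetries, retryDelay, factor = 2) {
  return Array.from({ length: maxRetries }, (_, attempt) => retryDelay * factor ** attempt);
}

// With the defaults from the table (maxRetries = 3, retryDelay = 5000):
const delays = backoffDelays(3, 5000);
console.log(delays); // [ 5000, 10000, 20000 ]
```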

📚 Rate limiting details: docs/RATE-LIMITING.md

📖 Usage

Step 1: Export from Source Space

First, export your content from the source Contentful space:

npx contentful-export \
  --space-id SOURCE_SPACE_ID \
  --management-token SOURCE_TOKEN \
  --export-dir ./contentful-export \
  --download-assets

📚 Detailed guide: docs/EXPORT-GUIDE.md

Step 2: Clean Invalid Drafts (Optional but Recommended)

If your export contains draft entries with missing required fields or orphan drafts, clean them before importing:

npm run cleanup-drafts

Output:

📊 Analysis Summary:
─────────────────────────────────────────────────────────
Total Entries:                    11985
  ├─ Valid Published Entries:     11850
  ├─ Valid Draft Entries:         100
  ├─ Invalid Drafts:              25 ⚠️
  └─ Orphan Drafts:               10 ⚠️

Total Assets:                     4126
  ├─ Valid Published Assets:      4100
  ├─ Valid Draft Assets:          20
  └─ Invalid Asset Drafts:        6 ⚠️

Total Items to Remove:            41 🗑️

What it does:

  • Identifies draft entries with missing required fields
  • Finds orphan drafts (content type doesn't exist)
  • Detects invalid asset drafts (missing files)
  • Creates draft-cleanup-report.json with detailed analysis
  • Generates cleaned export: contentful-export/exported-space-cleaned.json
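The checks listed above can be sketched roughly as follows. This is an illustration, not the logic in bin/cleanup-drafts.js: the function name and result shape are made up, and it assumes an entry is a draft when `sys.publishedVersion` is absent and that a required field counts as missing when its key is not present in `fields`.

```javascript
// Sketch only: classify entries from an export shaped like
// { contentTypes: [...], entries: [...] }.
function classifyEntries(exportData) {
  const typesById = new Map(exportData.contentTypes.map((ct) => [ct.sys.id, ct]));
  const result = { published: [], validDrafts: [], invalidDrafts: [], orphanDrafts: [] };

  for (const entry of exportData.entries) {
    // Assumption: published entries carry sys.publishedVersion, drafts do not.
    if (entry.sys.publishedVersion !== undefined) {
      result.published.push(entry.sys.id);
      continue;
    }
    const contentType = typesById.get(entry.sys.contentType.sys.id);
    if (!contentType) {
      result.orphanDrafts.push(entry.sys.id); // content type does not exist
      continue;
    }
    const fields = entry.fields || {};
    const missingRequired = contentType.fields
      .filter((field) => field.required)
      .some((field) => fields[field.id] === undefined);
    (missingRequired ? result.invalidDrafts : result.validDrafts).push(entry.sys.id);
  }
  return result;
}

const sample = {
  contentTypes: [{ sys: { id: 'post' }, fields: [{ id: 'title', required: true }] }],
  entries: [
    { sys: { id: 'a', publishedVersion: 3, contentType: { sys: { id: 'post' } } }, fields: { title: { 'en-US': 'A' } } },
    { sys: { id: 'b', contentType: { sys: { id: 'post' } } }, fields: { title: { 'en-US': 'B' } } },
    { sys: { id: 'c', contentType: { sys: { id: 'post' } } }, fields: {} },
    { sys: { id: 'd', contentType: { sys: { id: 'gone' } } }, fields: {} },
  ],
};
const report = classifyEntries(sample);
console.log(report);
```

In this sample, `a` is published, `b` is a valid draft, `c` is an invalid draft (missing the required title), and `d` is an orphan draft.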

Next: Update your batch-config.json to use the cleaned file:

{
  "sourceFile": "./contentful-export/exported-space-cleaned.json"
}

Step 3: Split the Export

Split your large export into batches:

npm run split

Output:

🚀 Starting Contentful Export Splitter...
📊 Source data summary:
  - Assets: 4126
  - Entries: 11985
📦 Created 7 batches
✅ Splitting completed successfully!

Creates batches/ directory with subdirectories for each batch.
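One way such a split could keep asset-entry relationships intact is sketched below. It is a hypothetical strategy, not necessarily what bin/split.js does: assets are chunked by batch size, and each entry is placed in the latest batch that contains one of its linked assets, so every asset it references has been imported by the time the entry arrives. The names `chunk`, `linkedAssetIds`, and `splitExport` are invented for this example.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) chunks.push(items.slice(i, i + size));
  return chunks;
}

// Collect the IDs of all assets linked anywhere inside an entry's fields.
function linkedAssetIds(value, found = new Set()) {
  if (Array.isArray(value)) {
    value.forEach((item) => linkedAssetIds(item, found));
  } else if (value && typeof value === 'object') {
    if (value.sys && value.sys.type === 'Link' && value.sys.linkType === 'Asset') {
      found.add(value.sys.id);
    } else {
      Object.values(value).forEach((item) => linkedAssetIds(item, found));
    }
  }
  return found;
}

// Hypothetical strategy: each entry goes into the latest batch holding one of its assets.
function splitExport(exportData, batchSize) {
  const batches = chunk(exportData.assets, batchSize).map((assets) => ({ assets, entries: [] }));
  if (batches.length === 0) batches.push({ assets: [], entries: [] });
  const batchOfAsset = new Map();
  batches.forEach((batch, index) => batch.assets.forEach((asset) => batchOfAsset.set(asset.sys.id, index)));
  for (const entry of exportData.entries) {
    const indexes = [...linkedAssetIds(entry.fields)]
      .map((id) => batchOfAsset.get(id))
      .filter((index) => index !== undefined);
    const target = indexes.length > 0 ? Math.max(...indexes) : 0;
    batches[target].entries.push(entry);
  }
  return batches;
}

const link = (id) => ({ sys: { type: 'Link', linkType: 'Asset', id } });
const sampleExport = {
  assets: ['a1', 'a2', 'a3', 'a4', 'a5'].map((id) => ({ sys: { id } })),
  entries: [
    { sys: { id: 'e1' }, fields: { image: { 'en-US': link('a1') } } },
    { sys: { id: 'e2' }, fields: { gallery: { 'en-US': [link('a1'), link('a4')] } } },
    { sys: { id: 'e3' }, fields: { title: { 'en-US': 'No assets' } } },
  ],
};
const batches = splitExport(sampleExport, 2);
console.log(batches.map((b) => b.entries.map((e) => e.sys.id))); // [ [ 'e1', 'e3' ], [ 'e2' ], [] ]
```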

Step 4: Import Batches

Import all batches sequentially:

npm run import

Features:

  • Automatically imports content model in first batch
  • Waits between batches (prevents rate limiting)
  • Retries failed batches
  • Saves progress state

Expected time: 3-5 hours for ~4,000 assets (with rate limiting enabled)

Step 5: Validate Migration

Verify the migration was successful:

npm run validate

Output:

✅ Content Types         Source:     60 | Target:     60 | Diff: 0
✅ Entries               Source:  11985 | Target:  11985 | Diff: 0
✅ Assets                Source:   4126 | Target:   4126 | Diff: 0

🎉 Validation passed! All data migrated successfully.

📚 Detailed guide: docs/IMPORT-GUIDE.md

Step 6: Publish Draft Entries

After importing with skipContentPublishing: true, you need to publish draft entries. Choose the appropriate publishing strategy based on your needs:

Strategy 1: Cascade Publish (Recommended for Clean Dependencies)

Publishes entries in dependency order (entries with no dependencies first, then their dependents):

# Preview first
npm run cascade-publish:dry-run

# Publish
npm run cascade-publish

# Skip tagged entries
npm run cascade-publish -- --skip-tag skip-publish

Features:

  • ✅ Analyzes entry dependencies
  • ✅ Publishes in waves, ordered by dependency depth
  • ✅ Handles circular references gracefully
  • ✅ Can skip tagged entries with --skip-tag
  • ⚠️ May skip entries with circular dependencies

Output:

Analyzing dependencies...
Found 11985 draft entries
Publishing wave 1 (depth 0): 5000 entries
Publishing wave 2 (depth 1): 4000 entries
Publishing wave 3 (depth 2): 2500 entries
⚠️ Skipped 485 entries with circular dependencies
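The wave output above can be reproduced in miniature with a layering pass over a dependency map. This is a sketch under assumptions, not the code in bin/cascade-publish.js: `publishWaves` and its input shape (a Map from entry ID to the IDs it links to) are invented for the example.

```javascript
// deps: Map of entryId -> array of entryIds it links to.
// Links to IDs that are not keys of the map (e.g. already published) are ignored.
function publishWaves(deps) {
  const remaining = new Map();
  for (const [id, links] of deps) {
    remaining.set(id, new Set(links.filter((linkedId) => deps.has(linkedId))));
  }
  const waves = [];
  while (remaining.size > 0) {
    // An entry is ready once everything it links to sits in an earlier wave.
    const ready = [...remaining.keys()].filter((id) => remaining.get(id).size === 0);
    if (ready.length === 0) break; // only cycles (and entries depending on them) remain
    waves.push(ready);
    for (const id of ready) remaining.delete(id);
    for (const links of remaining.values()) {
      for (const id of ready) links.delete(id);
    }
  }
  return { waves, skipped: [...remaining.keys()] };
}

const deps = new Map([
  ['author', []],
  ['tag', []],
  ['post', ['author', 'tag']],
  ['page', ['post']],
  ['x', ['y']],
  ['y', ['x']],
]);
const plan = publishWaves(deps);
console.log(plan.waves);   // [ [ 'author', 'tag' ], [ 'post' ], [ 'page' ] ]
console.log(plan.skipped); // [ 'x', 'y' ]
```

Entries `x` and `y` reference each other, so no wave can ever contain them; they end up in `skipped`, which is the situation the brute force strategy is meant for.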

Strategy 2: Brute Force Publish (For Circular Dependencies)

Attempts to publish all drafts without analyzing dependencies. Run repeatedly until all are published:

# Preview first
npm run publish-all:dry-run

# Run repeatedly until complete
npm run publish-all
npm run publish-all  # Run again
npm run publish-all  # Keep running until failures reach 0

Features:

  • ✅ No dependency analysis required
  • ✅ Handles circular dependencies through repetition
  • ✅ Each run publishes what it can
  • ✅ Failed entries are saved to failed-entries.json

Output:

Total draft entries: 4000
✅ Successfully published: 1200
❌ Failed to publish: 2800
⏭️ Skipped (already published): 0

💡 TIP: Run this script again to retry failed entries.

How it works:

  1. First run: Publishes entries with no unpublished dependencies (~30%)
  2. Second run: Publishes entries whose dependencies were published in run 1 (~40%)
  3. Third run: More entries get published (~20%)
  4. Continue: Until all entries are published or failures stop decreasing
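The repeat-until-done loop described above could be automated along these lines. This is a sketch, not part of the tool: `publishUntilStable` and the `publishOnce(ids)` callback (which tries each ID once and resolves to the IDs that failed) are hypothetical, and the demo uses a fake publisher rather than the Contentful API.

```javascript
// Repeat publishing runs until nothing fails or a run makes no progress.
async function publishUntilStable(ids, publishOnce, maxRuns = 20) {
  let pending = ids;
  let runs = 0;
  while (pending.length > 0 && runs < maxRuns) {
    const failed = await publishOnce(pending);
    runs += 1;
    if (failed.length === pending.length) return { runs, failed }; // failures stopped decreasing
    pending = failed;
  }
  return { runs, failed: pending };
}

// Fake publisher: an entry succeeds once everything it needs was published in an
// earlier run; 'd' needs something that never exists, so it can never succeed.
const needs = { a: [], b: ['a'], c: ['b'], d: ['missing'] };
const published = new Set();
async function fakePublishOnce(ids) {
  const before = new Set(published); // a run only sees what earlier runs published
  const failed = [];
  for (const id of ids) {
    if (needs[id].every((dep) => before.has(dep))) published.add(id);
    else failed.push(id);
  }
  return failed;
}

const demo = publishUntilStable(Object.keys(needs), fakePublishOnce);
demo.then((result) => console.log(result)); // { runs: 4, failed: [ 'd' ] }
```

Stopping when a run makes no progress avoids looping forever on entries that can never publish; those are the candidates for manual investigation.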

Strategy 3: Selective Publishing with Tags

Tag problematic entries and skip them during publishing. Both cascade-publish and publish-all support --skip-tag:

# 1. Tag all current drafts with 'skip-publish'
npm run tag-drafts skip-publish -- --dry-run  # Preview
npm run tag-drafts skip-publish               # Tag them

# 2a. Use cascade publish (skipping tagged)
npm run cascade-publish -- --skip-tag skip-publish

# OR

# 2b. Use brute force publish (skipping tagged)
npm run publish-all -- --skip-tag skip-publish

# 3. Run repeatedly if using brute force
npm run publish-all -- --skip-tag skip-publish

# 4. When ready, untag and publish the remaining ones
npm run tag-drafts skip-publish -- --remove   # Remove tag
npm run publish-all                            # Publish remaining

Tag Management Commands:

# Tag draft entries
npm run tag-drafts <tag-name>                    # Tag all drafts
npm run tag-drafts <tag-name> -- --dry-run       # Preview tagging
npm run tag-drafts <tag-name> -- --remove        # Remove tag from drafts

# Publishing with tag filtering
npm run cascade-publish -- --skip-tag <tag-name>  # Smart publish, skip tagged
npm run publish-all -- --skip-tag <tag-name>      # Brute force, skip tagged

# Example: Tag with 'circular-dep'
npm run tag-drafts circular-dep
npm run cascade-publish -- --skip-tag circular-dep
# Or use: npm run publish-all -- --skip-tag circular-dep

Use cases:

  • Mark entries known to have circular dependencies
  • Skip problematic entries temporarily
  • Publish clean entries first, handle complex ones later
  • Test publishing on a subset of entries
  • Combine with cascade for efficient dependency-aware publishing
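Conceptually, --skip-tag amounts to filtering out entries that carry a given tag before publishing. The sketch below assumes tags appear as links under `entry.metadata.tags`, each with a `sys.id`; the helper names are made up and this is not the tool's own code.

```javascript
// True when the entry carries a tag link with the given ID in metadata.tags.
function hasTag(entry, tagId) {
  const tags = (entry.metadata && entry.metadata.tags) || [];
  return tags.some((tag) => tag.sys && tag.sys.id === tagId);
}

// Keep only the entries a run with `--skip-tag <tagId>` would still attempt.
function withoutTag(entries, tagId) {
  return entries.filter((entry) => !hasTag(entry, tagId));
}

const tagLink = (id) => ({ sys: { type: 'Link', linkType: 'Tag', id } });
const drafts = [
  { sys: { id: 'e1' }, metadata: { tags: [tagLink('skip-publish')] } },
  { sys: { id: 'e2' }, metadata: { tags: [] } },
  { sys: { id: 'e3' } }, // no metadata at all
];
const toPublish = withoutTag(drafts, 'skip-publish').map((entry) => entry.sys.id);
console.log(toPublish); // [ 'e2', 'e3' ]
```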

Strategy 4: Asset Publishing First

Publish all assets before entries (assets have no dependencies):

# Preview
npm run publish-assets:dry-run

# Publish all draft assets
npm run publish-assets

Then proceed with entry publishing using one of the strategies above.

Choosing the Right Strategy

| Scenario | Recommended Strategy |
|---|---|
| Clean migration, no circular deps | Cascade Publish |
| Known circular dependencies (100-1000 entries) | Brute Force Publish |
| Many circular dependencies (1000+ entries) | Selective with Tags (Cascade or Brute Force with --skip-tag) |
| Want to skip problematic entries | Cascade or Brute Force with --skip-tag |
| Assets only | Asset Publishing |
| Mixed approach | Assets → Cascade (skip tagged) → Brute Force for remaining |

Publishing Configuration

Create cascade-config.json for publishing scripts:

{
  "spaceId": "your-space-id",
  "environmentId": "master",
  "managementToken": "CFPAT-your-management-token",
  "host": "api.contentful.com"
}

Note: Use api.eu.contentful.com for EU spaces.

Step 7: Space Cleanup (Optional)

Sometimes you need to clean up a space before re-importing or to start fresh. Use the built-in clean-space script for selective cleanup.

Cleanup Scenarios

Scenario 1: Clean Entries Only (Keep Content Types & Assets)

Remove all entries but keep content types and assets:

# Preview first
npm run clean-space:dry-run

# Execute cleanup
npm run clean-space

What gets deleted:

  • ✅ All entries (published + draft)

What stays:

  • ❌ Content types (kept)
  • ❌ Assets (kept)

Use case: Remove all content but keep the content model and assets for fresh import.

Scenario 2: Clean Entries + Content Types (Keep Assets)

Remove entries and content types, but keep assets:

# Preview first
npm run clean-space -- --dry-run --content-types

# Execute cleanup
npm run clean-space -- --content-types

What gets deleted:

  • ✅ All entries
  • ✅ All content types

What stays:

  • ❌ Assets (kept)

Use case: When you want to re-import the content model and entries but keep existing assets, for example when retrying a migration into an EU space.

Scenario 3: Complete Cleanup (Delete Everything)

Remove everything - entries, content types, AND assets:

# Preview first
npm run clean-space -- --dry-run --content-types --assets

# Execute cleanup
npm run clean-space -- --content-types --assets

What gets deleted:

  • ✅ All entries
  • ✅ All content types
  • ✅ All assets

Use case: Complete fresh start.

Command Options

npm run clean-space [options]

Options:
  --dry-run           Preview what will be deleted (no actual deletion)
  --content-types     Delete content types after entries
  --assets            Delete assets as well
  --batch-size <n>    Number of concurrent operations (default: 10)

Features

  • ✅ Supports EU & US endpoints - Reads from cascade-config.json
  • ✅ Confirmation prompt - Shows space details and asks for Y/N confirmation
  • ✅ Dry-run mode - Preview before deleting
  • ✅ Rate limiting - 10 req/sec to respect API limits
  • ✅ Auto-unpublish - Unpublishes before deletion
  • ✅ Progress tracking - Real-time progress updates
  • ✅ Safe by default - Only deletes entries unless flags are provided
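The "safe by default" behaviour can be pictured as a small mapping from flags to a deletion plan. The parser below is a hypothetical sketch that mirrors the documented options; it is not the code in bin/clean-space.js, and the function name and result shape are assumptions.

```javascript
// Hypothetical flag parser mirroring the documented clean-space options.
function cleanupPlan(argv) {
  const batchFlag = argv.indexOf('--batch-size');
  const parsed = batchFlag >= 0 ? Number.parseInt(argv[batchFlag + 1], 10) : NaN;
  return {
    dryRun: argv.includes('--dry-run'),
    deleteEntries: true, // entries are always in scope
    deleteContentTypes: argv.includes('--content-types'),
    deleteAssets: argv.includes('--assets'),
    batchSize: Number.isInteger(parsed) && parsed > 0 ? parsed : 10, // documented default: 10
  };
}

const defaultPlan = cleanupPlan([]);
const previewPlan = cleanupPlan(['--dry-run', '--content-types', '--batch-size', '5']);
console.log(defaultPlan);
console.log(previewPlan);
```

With no flags, only entries are deleted; content types and assets are touched only when their flags are passed explicitly.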

Example Workflows

Workflow 1: Re-import After Failed Migration

# 1. Clean entries + content types (keep assets)
npm run clean-space -- --content-types

# 2. Re-import
npm run import

# 3. Publish
npm run publish-assets
npm run cascade-publish

Workflow 2: Clean Content Only, Keep Model

# 1. Clean only entries (keep model + assets)
npm run clean-space

# 2. Import new content
npm run import

Workflow 3: Complete Fresh Start

# 1. Backup first (optional but recommended)
npx contentful-export \
  --space-id YOUR_SPACE_ID \
  --management-token YOUR_TOKEN \
  --export-dir ./backup

# 2. Clean everything
npm run clean-space -- --content-types --assets

# 3. Import from scratch
npm run import

⚠️ Important Warnings

  1. DESTRUCTIVE OPERATION - Cannot be undone
  2. Test on staging first - Always test cleanup on a non-production environment
  3. Backup before cleanup - Create an export before running cleanup
  4. Check cascade-config.json - Ensure it points to the correct space (EU or US)
  5. Use dry-run first - Always preview with --dry-run before actual deletion
  6. Deletes published content - This removes both draft AND published content

Configuration

The script uses cascade-config.json for space credentials:

{
  "spaceId": "69zmfo9ko3qk",
  "environmentId": "master",
  "managementToken": "CFPAT-your-management-token",
  "host": "api.eu.contentful.com"
}

Make sure this file points to the correct space before running cleanup!

Confirmation Prompt

When running cleanup (not in dry-run mode), you'll see space details and be asked to confirm:

🔗 Connecting to Contentful...
✅ Connected

📋 TARGET SPACE DETAILS:
================================================================================
Organization: Your Organization Name
Space Name:   Your Space Name
Space ID:     69zmfo9ko3qk
Environment:  master
API Host:     api.eu.contentful.com
================================================================================

⚠️  WHAT WILL BE DELETED:
================================================================================
Mode: LIVE CLEANUP

  ✅ All entries (published + draft)
  ✅ Content types (will be deleted)
  ❌ Assets (will be kept)
================================================================================

⚠️  WARNING: This operation is DESTRUCTIVE and cannot be undone!
⚠️  Make sure you have a backup before proceeding.

Are you sure you want to proceed? (Y/N):

Type Y or yes to proceed, or N to cancel.

Example: Cleaning an EU Space

To clean an EU space (entries + content types, keep assets):

# 1. Verify cascade-config.json points to EU space
cat cascade-config.json

# 2. Preview what will be deleted (no confirmation needed)
npm run clean-space -- --dry-run --content-types

# 3. Execute cleanup (will ask for confirmation)
npm run clean-space -- --content-types
# You'll see space details and must type Y to proceed

# 4. Re-import
npm run import

Resume Failed Import

If import fails or is interrupted:

npm run resume

Automatically detects where to resume and continues.
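Resuming boils down to skipping batches already recorded as done. The sketch below assumes a state object with a `completed` array of batch names; the real layout of batches/import-state.json may differ, and `batchesToRun` is an invented helper.

```javascript
// Hypothetical state shape: { completed: ['batch-01', ...], failed: ['batch-03'] }.
function batchesToRun(allBatches, state) {
  const done = new Set(state.completed || []);
  return allBatches.filter((name) => !done.has(name)); // failed and untouched batches remain
}

const allBatches = ['batch-01', 'batch-02', 'batch-03', 'batch-04'];
const savedState = { completed: ['batch-01', 'batch-02'], failed: ['batch-03'] };
const remainingBatches = batchesToRun(allBatches, savedState);
console.log(remainingBatches); // [ 'batch-03', 'batch-04' ]
```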

📁 Project Structure

contentful-batch-migrator/
├── bin/                           # Executable scripts
│   ├── rateLimiter.js            # Token bucket rate limiter
│   ├── cleanup-drafts.js         # Remove invalid/orphan drafts
│   ├── clean-space.js            # Delete entries/content types/assets from space
│   ├── split.js                  # Split large exports into batches
│   ├── import.js                 # Import batches with rate limiting
│   ├── import-cli.js             # Import batches using Contentful CLI
│   ├── import-direct.js          # Direct import without batching
│   ├── validate.js               # Validate migration success
│   ├── resume.js                 # Resume interrupted migrations
│   ├── resume-cli.js             # Resume interrupted CLI migrations
│   ├── cascade-publish.js        # Smart publish with dependency resolution
│   ├── publish-all-drafts.js     # Brute force publish all drafts
│   ├── tag-drafts.js             # Tag/untag draft entries
│   ├── publish-assets.js         # Publish all draft assets
│   ├── strip-validations.js      # Remove content type validations
│   └── restore-validations.js    # Restore content type validations
├── docs/                          # Documentation
│   ├── EXPORT-GUIDE.md           # Detailed export instructions
│   ├── IMPORT-GUIDE.md           # Detailed import instructions
│   ├── RATE-LIMITING.md          # Rate limiting details
│   └── TROUBLESHOOTING.md        # Common issues and solutions
├── batch-config.json              # Batch import configuration
├── batch-config.example.json      # Batch import config template
├── cascade-config.json            # Publishing configuration
├── cascade-config.example.json    # Publishing config template
├── package.json                   # Dependencies and scripts
├── README.md                      # This file
├── CONTRIBUTING.md                # Contribution guidelines
├── LICENSE                        # MIT License
└── contentful-export/             # Your exported data (not in repo)
    ├── exported-space.json
    └── [asset directories]

🎬 Example Migrations

Example 1: Large-Scale Migration with Circular Dependencies

Scenario: Migrate 10,000+ assets (12.6GB) and 25,000+ entries with 4,000+ circular dependencies

# 1. Export from US space
npx contentful-export \
  --space-id us-space-123 \
  --management-token US_TOKEN \
  --export-dir ./contentful-export \
  --download-assets

# 2. Clean invalid drafts (recommended)
npm run cleanup-drafts
# Cleaned 2,000+ draft entries

# 3. Strip validations
npm run strip-validations
# Stripped 187 validations from 52 content types

# 4. Configure target (EU space)
cp batch-config.example.json batch-config.json
# Edit batch-config.json with EU space credentials
# Set "skipContentPublishing": true

# 5. Import using CLI (assets first)
npm run import:cli
# Takes ~4-5 hours for 10,000+ assets

# 6. Validate
npm run validate
# All checks pass ✅

# 7. Publish assets
npm run publish-assets
# ~17 minutes for 10,000+ assets

# 8. Publish entries with cascade
npm run cascade-publish
# ~42 minutes for 21,000 entries (4,000 skipped due to circular deps)

# 9. Brute force publish circular dependencies
npm run publish-all  # Run 8-12 times until complete
# ~80 minutes total for 4,000 entries

# 10. Restore validations
npm run restore-validations
# Restored 187 validations to 52 content types

Result: Successfully migrated and published 35,000+ items in ~12 hours!

Example 2: Using Custom Import Script for Better Control

Scenario: Same migration but using custom script with advanced rate limiting

# 1-3. Same as Example 1 (export, clean drafts, strip validations)

# 4. Configure for custom script
cp batch-config.example.json batch-config.json
# Enable rate limiting:
# "rateLimits": { "enabled": true, "requestsPerSecond": 10 }

# 5. Split into batches
npm run split
# Output: Created 26 batches (400 assets each)

# 6. Import with custom script (all-in-one: assets + entries)
npm run import
# Takes ~8-10 hours with built-in rate limiting and state tracking

# 7. Validate
npm run validate
# All checks pass ✅

# 8-10. Same publishing steps as Example 1
npm run publish-assets
npm run cascade-publish
npm run publish-all  # Repeat 8-12 times
npm run restore-validations

Result: Same successful migration with more granular control and better resume capability!

Example 3: Selective Publishing with Tags

Scenario: Mark problematic entries and publish clean ones first

# 1-6. Same as Example 1 (export, clean, import, validate)

# 7. Identify and tag problematic entries
# After investigating failed-entries.json from a test run
# Manually tag ~500 problematic entries in Contentful UI with 'skip-publish'

# 8. Publish assets
npm run publish-assets

# 9. Cascade publish (skipping tagged - respects dependencies)
npm run cascade-publish -- --skip-tag skip-publish
# Published 20,500 entries in dependency order
# Skipped 500 tagged + 4,000 circular ones

# 10. Brute force remaining circular deps (except tagged)
npm run publish-all -- --skip-tag skip-publish
# Run 1: Published 50, Failed 35
npm run publish-all -- --skip-tag skip-publish
# Run 2: Published 30, Failed 5
npm run publish-all -- --skip-tag skip-publish
# Run 3: Published 5, Failed 0

# 11. Eventually handle the 500 tagged ones
npm run tag-drafts skip-publish -- --remove
npm run publish-all
# Run several times until complete

Result: Clean entries published efficiently with cascade, circular deps resolved with brute force, problematic ones handled separately!

🐛 Troubleshooting

Rate Limiting (429 Errors)

Solution: Increase the delay between batches, for example from 3 minutes (180000 ms) to 5 minutes (300000 ms)

{
  "importOptions": {
    "delayBetweenBatches": 300000
  }
}

Import Failures Due to Invalid Drafts

Solution: Clean invalid drafts before importing

npm run cleanup-drafts
# Review draft-cleanup-report.json
# Update batch-config.json to use cleaned file

Import Failures

  1. Check logs: batches/logs/batch-XX-errors.log
  2. Resume import: npm run resume
  3. If persists, reduce batch size

Validation Mismatches

  1. Check failed batches: batches/import-state.json
  2. Review error logs
  3. Retry failed batches: npm run resume

Need to Start Fresh or Re-import

Problem: Migration is too broken to fix, or you want to start over

Solution: Use the built-in space cleanup script

# Clean everything except assets (fastest way to retry)
npm run clean-space -- --content-types

# Then re-import
npm run import

When to use:

  • Import created corrupted data
  • Want to test different import strategies
  • Content model changes require fresh import
  • Migration failed multiple times and recovery is too complex

Features:

  • ✅ Supports EU & US API endpoints (reads from cascade-config.json)
  • ✅ Shows space details before deletion (org, space name, space ID, environment)
  • ✅ Requires Y/N confirmation prompt
  • ✅ Dry-run mode for safety
  • ✅ Selective cleanup (entries, content types, assets)
  • ✅ Auto-unpublish before deletion

See: Step 7 in Usage section for detailed cleanup options

Publishing Issues

Circular Dependencies Won't Resolve

Problem: publish-all keeps failing with same entries after many runs

Solutions:

# 1. Check failed-entries.json for patterns
cat failed-entries.json | grep "error" | sort | uniq -c

# 2. Tag problematic entries and skip them temporarily
npm run tag-drafts circular-dep
npm run publish-all -- --skip-tag circular-dep

# 3. Manually investigate and fix in Contentful UI
# - Break circular references
# - Publish dependencies manually
# - Then retry: npm run publish-all

All Entries Failing to Publish

Problem: 100% failure rate in publish-all

Possible causes:

  1. Assets not published yet → Run npm run publish-assets first
  2. Wrong configuration → Verify cascade-config.json credentials
  3. Permissions issue → Check management token has publish permissions
  4. Network/API issues → Wait and retry

Tags Not Working

Problem: tag-drafts fails or tags not appearing

Solution:

# 1. Verify the tag was created
# Check in Contentful UI → Settings → Tags

# 2. Ensure tag ID doesn't have special characters
# Use simple names: skip-publish, circular-dep, problematic

# 3. Check metadata in Contentful API
# Tags should appear in entry.metadata.tags

Publishing Too Slow

Problem: Publishing takes hours

Solutions:

  1. Skip analysis: Use publish-all instead of cascade-publish (no dependency analysis overhead)
  2. Filter by tag: Tag and skip entries you know will fail
  3. Parallel runs: If you have multiple environments, publish them in parallel
  4. Rate limiting: The 100ms delay (10 req/sec) is the safe default. You can reduce it to 50ms (20 req/sec) in the code, but that exceeds the 10 req/sec limit the rest of this tool assumes and may trigger 429 errors

📚 Full guide: docs/TROUBLESHOOTING.md

🧪 Testing

Test with a small batch first, for example 100 assets per batch:

{
  "batchSize": 100
}

Then monitor the first batch import closely before proceeding with full migration.

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for details.

Quick Contribution Guide

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

📊 Stats & Performance

Tested with:

  • ✅ 10,000+ assets (12.6GB)
  • ✅ 25,000+ content entries
  • ✅ 2,000+ draft entries (cleaned before migration)
  • ✅ 50+ content types with 187 validations
  • ✅ 700+ tags
  • ✅ 4,000+ circular dependency entries

Import Performance:

  • Average batch import: 20-30 minutes per batch (with rate limiting)
  • Full migration (26 batches): 8-10 hours (with custom script + rate limiting)
  • Assets-only import (CLI): 4-5 hours for 10,000+ assets (12.6GB)
  • Content-only import: 2-3 hours for 25,000+ entries
  • Success rate: 100% (with retries)

Publishing Performance:

  • Asset publishing: ~17 minutes for 10,000+ assets (10 req/sec)
  • Cascade publish: ~42 minutes for 21,000 entries (10 req/sec)
  • Brute force publish: 8-12 iterations for 4,000+ circular dependencies (~10 min per iteration, ~80 min total)
  • Tagging: ~7 minutes per 1,000 entries (10 req/sec)
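These figures can be sanity-checked against the rate limit with simple arithmetic. The helper below is an illustrative estimate only; it assumes one API request per item by default, which is a simplification (a real publish may cost more than one call).

```javascript
// Lower bound on duration (minutes) when each item costs `requestsPerItem` API calls.
function minMinutes(items, requestsPerSecond = 10, requestsPerItem = 1) {
  return (items * requestsPerItem) / requestsPerSecond / 60;
}

console.log(minMinutes(10000).toFixed(1)); // "16.7"
console.log(minMinutes(21000).toFixed(1)); // "35.0"
```

At 10 req/sec, 10,000 assets need at least about 16.7 minutes, which lines up with the ~17 minutes reported above. The ~42 minutes for 21,000 entries sits above the 35-minute floor, which would be consistent with extra per-entry overhead such as dependency analysis.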

Overall Migration Time:

  • Method A (CLI Import): 11-12 hours total
  • Method B (Custom Script): 13-14 hours total

🗺️ Roadmap

  • [x] Client-side rate limiting - Token bucket algorithm to respect API limits
  • [x] Draft cleanup utility - Identify and remove invalid/orphan drafts before migration
  • [x] Cascade publish - Smart publishing with dependency resolution
  • [x] Brute force publish - Handle circular dependencies through repetition
  • [x] Tag management - Tag and filter draft entries for selective publishing
  • [ ] Webhook integration - Trigger notifications on migration completion
  • [ ] Parallel batch imports - Import multiple batches simultaneously
  • [ ] Incremental migrations - Sync only changed content

⚠️ Important Notes

  1. Management Token: Keep your CMA token secure, never commit it
  2. Test First: Always test on a staging environment
  3. Backup: Create a space snapshot before importing
  4. Rate Limits: Respect Contentful's API rate limits (10 req/sec, 36K req/hour)
  5. Asset Files: Ensure all asset files are downloaded locally
  6. Publishing Configuration: Create cascade-config.json for publishing scripts (separate from batch-config.json)
  7. Circular Dependencies: Use brute force publish for entries with circular references
  8. Draft Publishing: Always set skipContentPublishing: true during import, then publish separately
  9. Space Cleanup: Use npm run clean-space for selective cleanup (supports EU & US endpoints)
  10. Cleanup is Destructive: Space cleanup operations cannot be undone - always use dry-run first

πŸ“‹ Quick Reference

Space Cleanup Commands

# Clean entries only (keep content types & assets)
npm run clean-space:dry-run
npm run clean-space

# Clean entries + content types (keep assets)
npm run clean-space -- --dry-run --content-types
npm run clean-space -- --content-types

# Clean everything (entries + content types + assets)
npm run clean-space -- --content-types --assets

Publishing Commands

Basic Publishing

# Publish assets (always do this first)
npm run publish-assets

# Publish entries with smart dependency resolution
npm run cascade-publish

# Cascade publish with tag filtering
npm run cascade-publish -- --skip-tag skip-publish

# Preview before publishing (dry run)
npm run cascade-publish:dry-run

For Circular Dependencies

# Brute force - run repeatedly until all published
npm run publish-all
npm run publish-all  # Run multiple times

# Check progress
wc -l < failed-entries.json  # Line count, a rough indicator of remaining failures

Tag Management

# Tag all drafts
npm run tag-drafts skip-publish

# Publish everything except tagged
npm run publish-all -- --skip-tag skip-publish

# Remove tag when ready
npm run tag-drafts skip-publish -- --remove
npm run publish-all  # Publish the previously tagged ones

Common Workflows

Publishing Workflows

# Workflow 1: Standard (no circular deps)
npm run publish-assets && npm run cascade-publish

# Workflow 2: With circular deps
npm run publish-assets
npm run cascade-publish
# Some will be skipped, use brute force for remaining
npm run publish-all  # Repeat until failures = 0

# Workflow 3: Selective with cascade (tag problematic ones first)
npm run tag-drafts problematic
npm run publish-assets
npm run cascade-publish -- --skip-tag problematic
# Clean entries published, now handle problematic ones
npm run tag-drafts problematic -- --remove
npm run publish-all  # Brute force for remaining

# Workflow 4: Selective with brute force only
npm run tag-drafts problematic
npm run publish-assets
npm run publish-all -- --skip-tag problematic
# Handle problematic ones separately later

Cleanup & Re-import Workflows

# Workflow 5: Clean and re-import (keep assets)
npm run clean-space -- --content-types
npm run import
npm run publish-assets
npm run cascade-publish

# Workflow 6: Complete fresh start
npm run clean-space -- --content-types --assets
npm run import
npm run publish-all

# Workflow 7: Clean content only, keep model
npm run clean-space
npm run import:direct  # Import without splitting

Made with ❤️ for the Contentful community

If this tool helped you, please ⭐ star the repo!
