The single most important rule for deploying any database on AWS is: never put database data on the OS disk. The AWS Marketplace offerings for MongoDB, PostgreSQL, MySQL, and Elasticsearch follow this pattern. By default, the root EBS volume is deleted when its instance terminates, so anything stored on it shares the instance's lifecycle. A separate EBS data volume is a standalone resource that survives instance termination, replacement, and failure.
This template applies that principle to Neo4j Community Edition.
By default, if the database data lives on the root EBS volume and the ASG replaces the instance — whether from a health check failure or underlying hardware failure — the data is lost. The ASG terminates the old instance, which deletes its root volume, then launches a fresh instance from the AMI with a blank disk. The ASG has no mechanism to preserve the old root volume.
With a separate data volume, the EBS volume is a standalone CloudFormation resource that the ASG has no knowledge of and cannot delete. When the new instance boots, it attaches and mounts the existing data volume, picking up exactly where the old instance left off. Without that separation, an ASG "self-healing" a failed instance would also destroy the database it was trying to protect.
┌─────────────────────────────────────────────────┐
│  VPC (10.0.0.0/16)                              │
│                                                 │
│  ┌────────────────────────────────────────────┐ │
│  │  Public Subnet (10.0.1.0/24)               │ │
│  │                                            │ │
│  │  ┌──────────────────────────────────────┐  │ │
│  │  │  Auto Scaling Group (size: 1)        │  │ │
│  │  │                                      │  │ │
│  │  │  ┌────────────────────────────────┐  │  │ │
│  │  │  │  EC2 Instance                  │  │  │ │
│  │  │  │                                │  │  │ │
│  │  │  │  Root EBS (OS + Neo4j binary)  │  │  │ │
│  │  │  │  /dev/xvda: destroyed with     │  │  │ │
│  │  │  │  instance                      │  │  │ │
│  │  │  └────────────────────────────────┘  │  │ │
│  │  └──────────────────────────────────────┘  │ │
│  │                    │                       │ │
│  │                    │ attached at boot      │ │
│  │                    ▼                       │ │
│  │  ┌──────────────────────────────────────┐  │ │
│  │  │  Data EBS Volume (GP3, encrypted)    │  │ │
│  │  │  /data: RETAINED on deletion         │  │ │
│  │  │  Survives instance replacement       │  │ │
│  │  └──────────────────────────────────────┘  │ │
│  │                                            │ │
│  └────────────────────────────────────────────┘ │
│                                                 │
│  ┌──────────────┐  ┌────────────────────────┐   │
│  │  Elastic IP  │  │  Internet Gateway      │   │
│  │  (stable)    │  │                        │   │
│  └──────────────┘  └────────────────────────┘   │
└─────────────────────────────────────────────────┘
| Concern | Root disk | Separate EBS volume |
|---|---|---|
| Instance termination | Data lost | Data preserved |
| ASG replacement | Data lost | Volume re-attached to new instance |
| Disk sizing | Competing with OS for space | Sized independently for the database |
| Snapshots | Includes OS noise | Clean database-only backups |
| Encryption | Mixed concern | Dedicated encryption context |
| Performance tuning | Shared IOPS with OS | Dedicated IOPS for database I/O |
This is not unique to Neo4j. MongoDB, PostgreSQL, MySQL, and Elasticsearch Marketplace offerings all use the same separation. The consistent theme across all of them: never put database data on the OS disk.
AWS Marketplace recommends an Auto Scaling Group even for single-instance products. The ASG monitors EC2 health checks and automatically replaces a failed instance. On replacement, the new instance re-attaches the data volume and re-associates the Elastic IP, restoring service without manual intervention.
The Elastic IP is a stable public address that follows the instance across ASG replacements. Clients connect to the same IP regardless of which underlying EC2 instance is running. UserData re-associates the EIP on every boot.
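The re-association step can be sketched in Python as follows. This is a minimal illustration, not the template's actual UserData: it assumes a boto3 EC2 client is passed in, and the function name `associate_elastic_ip` and both ID parameters are hypothetical. `AllowReassociation=True` lets the new instance take over an address that may still be associated with the terminated one.

```python
def associate_elastic_ip(ec2_client, allocation_id: str, instance_id: str) -> str:
    """Point the Elastic IP at this instance and return the association ID.

    `ec2_client` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2_client.associate_address(
        AllocationId=allocation_id,
        InstanceId=instance_id,
        # Take the address over even if it is still associated with the
        # previous (terminating) instance.
        AllowReassociation=True,
    )
    return response["AssociationId"]
```

Because the call is idempotent for an address that already points at the instance, running it on every boot is safe.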
Nitro-based instances expose EBS volumes as NVMe devices with non-deterministic names. The UserData script matches the EBS volume serial number against /dev/disk/by-id/ entries to find the correct device, rather than assuming a fixed device path like /dev/nvme1n1.
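A rough Python sketch of that lookup is shown below. It assumes the NVMe serial is the volume ID with the hyphen removed and that udev creates links ending in that serial (for example `nvme-Amazon_Elastic_Block_Store_vol0123...`); the function name `find_ebs_device` is hypothetical, and the real UserData script may do this differently.

```python
import os


def find_ebs_device(volume_id: str, by_id_dir: str = "/dev/disk/by-id") -> str:
    """Return the real device path for an EBS volume ID such as 'vol-0abc...'."""
    # Assumption: the NVMe serial is the volume ID without its hyphen.
    serial = volume_id.replace("-", "")
    for name in sorted(os.listdir(by_id_dir)):
        # Only whole-disk links end with the serial; partition links
        # carry a '-partN' suffix and are skipped.
        if name.endswith(serial):
            # Follow the symlink to the kernel device name, e.g. /dev/nvme1n1.
            return os.path.realpath(os.path.join(by_id_dir, name))
    raise FileNotFoundError(f"no device in {by_id_dir} matches {volume_id}")
```

Matching on the serial rather than on a name like /dev/nvme1n1 keeps the result correct even when the kernel enumerates the devices in a different order after a reboot.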
The script checks for an existing filesystem with blkid before formatting. On first boot, the volume is blank and gets an ext4 filesystem. On subsequent boots (after ASG replacement), the existing filesystem and data are preserved.
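The format-only-if-blank logic might look like the sketch below. It is an illustration under stated assumptions: the helper names `has_filesystem` and `mount_commands` are hypothetical, and the commands are returned as lists rather than executed so the decision can be inspected before anything touches the disk.

```python
import subprocess


def has_filesystem(device: str, run=subprocess.run) -> bool:
    """True if blkid reports a filesystem type on the device."""
    result = run(
        ["blkid", "-o", "value", "-s", "TYPE", device],
        capture_output=True,
        text=True,
        check=False,
    )
    # blkid exits non-zero and prints nothing when it finds no signature.
    return result.returncode == 0 and result.stdout.strip() != ""


def mount_commands(device: str, mount_point: str, formatted: bool) -> list[list[str]]:
    """Build the commands for this boot; mkfs is included only for a blank volume."""
    commands = []
    if not formatted:
        commands.append(["mkfs.ext4", device])
    commands.append(["mkdir", "-p", mount_point])
    commands.append(["mount", device, mount_point])
    return commands
```

A caller would run `mount_commands(device, "/data", has_filesystem(device))`, so an already formatted volume is never passed to `mkfs.ext4`.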
GP3 provides a baseline of 3,000 IOPS and 125 MiB/s throughput regardless of volume size, at a lower per-GiB price than GP2. Encryption at rest is enabled by default, which is a baseline requirement for any database volume.
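For illustration, a data volume with these settings could be declared as the CloudFormation resource below, built here as a Python dictionary and serialised to JSON. The function name `data_volume_resource`, the logical ID, and the example zone and size are assumptions; the actual template may set different values.

```python
import json


def data_volume_resource(availability_zone: str, size_gib: int) -> dict:
    """Build an AWS::EC2::Volume resource for the database data volume."""
    return {
        "Type": "AWS::EC2::Volume",
        # Keep the volume (and the data on it) when the stack is deleted.
        "DeletionPolicy": "Retain",
        "Properties": {
            "AvailabilityZone": availability_zone,
            "Size": size_gib,
            "VolumeType": "gp3",
            "Iops": 3000,        # gp3 baseline
            "Throughput": 125,   # gp3 baseline, in MiB/s
            "Encrypted": True,
        },
    }


if __name__ == "__main__":
    # Hypothetical logical ID, zone, and size, for illustration only.
    resources = {"Neo4jDataVolume": data_volume_resource("us-east-1a", 100)}
    print(json.dumps(resources, indent=2))
```

The volume must be created in the same Availability Zone as the instance that will attach it, which is why the zone is a required argument here.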
- Install Neo4j Community Edition from yum.neo4j.com
- Wait for data volume to become available (may be detaching from a terminated instance)
- Attach data volume to this instance
- Resolve the NVMe device path
- Format the volume (first boot only)
- Mount at /data
- Associate the Elastic IP
- Install APOC plugin (if enabled)
- Configure Neo4j (network, memory, security)
- Set admin password (first boot only)
- Start Neo4j
- Signal CloudFormation success
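The wait step in the sequence above is the one most likely to need a retry loop, because the volume can still be detaching from the terminated instance. A minimal polling sketch follows; the function name `wait_until_available`, the timeout, and the interval are assumptions, and `get_state` stands in for whatever call reports the volume state.

```python
import time
from typing import Callable


def wait_until_available(
    get_state: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll until the volume state is 'available' or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "available":
            return
        if clock() >= deadline:
            raise TimeoutError(f"volume still '{state}' after {timeout_s} seconds")
        sleep(interval_s)
```

With boto3, `get_state` could be `lambda: ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]["State"]`. Failing loudly on timeout matters here: attaching or formatting should not proceed while another instance still holds the volume.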
| Product | Data volume | Resilience | Stable endpoint |
|---|---|---|---|
| Neo4j CE (this template) | Separate EBS, retained | ASG of 1 | Elastic IP |
| MongoDB Community | Separate EBS | ASG of 1 | Elastic IP |
| PostgreSQL | Separate EBS for /var/lib/pgsql | ASG of 1 | Elastic IP or ENI |
| MySQL | Separate EBS for /var/lib/mysql | ASG of 1 | Elastic IP or ENI |
| Elasticsearch | Separate EBS for data path | ASG of 1 | Elastic IP |
The implementation details vary, but the architecture is the same: compute is disposable, data is persistent, and the two are on separate volumes.