| 1 | +# Overview and Architecture |
| 2 | + |
| 3 | +The documents in this directory describe how to create your own DANDI ecosystem (i.e., a clone of the entire DANDI ecosystem). |
| 4 | +It is suggested that you briefly read through each of the documents in this guide before starting. |
| 5 | + |
| 6 | +This section provides a high-level view of how DANDI’s core components fit together in a typical “full stack” deployment. |
| 7 | + |
| 8 | +## The Big Picture |
| 9 | + |
| 10 | +The DANDI platform is essentially composed of: |
| 11 | + |
| 12 | +1. **Storage**: S3 buckets (AWS) where data actually resides. |
| 13 | +2. **API**: A Django/Resonant-based backend application (hosted on Heroku) that handles the DANDI data model, user authentication, and orchestrates S3 interactions. |
| 14 | +3. **Frontend**: A Vue-based web application (hosted on Netlify) for users to browse, search, and manage data in the archive. |
| 15 | +4. **Workers**: Celery workers (also on Heroku) for asynchronous tasks such as file checksum calculations, analytics, and housekeeping. |
| 16 | +5. **Observability**: Log aggregation and alerting (Heroku logs, optional additional logs), plus Sentry for error-tracking and notifications. |
| 17 | +6. **Infrastructure as Code**: Terraform scripts that glue everything together, including AWS (S3, Route 53, etc.), Netlify, and Heroku. |
| 18 | + |
| 19 | +These services interconnect as follows: |
| 20 | + |
| 21 | +<img |
| 22 | +src="../img/client_requests.jpg" |
| 23 | +alt="client_requests" |
| 24 | +style="width: 90%; height: auto; display: block; margin-left: auto; margin-right: auto;"/> |
| 25 | + |
| 26 | +* The user (or script) interacts with the **Web UI** or the **DANDI CLI**. |
| 27 | +* The **Web UI** calls into the **API** (over HTTPS). |
| 28 | +* The **API** queries or updates metadata in its Postgres DB (hosted on Heroku). |
| 29 | +* The **API** calls AWS S3 to read/write DANDI assets. |
| 30 | +* Heavy or long-running background work is queued as Celery tasks and handled by the **Workers**. |
| 31 | +* Domain names, certificates, and load-balancing records are handled by AWS Route 53 or Netlify’s DNS, depending on whether the record is for the API subdomain or the apex domain used by the UI. |
| 32 | +* Large chunks of data can be streamed from S3 directly to the client via presigned URLs. |
| 33 | + |
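To make the presigned-URL step above concrete, here is a minimal sketch of how an AWS Signature Version 4 presigned `GET` URL for an S3 object is assembled. It is an illustration under stated assumptions, not the DANDI API's actual code: the function name `presign_s3_get`, the bucket, the key, and the credentials are all hypothetical, and a real deployment would normally rely on an SDK call such as boto3's `generate_presigned_url` rather than signing by hand.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the SigV4 key-derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_s3_get(bucket, key, access_key, secret_key, region,
                   expires=3600, now=None):
    """Build a SigV4 presigned GET URL (hypothetical helper, stdlib only)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"
    scope = f"{date_stamp}/{region}/s3/aws4_request"

    # Query parameters that carry the signing metadata, sorted by name.
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}"
        for k, v in sorted(params.items())
    )
    canonical_uri = "/" + quote(key, safe="/")

    # Canonical request: method, path, query, headers, signed headers, payload.
    canonical_request = "\n".join([
        "GET",
        canonical_uri,
        canonical_query,
        f"host:{host}\n",  # each canonical header line ends with a newline
        "host",
        "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: secret -> date -> region -> service -> terminator.
    signing_key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date_stamp, region, "s3", "aws4_request"):
        signing_key = _hmac_sha256(signing_key, part)
    signature = hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"


if __name__ == "__main__":
    # Placeholder credentials and names; nothing here refers to a real bucket.
    print(presign_s3_get("example-bucket", "blobs/example.nwb",
                         "AKIAEXAMPLE", "example-secret", "us-east-2"))
```

Because the signature is part of the URL itself, the client can fetch the object straight from S3 until the URL expires, without the download passing through the API server.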
| 34 | +## Key Components |
| 35 | + |
| 36 | +<img |
| 37 | +src="../img/deployment.jpg" |
| 38 | +alt="dandi_deployment" |
| 39 | +style="width: 90%; height: auto; display: block; margin-left: auto; margin-right: auto;"/> |
| 40 | + |
| 41 | + |
| 42 | +### 1. AWS S3 Storage |
| 43 | + |
| 44 | +* **Primary Storage**: S3 buckets are the primary storage of the data (Zarr, NWB, etc.). |
| 45 | +* **Configured via Terraform**: Bucket creation, IAM policies, log routing, etc., are specified in `terraform/*.tf`. |
| 46 | +AWS provides storage buckets, as well as domain management, for resources across the DANDI ecosystem. |
| 47 | + |
| 48 | +### 2. Heroku |
| 49 | + |
| 50 | +Provisions the servers, worker processes, and the database for the API. |
| 51 | + |
| 52 | +1. **API**: Django, extended by [Resonant](https://github.com/kitware-resonant/terraform-heroku-resonant), provides REST endpoints for metadata, asset management, versioning, and authentication. |
| 53 | +2. **Postgres**: Stores user metadata, dandiset metadata, and references to S3 objects. |
| 54 | +3. **Workers (Celery)**: Offload long-running tasks (checksums, analytics, zarr validation, etc.). |
| 55 | + |
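As an illustration of the kind of long-running work that is offloaded to workers, the sketch below computes a checksum over a file-like object in fixed-size chunks, so that a large asset never has to be held in memory at once. The function name `sha256_stream`, the chunk size, and the choice of SHA-256 are assumptions made for this example; they are not taken from the DANDI codebase, and in a real deployment a function like this would be wrapped in a Celery task.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_stream(stream: BinaryIO, chunk_size: int = 8 * 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a binary stream, read chunk by chunk."""
    digest = hashlib.sha256()
    # read() returns b"" at end of stream, which ends the loop.
    while chunk := stream.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # In-memory stand-in for an asset; a worker would open the real object.
    print(sha256_stream(io.BytesIO(b"example asset bytes")))
```

Running this inside a background worker rather than in the web request keeps API responses fast even when assets are many gigabytes in size.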
| 56 | +### 3. Netlify (UI) |
| 57 | + |
| 58 | +* **Frontend server**: Serves a static build of the DANDI Archive frontend (Vue.js). |
| 59 | +* **Autodeployment**: On each push or merge to `main` (or whichever branch is configured), Netlify automatically builds and deploys. |
| 60 | +* **Configuration**: |
| 61 | + - **`netlify.toml`**: Describes build commands and environment variables for staging vs. production. |
| 62 | + - **`.env.production`**: Holds the environment variables for the Vue-based app (e.g. `VITE_API_URL`, `VITE_SENTRY_DSN`); Vite inlines `VITE_`-prefixed values when the app is built. |
| 63 | + |
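When cloning the frontend configuration, it can help to check that `.env.production` defines the variables the app expects before triggering a deploy. The following is a small sketch of such a check. The two required names come from the examples above; the helper names `parse_env` and `missing_keys` and the sample values are hypothetical, and the parser handles only simple `KEY=value` lines.

```python
REQUIRED_KEYS = ("VITE_API_URL", "VITE_SENTRY_DSN")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty in the env text."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    # Placeholder values for a hypothetical clone of the archive.
    sample = (
        "# frontend settings\n"
        "VITE_API_URL=https://api.example.org/api\n"
        "VITE_SENTRY_DSN=\n"
    )
    print(missing_keys(sample))
```

A check like this could run locally or in CI; it only reports missing names and does not validate that the URLs are reachable.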
| 64 | +### 4. Terraform Infrastructure |
| 65 | + |
| 66 | +The single source of truth for spinning up or tearing down resources such as S3 buckets, IAM users, Route 53 DNS, Heroku pipeline config, Netlify domain config, etc. |
| 67 | + |
| 68 | +* **Repo**: The [`dandi-infrastructure`](https://github.com/dandi/dandi-infrastructure) repo. |
| 69 | +* **Terraform Cloud**: Used to plan or apply changes after you push commits to the infrastructure repo. |