Storage docs #121

Merged (27 commits) on Jun 2, 2025
Commits
c68ae7e
add notes from call with pasmarco
bcumming Apr 23, 2025
ea85084
Merge branch 'main' into storage-refactor
bcumming Apr 25, 2025
5f97cd3
drafting the storage docs
bcumming Apr 28, 2025
07b1199
wip
bcumming May 22, 2025
e6dcf0d
Merge branch 'main' into storage-refactor
bcumming May 22, 2025
0792aa6
Merge branch 'main' into storage-refactor
bcumming May 23, 2025
a3445eb
wip
bcumming May 23, 2025
212967e
Merge branch 'main' into storage-refactor
bcumming May 23, 2025
d4790bc
sweep and mark remaining todo and under-construction sections in stor…
bcumming May 26, 2025
62e4681
wip
bcumming May 26, 2025
994bf38
spell check; add placeholders for FAQ docs
bcumming May 26, 2025
eab4402
Merge branch 'main' into storage-refactor
bcumming May 26, 2025
1680b23
fix broken link
bcumming May 26, 2025
2451b71
add marco p to codeowners
bcumming May 26, 2025
944a640
Merge branch 'main' into storage-refactor
bcumming May 26, 2025
f83fe00
document store layout
bcumming May 26, 2025
bb53f77
Merge branch 'storage-refactor' of github.com:bcumming/cscs-docs into…
bcumming May 26, 2025
0c60849
Update docs/alps/storage.md
twrobinson May 27, 2025
02a8c1c
Update docs/alps/storage.md
bcumming May 27, 2025
405f8c6
Update docs/alps/storage.md
bcumming May 27, 2025
8e8eff0
Update docs/storage/filesystems.md
bcumming May 27, 2025
bf02e58
@msimber review suggestions
bcumming May 27, 2025
8cebb2b
@RMeli's review
bcumming May 28, 2025
dc9ad3d
@afink review comments
bcumming May 28, 2025
0cd0941
wip
bcumming May 28, 2025
2e56dca
warn against touching files to avoid clean up
bcumming May 28, 2025
c25a4b2
merge main
bcumming May 28, 2025
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -7,3 +7,5 @@ docs/software/prgenv/linalg.md @finkandreas @msimberg
docs/software/sciapps/cp2k.md @abussy @RMeli
docs/software/sciapps/gromacs.md @kanduri
docs/software/ml @boeschf
docs/storage @mpasserini
docs/alps/storage.md @mpasserini
47 changes: 36 additions & 11 deletions docs/alps/storage.md
@@ -1,13 +1,15 @@
[](){#ref-alps-storage}
# Alps Storage

!!! under-construction

Alps has different storage attached, each with characteristics suited to different workloads and use cases.
HPC storage is managed in a separate cluster of nodes that host servers that manage the storage and the physical storage drives.
These separate clusters are on the same Slingshot 11 network as the Alps.
These separate storage clusters are on the same Slingshot 11 network as Alps.

| | Capstor | Iopsstor | Vast |
| | Capstor | Iopsstor | VAST |
|--------------|------------------------|------------------------|---------------------|
| Model | HPE ClusterStor E1000D | HPE ClusterStor E1000F | Vast |
| Model | HPE ClusterStor E1000D | HPE ClusterStor E1000F | VAST |
| Type | Lustre | Lustre | NFS |
| Capacity | 129 PB raw GridRAID | 7.2 PB raw RAID 10 | 1 PB |
| Number of Drives | 8,480 16 TB HDD | 240 * 30 TB NVMe SSD | N/A |
@@ -16,25 +18,48 @@ These separate clusters are on the same Slingshot 11 network as the Alps.
| IOPs | 1.5M | 8.6M read, 24M write | 200k read, 768k write |
| file create/s| 374k | 214k | 97k |


!!! todo
Information about Lustre. Meta data servers, etc.

* how many meta data servers on Capstor and Iopsstor
* how these are distributed between store/scratch

Also discuss how Capstor and iopstor are used to provide both scratch / store / other file systems

The mounts, and how they are used for Scratch, Store, and Home file systems that are mounted on clusters are documented in the [file system docs][ref-storage-fs].

[](){#ref-alps-capstor}
## capstor
## Capstor

Capstor is the largest file system, for storing large amounts of input and output data.
It is used to provide SCRATCH and STORE for different clusters - the precise details are platform-specific.
It is used to provide [scratch][ref-storage-scratch] and [store][ref-storage-store].

!!! todo "add information about meta data services, and their distribution over scratch and store"

[](){#ref-alps-capstor-scratch}
### Scratch

All users on Alps get their own scratch path on Alps, `/capstor/scratch/cscs/$USER`.

[](){#ref-alps-capstor-store}
### Store

The [Store][ref-storage-store] mount point on Capstor provides stable storage with [backups][ref-storage-backups] and no [cleaning policy][ref-storage-cleanup].
It is mounted on clusters at the `/capstor/store` mount point, with folders created for each project.

[](){#ref-alps-iopsstor}
## iopsstor
## Iopsstor

!!! todo
small text explaining what iopsstor is designed to be used for.
small text explaining what Iopsstor is designed to be used for.

[](){#ref-alps-vast}
## vast
## VAST

The Vast storage is smaller capacity system that is designed for use as home folders.
The VAST storage is smaller capacity system that is designed for use as [Home][ref-storage-home] folders.

!!! todo
small text explaining what iopsstor is designed to be used for.
small text explaining what Iopsstor is designed to be used for.

The mounts, and how they are used for SCRATCH, STORE, PROJECT, HOME would be in the [storage docs][ref-storage-fs]
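
As a rough sketch of the layout documented above, a user can inspect their personal scratch directory and the project folders under the store mount point from a login node. The exact per-project layout below `/capstor/store` is platform-specific, so the listing below is illustrative only.

```bash
# Per-user scratch on Capstor, as documented above.
ls -ld "/capstor/scratch/cscs/$USER"

# Store mount point on Capstor, with one folder per project.
# The per-project directory layout is platform-specific; adjust as needed.
ls /capstor/store
```

Data under scratch is subject to the cleanup policy referenced above, while store provides stable, backed-up storage.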

2 changes: 1 addition & 1 deletion docs/services/jupyterlab.md
@@ -9,7 +9,7 @@ The service is accessed at [jupyter-daint.cscs.ch](https://jupyter-daint.cscs.c

Once logged in, you will be redirected to the JupyterHub Spawner Options form, where typical job configuration options can be selected in order to allocate resources. These options might include the type and number of compute nodes, the wall time limit, and your project account.

Single-node notebooks are launched in a dedicated queue, minimizing queueing time. For these notebooks, servers should be up and running within a few minutes. The maximum waiting time for a server to be running is 5 minutes, after which the job will be cancelled and you will be redirected back to the spawner options page. If your single-node server is not spawned within 5 minutes we encourage you to [contact us](ref-get-in-touch).
Single-node notebooks are launched in a dedicated queue, minimizing queueing time. For these notebooks, servers should be up and running within a few minutes. The maximum waiting time for a server to be running is 5 minutes, after which the job will be cancelled and you will be redirected back to the spawner options page. If your single-node server is not spawned within 5 minutes we encourage you to [contact us][ref-get-in-touch].

When resources are granted the page redirects to the JupyterLab session, where you can browse, open and execute notebooks on the compute nodes. A new notebook with a Python 3 kernel can be created with the menu `new` and then `Python 3` . Under `new` it is also possible to create new text files and folders, as well as to open a terminal session on the allocated compute node.
