# Architecture

## Current architecture - Overview

```mermaid
flowchart LR

subgraph UL["User laptop"]
    HPCSC["HPCS Client"]
    ULSPA["Spire Agent"]
end

subgraph SS["Supercomputing Site"]
    subgraph CN["Compute node"]
        CNSPA["Spire Agent"]
        SBATCH["Sbatch"]
    end
    LN["Login node"]
end

subgraph UN["Utility node"]
    subgraph k8s["Kubernetes cluster"]
        SPS["Spire Server"]
        HPCSS["HPCS Server"]
        SPA["Spire Agent"]
        Vault
    end
end

UL <--"SSH"--> LN
LN <--"Scheduling"--> CN
UL <--"HTTPS (HPCS), HTTPS (Vault), TCP (Spire)"--> UN
CN <--"HTTPS (HPCS), HTTPS (Vault), TCP (Spire)"--> UN
```
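
The links in the overview diagram can also be written down as a small lookup table, which makes it easy to check which protocols a given zone needs to reach another one (for example when reviewing firewall rules). The following is only an illustrative sketch: the names `OVERVIEW_LINKS`, `protocols_between` and `peers_of` are hypothetical and not part of HPCS; the zone names and protocol labels are copied from the diagram above.

```python
# Hypothetical sketch: the overview diagram's links as a lookup table.
# Zone names and protocol labels are taken verbatim from the diagram;
# the structure and helper names are assumptions made for illustration.
OVERVIEW_LINKS = {
    ("User laptop", "Login node"): ["SSH"],
    ("Login node", "Compute node"): ["Scheduling"],
    ("User laptop", "Utility node"): ["HTTPS (HPCS)", "HTTPS (Vault)", "TCP (Spire)"],
    ("Compute node", "Utility node"): ["HTTPS (HPCS)", "HTTPS (Vault)", "TCP (Spire)"],
}


def protocols_between(a, b):
    """Return the protocols on the link between two zones, in either direction."""
    return OVERVIEW_LINKS.get((a, b)) or OVERVIEW_LINKS.get((b, a)) or []


def peers_of(zone):
    """Return the sorted list of zones directly linked to the given zone."""
    peers = set()
    for a, b in OVERVIEW_LINKS:
        if a == zone:
            peers.add(b)
        elif b == zone:
            peers.add(a)
    return sorted(peers)


if __name__ == "__main__":
    print(protocols_between("Compute node", "Utility node"))
    print(peers_of("Utility node"))
```

Links are treated as bidirectional here because every arrow in the overview diagram is drawn in both directions.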

## Current architecture - In depth

```mermaid
flowchart LR

subgraph UL["User laptop"]

    subgraph HPCSCDP["Data preparation"]
        HPCSCDPB["HPCS Client"]
        SPADP["Spire Agent"]
    end

    subgraph HPCSCCP["Container preparation"]
        HPCSCCPB["HPCS Client"]
        SPACP["Spire Agent"]
    end
    subgraph HPCSCJP["HPCS Client - Job preparation"]
        HPCSCJPB["HPCS Client"]
    end
end

subgraph SS["Supercomputing Site"]
    SC["Slurm Controller"]
    LN["Login nodes"]
    subgraph PCPU["CPU Partition"]
        subgraph CN1["Compute node 1"]
            CN1SBATCH["Sbatch"]
            CN1SA["Spire Agent"]
        end
        subgraph CN2["Compute node 2"]
            CN2SBATCH["Sbatch"]
            CN2SA["Spire Agent"]
        end
    end
    subgraph PGPU["GPU Partition"]
        subgraph CN3["Compute node 3"]
            CN3SBATCH["Sbatch"]
            CN3SA["Spire Agent"]
        end
        subgraph CN4["Compute node 4"]
            CN4SBATCH["Sbatch"]
            CN4SA["Spire Agent"]
        end
    end
end

subgraph UN["Utility node"]
    subgraph k8s["Kubernetes cluster"]
        subgraph HPCSP["HPCS Pod"]
            SPS["Spire Server"]
            subgraph HPCSSC["HPCS Server Container"]
                HPCSS["HPCS Server"]
                SPA["Spire Agent"]
            end
            SPO["Spire OIDC"]
            NI["Nginx Ingress"]
        end
        Vault
    end
end

SPS <--"UNIX Socket"--> SPO
SPO <--"UNIX Socket"--> NI

HPCSS <--"CLI + UNIX Socket"--> SPS
HPCSS <--"PYSPIFFE (UNIX SOCKET)"--> SPA

SPA <--TCP--> SPS

Vault <--"HTTPS"--> NI
Vault <--"HTTPS (mTLS)"--> HPCSS

LN <--"CLI"--> SC

SC <--"Scheduling"--> PCPU
SC <--"Scheduling"--> PGPU

SPADP <--"TCP"--> SPS
SPACP <--"TCP"--> SPS

HPCSCDPB <--"HTTPS (mTLS)"--> HPCSS
HPCSCCPB <--"HTTPS (mTLS)"--> HPCSS

HPCSCDPB <--"HTTPS"--> Vault
HPCSCCPB <--"HTTPS"--> Vault

HPCSCCPB <--"CLI/Lib + UNIX Socket"--> SPACP
HPCSCDPB <--"CLI/Lib + UNIX Socket"--> SPADP

CN1SA <--"TCP"--> SPS
CN2SA <--"TCP"--> SPS
CN3SA <--"TCP"--> SPS
CN4SA <--"TCP"--> SPS

CN1SBATCH <--"HTTPS"--> Vault
CN2SBATCH <--"HTTPS"--> Vault
CN3SBATCH <--"HTTPS"--> Vault
CN4SBATCH <--"HTTPS"--> Vault

HPCSCDPB <--"SSH (As user - Data & Info files)"--> LN
HPCSCCPB <--"SSH (As user - Container image & Info files)"--> LN

HPCSCJPB --"SSH (As user - SBATCH file & CLI Call to SBATCH)"--> LN
LN --"SSH (As user - Info files)"--> HPCSCJPB
```

This diagram omits the HTTPS requests that the clients and compute nodes send to the HPCS Server to register their Spire Agents, since this behaviour is a practical workaround.