Commit dd3c160

architecture diagram overview and in depth
1 parent c6c3a22 commit dd3c160

2 files changed: +149 -0 lines changed


docs/README.md

+7
@@ -0,0 +1,7 @@
# Documentation

In-depth documentation is available in this directory.

## Architecture

See [architecture documentation](https://github.com/CSCfi/HPCS/tree/doc/readme_and_sequence_diagrams/docs/architecture.md).

docs/architecture.md

+142
@@ -0,0 +1,142 @@
# Architecture

## Current architecture - Overview

```mermaid
flowchart LR

subgraph UL["User laptop"]
    HPCSC["HPCS Client"]
    ULSPA["Spire Agent"]
end

subgraph SS["Supercomputing Site"]
    subgraph CN["Compute node"]
        CNSPA["Spire Agent"]
        SBATCH["Sbatch"]
    end
    LN["Login node"]
end

subgraph UN["Utility node"]
    subgraph k8s["Kubernetes cluster"]
        SPS["Spire Server"]
        HPCSS["HPCS Server"]
        SPA["Spire Agent"]
        Vault
    end
end

UL <--"SSH"--> LN
LN <--"Scheduling"--> CN
UL <--"HTTPS (HPCS), HTTPS (Vault), TCP (Spire)"--> UN
CN <--"HTTPS (HPCS), HTTPS (Vault), TCP (Spire)"--> UN
```

## Current architecture - In depth

```mermaid
flowchart LR

subgraph UL["User laptop"]

    subgraph HPCSCDP["Data preparation"]
        HPCSCDPB["HPCS Client"]
        SPADP["Spire Agent"]
    end

    subgraph HPCSCCP["Container preparation"]
        HPCSCCPB["HPCS Client"]
        SPACP["Spire Agent"]
    end
    subgraph HPCSCJP["HPCS Client - Job preparation"]
        HPCSCJPB["HPCS Client"]
    end
end

subgraph SS["Supercomputing Site"]
    SC["Slurm Controller"]
    LN["Login nodes"]
    subgraph PCPU["CPU Partition"]
        subgraph CN1["Compute node 1"]
            CN1SBATCH["Sbatch"]
            CN1SA["Spire Agent"]
        end
        subgraph CN2["Compute node 2"]
            CN2SBATCH["Sbatch"]
            CN2SA["Spire Agent"]
        end
    end
    subgraph PGPU["GPU Partition"]
        subgraph CN3["Compute node 3"]
            CN3SBATCH["Sbatch"]
            CN3SA["Spire Agent"]
        end
        subgraph CN4["Compute node 4"]
            CN4SBATCH["Sbatch"]
            CN4SA["Spire Agent"]
        end
    end
end

subgraph UN["Utility node"]
    subgraph k8s["Kubernetes cluster"]
        subgraph HPCSP["HPCS Pod"]
            SPS["Spire Server"]
            subgraph HPCSSC["HPCS Server Container"]
                HPCSS["HPCS Server"]
                SPA["Spire Agent"]
            end
            SPO["Spire OIDC"]
            NI["Nginx Ingress"]
        end
        Vault
    end
end

SPS <--"UNIX Socket"--> SPO
SPO <--"UNIX Socket"--> NI

HPCSS <--"CLI + UNIX Socket"--> SPS
HPCSS <--"PYSPIFFE (UNIX SOCKET)"--> SPA

SPA <--TCP--> SPS

Vault <--"HTTPS"--> NI
Vault <--"HTTPS (mTLS)"--> HPCSS

LN <--"CLI"--> SC

SC <--"Scheduling"--> PCPU
SC <--"Scheduling"--> PGPU

SPADP <--"TCP"--> SPS
SPACP <--"TCP"--> SPS

HPCSCDPB <--"HTTPS (mTLS)"--> HPCSS
HPCSCCPB <--"HTTPS (mTLS)"--> HPCSS

HPCSCDPB <--"HTTPS"--> Vault
HPCSCCPB <--"HTTPS"--> Vault

HPCSCCPB <--"CLI/Lib + UNIX Socket"--> SPACP
HPCSCDPB <--"CLI/Lib + UNIX Socket"--> SPADP

CN1SA <--"TCP"--> SPS
CN2SA <--"TCP"--> SPS
CN3SA <--"TCP"--> SPS
CN4SA <--"TCP"--> SPS

CN1SBATCH <--"HTTPS"--> Vault
CN2SBATCH <--"HTTPS"--> Vault
CN3SBATCH <--"HTTPS"--> Vault
CN4SBATCH <--"HTTPS"--> Vault

HPCSCDPB <--"SSH (As user - Data & Info files)"--> LN
HPCSCCPB <--"SSH (As user - Container image & Info files)"--> LN

HPCSCJPB --"SSH (As user - SBATCH file & CLI Call to SBATCH)"--> LN
LN --"SSH (As user - Info files)"--> HPCSCJPB
```

This diagram omits the HTTPS requests that clients and compute nodes send to the HPCS Server to register their Spire agents, since this behaviour is a practical workaround.
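
For readers who want to reason about the topology programmatically, the links of the overview diagram can be written down as plain data. The following is only an illustrative sketch and not part of the HPCS code base: the names `OVERVIEW_LINKS`, `build_adjacency` and `peers_of` are hypothetical, and the link list is copied by hand from the overview diagram, with every link treated as bidirectional.

```python
from collections import defaultdict

# Links copied by hand from the overview diagram: (endpoint A, endpoint B, protocols).
# Hypothetical data structure for illustration; HPCS itself does not define it.
OVERVIEW_LINKS = [
    ("User laptop", "Login node", ["SSH"]),
    ("Login node", "Compute node", ["Scheduling"]),
    ("User laptop", "Utility node", ["HTTPS (HPCS)", "HTTPS (Vault)", "TCP (Spire)"]),
    ("Compute node", "Utility node", ["HTTPS (HPCS)", "HTTPS (Vault)", "TCP (Spire)"]),
]


def build_adjacency(links):
    """Return {node: {peer: protocols}}, recording each link in both directions."""
    adjacency = defaultdict(dict)
    for a, b, protocols in links:
        adjacency[a][b] = protocols
        adjacency[b][a] = protocols
    return adjacency


def peers_of(adjacency, node):
    """Return the sorted names of the nodes directly linked to `node`."""
    return sorted(adjacency[node])


adjacency = build_adjacency(OVERVIEW_LINKS)

# List which zones talk directly to the utility node, and over which protocols.
for peer in peers_of(adjacency, "Utility node"):
    print(peer, "->", ", ".join(adjacency["Utility node"][peer]))
```

Under these assumptions, the loop lists only the compute node and the user laptop as direct peers of the utility node. In the overview diagram the login node has no direct link to it and is reached only over SSH and scheduling.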
