@dlq dlq commented Jan 7, 2026

Motivation

  • Improve documentation by adding a high-level system architecture diagram to make component roles and interactions clearer.
  • Provide a visual overview that shows how BrainPortal, NeuroHub, Bourreau, data providers, and HPC resources relate.
  • Help new contributors and operators understand the overall deployment and data/job flow.

Description

  • Added a new ## System architecture section to README.md, with a Mermaid flowchart that models users, NeuroHub, BrainPortal, the shared database, data providers, Bourreau, the HPC scheduler, compute nodes, and shared scratch.
  • Included an explanatory paragraph describing how BrainPortal orchestrates data access and delegates execution to Bourreau, and how Bourreau interacts with HPC schedulers and shared storage.
  • The change was applied to README.md and committed with the message "Add system architecture diagram to README".
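
For readers without the diff at hand, the following is a minimal sketch of the kind of flowchart described above. It is not the diagram from the PR: the node IDs, edge directions, and edge labels are assumptions inferred from the component list and the explanatory paragraph, and transport protocols are deliberately left out.

```mermaid
%% Hypothetical sketch only; node IDs, edges, and labels are assumptions,
%% not the diagram committed in this PR.
flowchart TD
    Users([Users])
    NH[NeuroHub]
    BP[BrainPortal]
    DB[(Shared database)]
    DP[Data providers]
    B[Bourreau]
    S[HPC scheduler]
    CN[Compute nodes]
    SC[(Shared scratch)]

    Users --> NH
    Users --> BP
    NH --> DB
    BP --> DB
    BP -->|orchestrates data access| DP
    BP -->|delegates execution| B
    B -->|stages data| DP
    B -->|submits jobs| S
    S --> CN
    CN --> SC
    B --> SC
```

Keeping the graph flat, with one node per component and no nested subgraphs, makes it easier to revise individual nodes and edges during review.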

Testing

  • No automated tests were run for this documentation-only change.
  • Repository CI (badge present in README) will exercise automated checks on the PR when submitted.

Codex Task

@MontrealSergiy
Contributor

MontrealSergiy commented Jan 8, 2026

I think it is a great addition, but the medium (Mermaid) is not advanced enough for such complex software.

A safer approach, I think, is to copy the structural diagram(s) verbatim from Pierre's presentations, conference posters, or other publications.

As was noted in Slack, the accuracy is questionable.

  • I think datasets are accessed via Data Providers.

  • The NeuroHub and CBRAIN UIs are almost parallel: both the CBRAIN Portal and the NeuroHub Portal are Rails apps, but they share the same database and resources.

@MontrealSergiy
Contributor

The wording needs a check, e.g. in:

"Bourreau can also fetch and stage data from providers as part of backend task execution."

The word "can" is redundant, if not misleading.

Contributor

@MontrealSergiy MontrealSergiy left a comment

OK with me.

Contributor

@MontrealSergiy MontrealSergiy left a comment

Normally, Bourreaux do not use SSH to access Slurm; please correct.

@MontrealSergiy
Contributor

MontrealSergiy commented Jan 8, 2026

Please assign yourself and add the Documentation label to the PR.

@MontrealSergiy
Contributor

Just FYI, there can also be local Data Providers and Bourreaux (do not add them to the diagram; it is already big).

I would have External Compute and External Storage as two separate resources (flat is better than nested).

@MontrealSergiy
Contributor

MontrealSergiy commented Jan 8, 2026

There is no HTTP Data Provider!
Replace it with SSH, etc.

Also, instead of two identical Data Providers, maybe have one with SSH and another with S3?
Perhaps do the same with the Bourreaux: the first one is Slurm, the second is Cloud Batch or 'any other HPC scheduler: ...'

@MontrealSergiy
Contributor

MontrealSergiy commented Jan 12, 2026

Btw, FTP is not secure; it is nothing to brag about even if it is supported. It may be best to mention DataLad, SquashFS, or at least SFTP.

@prioux
Member

prioux commented Jan 12, 2026

@MontrealSergiy We don't use FTP anywhere. We use SFTP.

@MontrealSergiy
Contributor

Therefore, FTP should not appear on the diagram.
