Storage tiers
The storage is organized across multiple tiers. The distinguishing characteristics of the tiers are:
- speed (throughput and latency),
- size,
- accessibility (temporal and locational persistency), and
- robustness (redundancy and back-ups).
Usually
- speed is inversely proportional to size, robustness, and accessibility, and
- size, robustness, and accessibility are proportional to each other.
In the future, only low speed storage (i.e. the Isilon NFS mount) will be accessible from all clusters. Isilon will therefore become crucial for maintaining uniform data access across all clusters.
File systems accessible through the HPC Infiniband network
The HPC file systems are meant to store working data, and are not meant for long term storage. The scratch file system stores large temporary input/output files, the home directory is meant for working storage, and the node-local file systems accessible through /tmp (local persistent storage) and /dev/shm (virtual memory) are fast, available inside jobs, and wiped when the job finishes. Finally, the project directories are meant to store finalized input and output files.
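As an illustration of how these tiers are typically combined, a batch job can keep its temporary files on the fast node-local storage and copy only the results it wants to keep back to persistent storage before it ends. A minimal sketch, assuming a Slurm batch job and a scratch directory exposed through a $SCRATCH variable (adapt the names and paths to the actual cluster layout):
#!/bin/bash -l
#SBATCH --job-name=local-storage-demo
#SBATCH --time=00:10:00
# Stage temporary files in RAM-backed /dev/shm; it is wiped when the job finishes
WORKDIR=/dev/shm/${SLURM_JOB_ID}
mkdir -p "${WORKDIR}"
# ... run the computation with its temporary files inside ${WORKDIR} ...
# Assumption: $SCRATCH points to this user's scratch directory on the cluster.
# Copy only the finalized output back to persistent storage before the job ends.
cp -r "${WORKDIR}/results" "${SCRATCH}/results-${SLURM_JOB_ID}"
rm -rf "${WORKDIR}"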
However, there are file systems that are accessible through slower network connections and offer different kinds of features.
File systems not accessible through Infiniband
The central university storage is slower, but it is snapshotted and backed up much more regularly. Users should therefore transfer their data to the central systems for long term storage.
There are multiple options for accessing the central university storage: the systems Atlas, Ebenezer, Isilon-DMZi, and Isilon-DMZe.
- What is the difference between Atlas, Ebenezer, and Isilon?
- What is the difference between Isilon-DMZi and Isilon-DMZe?
- How are user quotas managed in the central storage systems, and how can users see their usage and limits?
The Isilon file system
Isilon is actually the name of the technical solution: https://www.dell.com/fr-fr/dt/storage/isilon/isilon-h5600-hybrid-nas-storage.htm#scroll=off
To Hyacithe's knowledge, there are two central storage servers operated by the SIU: "isilon-prod" and "isilon-drs" (an off-site replica of "isilon-prod", used in case of a disaster on "isilon-prod").
The isilon-prod is split into (at least) two zones:
- the SIU zone, which is accessed using SMB via atlas.uni.lux, and
- the HPC zone, which is mounted in the clusters with NFS and can be accessed under /mnt/isilon.
On the HPC side, we are only interested in the NFS mounted file system. Documentation about Isilon: https://hpc-git.uni.lu/ulhpc/sysadmins/-/wikis/storage/isilon
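For reference, a quick way to check from a cluster node that the HPC zone is actually mounted and how full it is (standard tools; the mount point /mnt/isilon is taken from above):
# Show the NFS mount backing the HPC zone
findmnt /mnt/isilon
# Report used and available capacity as seen through the NFS mount
df -h /mnt/isilon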
- The processes for the HPC zone are not well defined or documented. We can set up quotas per project directory, but there is currently no way to show this information to the users. We are working on providing users with access to this information and on setting up a policy for assigning quotas.
- We share the Isilon system with the SIU. There is a "fair use agreement" in place which allocated 2PB to the HPC zone, currently at 88% of that capacity. Maintaining access to the Isilon system is important moving forward, as it will be the only file system unifying data access across our future clusters. We should participate in any future calls and coordinate with the SIU.
- In terms of performance, the Isilon is abysmal with small random I/O, for instance many small files, metadata-heavy operations, etc. The Isilon NFS mount works well for administrative needs, like archiving and occasional data transfers, and even for big-file I/O. But do not try to perform any compute-driven operation on the NFS mounted Isilon, such as compiling software on it, or anything similar.
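For example, when archiving a directory tree that contains many small files to the Isilon mount, it helps to bundle it into a single large archive on fast local storage first, so the NFS mount only sees one big sequential transfer. A sketch with placeholder paths:
# Bundle the many small files into one compressed archive on fast local storage
tar -czf /tmp/<project name>-archive.tar.gz -C /work/projects/<project name> .
# Transfer the single large file to the Isilon NFS mount (destination path is a placeholder)
cp /tmp/<project name>-archive.tar.gz /mnt/isilon/<destination directory>/
rm /tmp/<project name>-archive.tar.gz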
The Atlas file system
The SMB protocol allows for easy mounting of file systems on personal computers, including Windows machines.
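As an illustration, on a personal Linux machine a share exported by Atlas can be mounted with the standard CIFS tools; the share name, account, and domain below are placeholders for whatever you were actually given:
# On Linux (requires the cifs-utils package)
sudo mkdir -p /mnt/atlas
sudo mount -t cifs //atlas.uni.lux/<share name> /mnt/atlas -o username=<AD username>,domain=<AD domain>
# On Windows, the equivalent is mapping a network drive:
# net use Z: \\atlas.uni.lux\<share name> /user:<AD domain>\<AD username>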
The HPC team does not manage the file system exported through SMB from Atlas (atlas.uni.lux). However, the HPC team maintains the smb-storage script (under active development), which allows mounting SMB shares on the login nodes of our clusters.
Fun fact: you can access the HPC zone via Samba on your workstation using your Active Directory credentials. This works via a fragile script that maps Windows/POSIX permissions and user accounts from the HPC-IPA to the SIU Active Directory. It was requested by the LCSB Bio-core in 2014. The system still works, but it is no longer supported. Honestly, if you are using Linux you can match the performance of SMB with SSHFS: https://blog.ja-ke.tech/2019/08/27/nas-performance-sshfs-nfs-smb.html
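For the SSHFS route, a mount from a Linux workstation could look roughly like this; the login node name and remote path are assumptions, so substitute the cluster access host and path you actually use:
# Mount the Isilon path exported on the cluster onto a local directory via SSHFS
# (assumes the sshfs package is installed and working SSH access to a login node)
mkdir -p ~/isilon
sshfs <username>@<cluster login node>:/mnt/isilon ~/isilon -o reconnect,idmap=user
# Unmount when done
fusermount -u ~/isilon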
Add some instructions on how to fix errors in access permissions
The discussion of data management is a bit disorganized. We should probably reorganize the sections and add some information on how users can fix their projects when errors occur.
To fix access permissions in a project directory,
- change the group ownership to the project group:
chown -R :<project name> /work/projects/<project name>
- and then set the group access rights (read, execute, and the setgid bit) on all directories:
find /work/projects/<project name> -type d -exec chmod g=rxs {} +
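A quick way to verify that the fix took effect (standard GNU coreutils; exact output format may vary):
# The group should now be <project name> and the group triplet on directories should read "r-s"
# (read and execute set, plus the setgid bit)
ls -ld /work/projects/<project name>
stat -c '%A %U %G %n' /work/projects/<project name>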
Also, add a link to more resources: https://www.redhat.com/sysadmin/suid-sgid-sticky-bit