Remote Filesystems Overview
This document gives a brief overview of how the central file system storage capacity available at the institute is currently organized, and it briefly explains the use cases for each type of file system.

Persistent on-line storage at the institute is organized by means of UNIX-type file systems on central servers (file servers). These file systems can be remotely accessed (mounted) by means of network file system (NFS) software, which is available for Microsoft Windows, Linux, and Mac OS X operating systems.

In order to keep storage organization as simple as possible, file systems are classified as HOME, PROJECT, CLUSTER PROJECT, and CLUSTER SCRATCH file systems. HOME and PROJECT file systems are based on a single file server (hostname: fs01), while CLUSTER PROJECT and CLUSTER SCRATCH file systems are based on the cluster (hostname: cluster01). Each file system class is designed for a specific purpose or usage pattern.
A central HOME file system is provided automatically, together with the system user account, for all employees and official guests of the institute. The purpose of HOME file systems is to store important personal user data, such as documents, programs, presentations, personal keys to access the clusters, and the like. HOME file systems are not designed to store huge scientific filesets or the raw output data of parallel simulation experiments.
- The capacity of HOME file systems is currently limited by a 50 GByte quota per employee and a 15 GByte quota per guest or external collaborator. These quotas can be raised to about 100 GByte per user upon request if required.
- HOME file systems are hosted on a single server computer (fs01).
- HOME file systems are automatically backed up every night, and up to three versions of each changed file are kept. The maximum retention period for the latest version of a file is 30 days; in other words, a file which has been deleted from a file system can still be restored up to 29 days after its deletion, but not later.
- HOME file systems are mounted automatically on PIK application servers (aix, cluster) under the /home/[user account name] directory, and they can be mounted manually onto every personal computer (see the example below).
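A minimal sketch of such a manual mount on a personal Linux computer, assuming the HOME file systems are exported from fs01 under /home (the export path, the mount point, and the user name "jdoe" are assumptions for illustration; ask the administrators for the exact values):

  # Create a local mount point and mount the HOME file system from fs01 via NFS.
  # "jdoe" is a hypothetical account name; replace it with your own.
  $ sudo mkdir -p /mnt/pik-home
  $ sudo mount -t nfs fs01:/home/jdoe /mnt/pik-home

  # When finished, unmount again:
  $ sudo umount /mnt/pik-home

The same mount command works on Mac OS X; on Microsoft Windows, NFS client software has to be installed or enabled first.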
Every official PIK or third-party research project may request the configuration of a project file system. The request should include the initial amount of capacity required, the name of the project group to be granted access to the file system, and the members and leaders of that group. There is no formal process - requests can be sent by e-mail to Karsten Kramer at any time. It is, however, important to note that PROJECT file systems come in two flavors: STANDARD PROJECT file systems and CLUSTER PROJECT file systems. The choice should be made in view of the required capacity and the predominant usage pattern of the project.
The purpose of STANDARD PROJECT file systems is to store shared documents and project data, including important filesets which have already been post-processed from raw data. PROJECT file systems may store the raw output data of parallel simulation experiments, but they are not designed to cope with huge amounts of temporary data.
- STANDARD PROJECT file systems are configured upon request up to a maximum capacity of 1 TByte of data per project.
- STANDARD PROJECT file systems are hosted on a single server computer (fs01).
- STANDARD PROJECT file systems are automatically backed up every night, but only one copy of each file is kept - that is, no previous versions of a file can be restored, just the latest one. The maximum retention period of a file is 30 days; in other words, a file which has been deleted from a file system can still be restored up to 29 days after its deletion, but not later.
- STANDARD PROJECT file systems are mounted automatically on PIK application servers (aix, cluster) under the /data/[project name] directory, and they can be mounted manually onto every personal computer (see the example below).
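Once mounted, the remaining capacity of a PROJECT file system can be checked with standard tools; a minimal sketch, where "myproject" and the "results" subdirectory are hypothetical names:

  # Show size, used, and available space of the project file system
  # in human-readable units ("myproject" is a placeholder name).
  $ df -h /data/myproject

  # Show how much space a particular subdirectory occupies
  # (the subdirectory is an example).
  $ du -sh /data/myproject/results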
The purpose of CLUSTER PROJECT file systems is to store huge amounts of data. Because these file systems are hosted directly on the cluster, they are very well suited to storing the raw output data of parallel simulation experiments.
- CLUSTER PROJECT file systems have a default capacity of 20/25 TByte (soft/hard quota; the soft limit may be exceeded for a grace period, the hard limit never), which can be increased upon request if required. Limits are enforced by means of group quotas (use mmlsquota on the cluster login nodes in order to list the remaining capacity; see the example after this list).
- CLUSTER PROJECT file systems are hosted directly on the cluster computer as part of a single parallel file system, which is exported over the network from one cluster node (cluster01). Currently this has the following implications:
- On the cluster, CLUSTER PROJECT file systems are mounted automatically under the /iplex/01/[year]/[project name] directory.
- On PIK application servers (aix, cluster) they are mounted automatically under the /data/iplex01/[project name] directory, but not under the /data/[year]/[project name] directory. This is a technical nuisance we are trying to eliminate in the future, but it still needs to be considered today.
- CLUSTER PROJECT file systems are automatically backed up every day, but only one copy of each file is kept - that is, no previous versions of a file can be restored, just the latest one. The maximum retention period of a file is 30 days; in other words, a file which has been deleted from a file system can still be restored up to 29 days after its deletion, but not later.
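As mentioned above, the remaining capacity of a CLUSTER PROJECT file system can be listed with mmlsquota on a cluster login node. A minimal sketch, assuming the project group is called "myproject" and the parallel (GPFS) file system device is named "iplex01" - both names are assumptions for illustration:

  # List usage, soft limit, and hard limit of the group quota for the
  # project group on the parallel file system (names are placeholders).
  $ mmlsquota -g myproject iplex01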
As the name suggests, a scratch file system is a temporary file system designed to store huge amounts of raw output data from simulation experiments over a limited period of time (say, 30 days) for post-processing or archiving.
- Every cluster user has a default 8/10 TByte (soft/hard) quota on the file system.
- The CLUSTER SCRATCH file system is hosted directly on the cluster computer as part of a single parallel file system, which is exported over the network from one cluster node (cluster01). Currently this has the following implications:
- On the cluster, the CLUSTER SCRATCH file system is mounted automatically under the /scratch/01/ directory.
- On PIK application servers (aix, cluster) it is mounted automatically under the /data/scratch01/ directory.
- Attention: no backup is available for data stored in the scratch filesystem!
- Attention: all files and directories in a scratch file system which have not been accessed for a specific amount of time - currently 3 months - will be deleted automatically! (See the example below for a way to spot files at risk.)
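To spot files that are approaching the automatic-deletion limit, and to check the remaining personal quota, something like the following can be used on a cluster login node. This is a sketch: it assumes per-user directories under the /scratch/01/ mount point named above, uses 90 days as an approximation of the 3-month limit, and "iplex01" is an assumed device name:

  # List files under your scratch directory whose last access time is more
  # than ~90 days ago - these are candidates for automatic deletion.
  $ find /scratch/01/$USER -type f -atime +90

  # Show your personal usage against the 8/10 TByte soft/hard quota
  # ("iplex01" is a placeholder device name).
  $ mmlsquota -u $USER iplex01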
