Disk reservation
Revision as of 15:06, 10 May 2017
Disk reservation is a technique to reserve nodes' hard disks, in order to locally store large datasets between your reservations, and avoid needing to move data to nodes at the beginning of each node reservation.
It takes advantage of nodes that have several disks.
Note: During the beta testing phase, disk reservation will only be available on the grimoire cluster. The service will be extended to other clusters later.
The table below shows the cluster compatibility with disk reservation.
Site | Cluster | Number of nodes | Number of reservable disks per node |
---|---|---|---|
Nancy | grimoire | 8 | 5 |
Warning: Disk reservation is a service for experimentation: data integrity is not guaranteed. You MUST back up your important data.
How it works
Disk reservation works with jobs of type deploy and jobs of type noop. You can reserve some disks for a long period with a job of type noop. Then you can submit a job of type deploy on the same nodes as those where you reserved your disks.
When the job of type deploy starts, the disks you reserved are enabled by the RAID card of the node, and the other disks are disabled. Reserved disks can only be accessed by the user who reserved them (see also Security issues).
Usage
The main commands to reserve disks are given below.
The maximum duration of a disk reservation is defined in the Usage Policy.
Reserve disks and nodes at the same time
- Reserve a node with only the main disk (none of the additional disks), on the grimoire cluster
(This is unchanged from the way nodes were reserved before the disk reservation mechanism existed.)
- Reserve a node with all its disks
fnancy :
|
oarsub -I -t deploy -l {"type='disk' or type='default' and disk_reservation_count > 0"}/host=1 |
(disk_reservation_count is equal to the number of disks which can be reserved on a node.)
- Reserve nodes grimoire-1 and grimoire-2 with one reservable disk per node
Or, equivalently, using environment variables:
fnancy :
|
oarsub -I -t deploy -l {"type='default' and host in ($HOSTNAMES)"}/host=$HOSTCOUNT+{"type='disk' and host in ($HOSTNAMES)"}/host=$HOSTCOUNT/disk=1 |
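The command above expects HOSTNAMES to hold a comma-separated list of single-quoted host names and HOSTCOUNT the number of hosts, but the page does not say how they are set. A minimal sketch of one possible way to build them in POSIX shell follows. The host list and the RESOURCES variable are assumptions chosen for illustration, and the final command is only printed, not submitted:

```shell
# Hypothetical host list (assumption: these are the nodes you want).
set -- grimoire-1.nancy.grid5000.fr grimoire-2.nancy.grid5000.fr

# Number of hosts in the list.
HOSTCOUNT=$#

# Wrap each host name in single quotes and join with commas, as expected
# inside the "host in (...)" filter; then drop the trailing comma.
HOSTNAMES=$(printf "'%s'," "$@")
HOSTNAMES=${HOSTNAMES%,}

# Resource request as OAR receives it. Quoting the whole argument is
# equivalent, for the shell, to the partial quoting used on this page.
RESOURCES="{type='default' and host in ($HOSTNAMES)}/host=$HOSTCOUNT+{type='disk' and host in ($HOSTNAMES)}/host=$HOSTCOUNT/disk=1"

# Print the command so it can be checked before being run by hand.
echo "oarsub -I -t deploy -l \"$RESOURCES\""
```

Printing the command first makes it easy to check the quoting of the host list before submitting the job.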
Reserve disks and nodes separately
You may, for example, decide to reserve some disks for one week, but the nodes where your disks are located only when you want to carry out an experiment.
First: reserve the disks
Since we first want to reserve only the disks, we use the noop job type: with this job type, OAR will not try to execute anything on the job resources (which is what we want, since disk resources are not capable of executing programs).
(Note that jobs of type noop cannot be interactive: oarsub -I -t noop ... is not supported.)
Three examples:
Reserve two disks on grimoire-1 for one week, starting on 2018-01-01:
fnancy :
|
oarsub -r "2018-01-01 00:00:00" -t noop -l {"type='disk' and host='grimoire-1.nancy.grid5000.fr'"}/host=1/disk=2,walltime=168 |
Or reserve the first two disks on grimoire-2:
fnancy :
|
oarsub -r "2018-01-01 00:00:00" -t noop -l {"type='disk' and host='grimoire-2.nancy.grid5000.fr' and disk in ('1', '2')"}/host=1/disk=2,walltime=168 |
Or reserve all disks on two nodes:
fnancy :
|
oarsub -r "2018-01-01 00:00:00" -t noop -l {"type='disk' and cluster='grimoire'"}/host=2/disk=ALL,walltime=168 |
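In the three examples, walltime=168 is one week expressed in hours (7 × 24). As a small sketch, the value can be computed from a number of days. DAYS, HOST, DISKS, START and RESOURCES are names assumed here for illustration (the values are those of the first example), and the command is printed rather than submitted:

```shell
# Assumed inputs (illustrative values taken from the first example).
DAYS=7
HOST=grimoire-1.nancy.grid5000.fr
DISKS=2
START="2018-01-01 00:00:00"

# The examples give walltime in hours: 7 days * 24 hours = 168.
WALLTIME=$((DAYS * 24))

# Resource request for a disk-only (noop) reservation on one host.
RESOURCES="{type='disk' and host='$HOST'}/host=1/disk=$DISKS,walltime=$WALLTIME"

# Print the command so it can be checked before being run by hand.
echo "oarsub -r \"$START\" -t noop -l \"$RESOURCES\""
```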
Second: reserve the nodes
You can then reserve nodes grimoire-1 and grimoire-2 for 3 hours, in the usual way:
fnancy :
|
oarsub -I -t deploy -l {"host in ('grimoire-1.nancy.grid5000.fr', 'grimoire-2.nancy.grid5000.fr')"}/host=2,walltime=3 |
You must respect this order: reserve the disks first, then reserve the nodes. Otherwise, the disks you reserved will not be available on your nodes.
Show and use my reserved disks
Reservations of both nodes (processors) and disks are displayed on the following Gantt diagram.
Once connected to a node in a deploy job with one or more disks reserved locally on the node, several tools can be used to manage the disk(s):
- lsblk will show you the block devices of your disks: sdb, sdc, ... (be careful: sda is the system disk);
- commands like fdisk or parted can be used to partition the disk, if needed;
- mkfs can be used to format the disk.
Note that the platform provides access to the block devices, but manages neither partitioning nor formatting.
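As an illustration of the tools listed above, the following sketch prepares one reserved disk from within a deploy job. It assumes that the reserved disk appears as /dev/sdb (check with lsblk first), that a single ext4 partition mounted on /mnt/data is wanted, and that you are root on the deployed node. None of this is prescribed by the platform, and the names DISK, MOUNTPOINT, DRY_RUN, run and prepare_disk are hypothetical. By default it only prints the commands:

```shell
# Assumptions: DISK and MOUNTPOINT are illustrative; verify DISK with lsblk.
# Never use /dev/sda here: it is the system disk.
DISK=${DISK:-/dev/sdb}
MOUNTPOINT=${MOUNTPOINT:-/mnt/data}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_disk() {
    # List the block devices.
    run lsblk
    # Create a new, empty GPT partition table (destroys existing partitions).
    run parted -s "$DISK" mklabel gpt
    # Create one partition spanning the whole disk.
    run parted -s "$DISK" mkpart primary ext4 0% 100%
    # Format the first partition as ext4.
    run mkfs.ext4 "${DISK}1"
    # Mount it.
    run mkdir -p "$MOUNTPOINT"
    run mount "${DISK}1" "$MOUNTPOINT"
}

prepare_disk
```

Set DRY_RUN=0 to actually run the commands once the printed sequence looks right. Partitioning and formatting are only needed the first time, since the data stays on the disk between node reservations.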
Security issues
The mechanism used to enable/disable disks is designed to prevent mistakes by other users. However, a malicious user could take control of the RAID card, enable any disk, and access or erase your data. Make sure to secure your data:
- Keep a copy (backup) in a safe place if your data warrants it;
- If your data is sensitive, consider using cryptographic mechanisms to secure it.
Also, the data on reserved disks is not automatically erased at the end of your job. If you don't want the next user to access it, you have to erase it yourself.
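Since erasing is left to you, one possible approach is to overwrite the whole disk before your disk reservation ends. The sketch below assumes that the reserved disk is /dev/sdb and uses shred from GNU coreutils with a single pass. Both choices are assumptions rather than a platform recommendation, the names DISK, DRY_RUN and erase_disk are hypothetical, and overwriting a large disk can take a long time. It only prints the command unless DRY_RUN=0:

```shell
# Assumption: the reserved disk is /dev/sdb; verify with lsblk.
DISK=${DISK:-/dev/sdb}
DRY_RUN=${DRY_RUN:-1}

erase_disk() {
    disk=$1
    # Refuse sda and anything whose name starts with /dev/sda:
    # sda is the system disk.
    case "$disk" in
        /dev/sda*)
            echo "refusing to erase the system disk: $disk" >&2
            return 1
            ;;
    esac
    # Unmount any partition of the disk before running this for real.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ shred -v -n 1 $disk"
    else
        # One overwrite pass with random data, showing progress.
        shred -v -n 1 "$disk"
    fi
}

erase_disk "$DISK"
```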