Armored Node for Sensitive Data
Note | |
---|---|
This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team. |
This page documents how to secure a Grid'5000 node, making it suitable to host and process sensitive data. The process is based on a tool (g5k-armor-node.py) that runs on top of the debian11-x64-big Grid'5000 environment.
Important limitations of this solution
- The solution does not protect against user errors during the setup of the secure environment. Please ensure that you follow this documentation with extreme care. Failing to do so could result in an insecure environment.
- The solution does not protect against user errors that could result in transferring sensitive data outside the secure environment (the Internet is reachable from the secure environment). Please ensure that you use this environment with care.
- The solution does not protect the rest of Grid'5000 against your node. Before using this solution to work on software that might attack other Grid'5000 machines (for example malware), please consult with the Grid'5000 technical staff.
Informing the technical team
Before starting to use Grid'5000 to process sensitive data, inform the technical team that you are going to do so. Email support-staff@lists.grid5000.fr with the following information:
- your name
- your affiliation
- the general description of your planned work and the kind of data that you are going to process (do not include sensitive information here)
- the description of the resources that you are going to reserve
- the expected duration of your work
Node reservation, deployment, and securing
Identify your requirements
- Select a cluster that suits your needs (for example using the Hardware page).
- Estimate how long you will need the resources. If the duration exceeds what is allowed for the default queue in the Usage Policy, the production queue may match your needs. If the duration also exceeds what is allowed by the production queue (more than one week), you can follow the procedure explained on the Usage Policy page to request an exception.
- Take into consideration that all data (including data you produced) stored locally on the machine will be destroyed at the end of the reservation.
- Reserve a node and a VLAN, then deploy the node with the debian11-x64-big environment inside the VLAN (see detailed steps below).
Reserve and set up your node (option 1: manually)
- Make a reservation
Reserve the node and the VLAN. Example for a reservation in the production queue for one node of cluster CLUSTER starting at START DATE for a duration of WALLTIME:
nancy frontend:
oarsub -q production -t deploy -t destructive -l {"type='kavlan'"}/vlan=1+{CLUSTER}/nodes=1,walltime=WALLTIME -r "START DATE"
Note that additional disks available on the node (that may need an extra reservation) will be used as additional secured storage space, but data will always be destroyed at the end of the node reservation.
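As an illustration of how the placeholders in the reservation command are filled in, here is a minimal sketch of a shell wrapper. The function name reserve_armor_node and the OARSUB variable are hypothetical helpers introduced for this example (they are not part of OAR). The example values for the cluster, walltime and start date are assumptions.

```shell
# Sketch only: wraps the oarsub call shown above.
# OARSUB is a hypothetical override (not an OAR feature): setting it to
# "echo" prints the arguments instead of submitting a reservation.
reserve_armor_node() {
    cluster="$1"    # cluster name, e.g. grappe
    walltime="$2"   # e.g. 2:00:00
    start="$3"      # e.g. "2030-01-15 09:00:00" (contains a space, so quote it)
    ${OARSUB:-oarsub} -q production -t deploy -t destructive \
        -l "{type='kavlan'}/vlan=1+{$cluster}/nodes=1,walltime=$walltime" \
        -r "$start"
}

# Example (on the frontend):
# reserve_armor_node grappe 2:00:00 "2030-01-15 09:00:00"
```

Setting OARSUB=echo before calling the function lets you check the generated arguments without reserving anything.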
Once the job has started, connect inside the job:
frontend:
oarsub -C JOB_ID
Note that since it is a deploy job, the job shell opens on the frontend again.
Take note of the hostname of the reserved node, for instance with oarprint:
frontend:
oarprint host
Take note of the assigned VLAN number:
frontend:
kavlan -V
- Deploy the debian11-x64-big environment
Deploy the node with the debian11-x64-big environment, inside the VLAN:
frontend:
kadeploy3 -e debian11-x64-big --vlan `kavlan -V`
Now wait for the deployment to complete.
- Securing the node with g5k-armor-node.py
Connect to the node from the outside of Grid'5000, using the node's hostname in the VLAN (hostname with the Kavlan suffix for the reserved VLAN, because the node was deployed inside the kavlan VLAN). After securing the node, this will be the only allowed way to connect to the node, as SSH will only be authorized from Grid'5000 access machines:
your machine:
ssh -J YOUR_G5K_LOGIN@access.grid5000.fr root@node-X-kavlan-Y.site.grid5000.fr
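The hostname in the VLAN follows the node-X-kavlan-Y.site.grid5000.fr pattern used above, so it can be built from the hostname printed by oarprint and the VLAN number printed by kavlan -V. A minimal sketch, where kavlan_hostname is a hypothetical helper name and the example node and VLAN number are assumptions:

```shell
# Sketch only: build the in-VLAN hostname from the regular hostname
# (as printed by `oarprint host`) and the VLAN number (`kavlan -V`).
kavlan_hostname() {
    host="$1"               # e.g. grappe-1.nancy.grid5000.fr
    vlan="$2"               # e.g. 4
    node="${host%%.*}"      # part before the first dot
    domain="${host#*.}"     # everything after the first dot
    printf '%s-kavlan-%s.%s\n' "$node" "$vlan" "$domain"
}

# Example:
# kavlan_hostname grappe-1.nancy.grid5000.fr 4
# prints grappe-1-kavlan-4.nancy.grid5000.fr
```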
On the node, download g5k-armor-node.py, for example with:
node:
wget https://gitlab.inria.fr/grid5000/g5k-armor/-/raw/master/g5k-armor-node.py
Run it:
node:
chmod a+rx g5k-armor-node.py
node:
./g5k-armor-node.py
Wait for the script to finish. It must have displayed the message "Setup completed successfully!".
Disconnect from the node, and try to connect again using SSH. You should get an error message from SSH, because the node's host key changed. This is expected: the script replaced the node's SSH host key with a newly generated one. Follow the instructions from SSH to remove the old key.
Reserve and set up your node (option 2: automated with grd)
grd is a command-line tool to automate Grid'5000 workflows. It has support for performing the steps above. For example, to reserve and configure a node from the grappe cluster in nancy for 2 hours, use (from a frontend or locally, after installing ruby-cute which contains grd):
frontend:
grd bs -s nancy -q production -l {grappe}/nodes=1+{"type='kavlan'"}/vlan=1 -w 2 --armor
As described above, you might get an error message from SSH, because the node's host key changed. This is expected: the script replaced the node's SSH host key with a newly generated one. Follow the instructions from SSH to remove the old key.
Using the secured node
Connect to the secured node
You must connect to the node using your Grid'5000 login directly from your workstation:
your machine:
ssh -J YOUR_G5K_LOGIN@access.grid5000.fr YOUR_G5K_LOGIN@node-X-kavlan-Y.site.grid5000.fr
The node can access the Internet and you can use the sudo command on the node to install additional software if needed.
Please remember that:
- Only your home directory on the secured node is encrypted (/home/<username>). You must not store sensitive data outside of it (or on other Grid'5000 machines).
- You must only use secured protocols to transfer data to/from the node as described below.
- If you reboot the node or if the node is shut down for some reason, you will no longer be able to access your data. However, if you made a copy of the encryption key when it was displayed at the end of the script's output, you can restore the encrypted storage from the node with:
echo '<paste key content here>' > /run/user/1000/key
sudo cryptsetup luksOpen --key-file /run/user/1000/key /dev/mapper/vg-data encrypted
sudo mount /dev/mapper/encrypted $HOME
exit
Then reconnect to the node.
If you prefer to avoid keeping a copy of the encryption key, it is a good idea to make intermediary backups of the processed data (outside of Grid'5000), in case the secured node becomes unreachable during the processing.
Transferring data to/from the node
You must transfer data directly between an external secure storage, and your Grid'5000 node. You must not use other Grid'5000 storage spaces (such as NFS spaces) in the process.
It is recommended to use rsync. With rsync, you can specify access.grid5000.fr as an SSH jump host using the -e option. Alternatively, you can customize your SSH configuration as described in the Getting Started tutorial.
- To transfer files to the node:
rsync -e "ssh -J YOUR_G5K_LOGIN@access.grid5000.fr" <local path> YOUR_G5K_LOGIN@node-X-kavlan-Y.site.grid5000.fr:<remote path>
- To fetch files from the node:
rsync -e "ssh -J YOUR_G5K_LOGIN@access.grid5000.fr" YOUR_G5K_LOGIN@node-X-kavlan-Y.site.grid5000.fr:<remote path> <local path>
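If you transfer data often, the two rsync commands above can be wrapped in a small shell function. This is only a sketch: g5k_rsync and the RSYNC variable are hypothetical names introduced here, the login jdoe and the paths are placeholders, and the -a (archive) and --partial (keep partially transferred files) options are an assumption about what is convenient, not a requirement of this procedure.

```shell
# Sketch only: rsync through access.grid5000.fr used as an SSH jump host.
# RSYNC is a hypothetical override (not an rsync feature): setting it to
# "echo" prints the arguments instead of transferring anything.
g5k_rsync() {
    login="$1"; shift    # Grid'5000 login; remaining arguments go to rsync
    ${RSYNC:-rsync} -a --partial \
        -e "ssh -J ${login}@access.grid5000.fr" "$@"
}

# To the node:
# g5k_rsync jdoe ./dataset/ jdoe@node-X-kavlan-Y.site.grid5000.fr:dataset/
# From the node:
# g5k_rsync jdoe jdoe@node-X-kavlan-Y.site.grid5000.fr:results/ ./results/
```

Because the source and destination are passed through unchanged, the same function covers both directions.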
Data management
Several solutions are possible to manage the sensitive data you need to use on the node.
Solution A: store data outside Grid'5000
You could store the data in a secure storage space outside Grid'5000, and copy it to/from the node, as described above, when needed.
Main limitation of this solution: it is not suitable for large data volumes (because of the transfer time).
Solution B: store data in an encrypted archive inside Grid'5000
Main limitation of this solution: it is not very practical because of the need to frequently decrypt+decompress / compress+encrypt.
Solution C: use a remote secured volume (CompuVault)
Troubleshooting
Warning | |
---|---|
If you experience any issue during the securing procedure, do not continue your experiment. The node might not be correctly secured, and your data might therefore not be properly protected. |
Rerun the securing procedure from the beginning
You can try to rerun the whole procedure from the beginning, except that you do not need to execute the oarsub reservation command again if the job is still running.
Connect to the frontend (the one you used previously) and connect inside the job:
frontend:
oarsub -C JOB_ID
Format the node and deploy the debian11-x64-big environment on it:
frontend:
kadeploy3 -e debian11-x64-big --vlan `kavlan -V`
Finally, download and execute the Python script as described in the "Securing the node with g5k-armor-node.py" section.
Get more output information on each step for debugging
If you still experience an issue during the procedure, you might want to display more output information to debug and understand the issue.
To do so, do the following:
- for the "deploying debian11-x64-big environment" step, add --verbose-level 5:
frontend:
kadeploy3 -e debian11-x64-big --vlan `kavlan -V` --verbose-level 5
- for the "securing script" step, set the environment variable GAN_DEBUG to 1:
node:
GAN_DEBUG=1 ./g5k-armor-node.py
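To keep the debug output for a support request, you can copy it to a file while it is displayed. The sketch below is one possible way to do it. armor_debug_run is a hypothetical helper name, and the log location under /dev/shm is an assumption, chosen so that the log stays in memory rather than on the unencrypted system disk. When the script succeeds, it displays the encryption key at the end of its output, so the log must be treated as sensitive: remove the key before sharing anything and delete the log afterwards.

```shell
# Sketch only: run a command with GAN_DEBUG=1 and keep a copy of its
# standard output and standard error in a log file.
armor_debug_run() {
    log="$1"; shift              # log file, then the command to run
    GAN_DEBUG=1 "$@" 2>&1 | tee "$log"
}

# Example (on the node):
# armor_debug_run /dev/shm/armor-debug.log ./g5k-armor-node.py
```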
Contact the technical team
If the script is not working properly or you have any questions about the procedure, do not hesitate to contact the technical team BEFORE running any experiment with sensitive data: support-staff@lists.grid5000.fr. Please include all relevant details that could help the technical team understand your problem (do NOT send any sensitive data by email).
For instance, if the script g5k-armor-node.py is not working properly, please run it in debug mode (see the previous section) and copy/paste any error messages into the email you send to the technical team.
Extending node reservation beyond normal limits
A limitation of this solution is the frequent need for setting up the node and importing the required data.
A way to mitigate this problem is to extend the reservations beyond what is normally allowed by Grid'5000 policies (7 days max). However:
- This adds maintenance constraints for the Grid'5000 technical team
- It is generally considered bad practice to reserve resources (which prevents other users from using them) and then not use them
If really needed, this possibility should be discussed with the user's security correspondent and with the Grid'5000 technical team. A prerequisite for this discussion is that the user identifies the hardware that could match their needs, using for example the Hardware page.