FAQ
About this document
How to add/correct an entry to the FAQ?
Like any other page of the wiki, you can edit the FAQ yourself to improve it. Click one of the small "edit" links placed after each question to edit that particular question. To edit the whole page, choose the edit tab at the top of the page.
Accessing Grid'5000
How to directly connect to nodes from my workstation?
This tip consists of chaining SSH commands and automating the chaining in the SSH configuration file ~/.ssh/config. It relies on the nc or tcpconnect command to bind stdin and stdout to a network connection:
Host frontale.*.grid5000.fr
Host *.grid5000.fr
  User login
  ProxyCommand ssh login@frontend "nc %h %p"
Please have a look at the SSH page for a deeper understanding of this proxy feature.
Note: Hostnames like *.grid5000.fr are unknown outside of Grid'5000.
Warning: The nc and tcpconnect commands are not always available on frontends.
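Where neither nc nor tcpconnect is available on a frontend, OpenSSH's own -W option (present since OpenSSH 5.4) can play the same role. The sketch below only prints a candidate ~/.ssh/config stanza. The function name g5k_proxy_stanza and the example values jdoe and access.example.org are placeholders, not names taken from this FAQ, and the frontend name passed in must not itself match *.grid5000.fr.

```shell
# Print an SSH config stanza that tunnels through a frontend with "ssh -W".
# $1 = Grid'5000 login (placeholder), $2 = frontend host name (placeholder).
g5k_proxy_stanza() {
    login=$1
    frontend=$2
    # %h and %p are expanded by ssh itself to the target host and port.
    cat <<EOF
Host *.grid5000.fr
  User ${login}
  ProxyCommand ssh -W %h:%p ${login}@${frontend}
EOF
}

# Example with made-up values; review the output before adding it to ~/.ssh/config.
g5k_proxy_stanza jdoe access.example.org
```

The output is meant to be reviewed and pasted by hand rather than appended blindly to an existing configuration.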
How to access Internet from nodes?
For security reasons, it is not possible to connect to the Internet from inside Grid'5000. However, SSH port forwarding is not disabled yet, so the following command forwards connections made to host_g5k:port_g5k on to host:port:
ssh -R port_g5k:host:port host_g5k.site.grid5000.fr
Please have a look at "How to directly connect to nodes from my workstation?" to combine it with this tip.
Warning: If you really need this, tell your local administrator before they notice the forwarding themselves.
Note: This method can be combined with a web proxy, such as Squid or Apache, to access web servers on the Internet.
How to access InriaGforge from frontends
It can be useful to access INRIA's Gforge repository from frontends to directly checkout, commit or even synchronize your project data between Grid'5000's sites.
Due to the security policy, access to InriaGforge from inside Grid'5000 is prohibited by default. Each site decides whether to allow it. The following frontends can access InriaGforge:
frontale.bordeaux.grid5000.fr
frontale.nancy.grid5000.fr
frontale.rennes.grid5000.fr
frontale.sophia.grid5000.fr
For other frontends, which do not allow access to InriaGforge, there is a small SSH trick. Edit your SSH configuration file ~/.ssh/config on the host used to access Grid'5000 sites (do not forget to replace site with the target site, g5klogin with your Grid'5000 username and fwdport with an arbitrary port number > 1024):
Host acces.site.grid5000.fr
  User g5klogin
  RemoteForward fwdport scm.gforge.inria.fr:22
Now connect to the configured host (localhost:fwdport will be forwarded to scm.gforge.inria.fr:22):
ssh acces.site.grid5000.fr
Then modify your SSH configuration file ~/.ssh/config on this Grid'5000 frontend (do not forget to replace gforgelogin with your InriaGforge username and fwdport with the previously defined forwarded port number):
Host scm.gforge.inria.fr
  Hostname localhost
  Port fwdport
  User gforgelogin
You can now check out and commit to your InriaGforge project repository:
svn co svn+ssh://scm.gforge.inria.fr/svn/project
Account management
Why does my home directory not contain the same files on every site?
Every site has its own file server, so it is the user's responsibility to synchronize personal data between home directories on the different sites. You may use the rsync command to synchronize a remote site's home directory (be careful: this deletes remote files that do not exist in the local home directory and overwrites those that differ):
rsync -n --delete -avz ~/ frontend.site.grid5000.fr:~/
Note: the trailing slashes make rsync synchronize the content of the home directory rather than copy the directory itself into the remote home. Remove the -n argument once you are sure you no longer want a dry run only.
How to restore a wrongly deleted file?
No backup facility is provided by the Grid'5000 platform. Please watch your fingers and back up your data using external backup services.
How to fetch all the SSH host keys of one site?
To avoid answering 'yes' when connecting to hosts with SSH for the first time, the content of the ~/.ssh/known_hosts file can be generated automatically for one site:
nodelist site | ssh-keyscan -t dsa,rsa -f -
Please have a look at "How to get a site list of nodes?" for information on the nodelist command.
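The ssh-keyscan command above prints the keys on standard output, so they still have to reach ~/.ssh/known_hosts. One possible way to merge them without piling up duplicate lines is sketched below. The helper name merge_known_hosts is hypothetical, and the sketch assumes that reordering the file is acceptable, since it sorts the entries.

```shell
# Merge key lines read on stdin into the file given as $1,
# dropping exact duplicate lines (the file ends up sorted).
merge_known_hosts() {
    target=$1
    tmp=$(mktemp) || return 1
    touch "$target"
    # "cat file -" reads the existing file first, then stdin.
    cat "$target" - | sort -u > "$tmp" && mv "$tmp" "$target"
}

# Possible usage, with site replaced by a real site name:
#   nodelist site | ssh-keyscan -t dsa,rsa -f - | merge_known_hosts ~/.ssh/known_hosts
```

Keys collected this way are accepted without any verification, so this only makes sense on a network you already trust.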
How to avoid SSH host key checking?
SSH host key checking can be turned off with the StrictHostKeyChecking option. This option can be set in the ~/.ssh/config file:
StrictHostKeyChecking no
Or it can be passed on the command line:
ssh -o StrictHostKeyChecking=no host
How not to get tons of SSH errors about Man-in-the-middle attacks while deploying images?
This happens because SSH is worried that the machine answering its connection is not the same from run to run. This is actually logical: you just redeployed the image, so it cannot be the same system answering.
Technically speaking, the file /etc/ssh/ssh_host_dsa_key.pub is likely to differ between your own deployed image and the default image. SSH therefore raises an alarm, since such a replacement usually denotes that someone is intercepting the communication and pretending to be the server to get information from you.
There are at least three solutions:
- Add StrictHostKeyChecking=no to your .ssh/config file to tell SSH to ignore those errors.
- Pass this option (StrictHostKeyChecking=no) on the command line to ssh (using -o).
- Make sure that your own images have the same host_dsa_key as the default ones. These keys can usually be found in the pre/post install scripts of your site.
Please also have a look at the SSH page.
How to get a site list of nodes?
The node naming convention suggests that node names have the form node-xx.site.grid5000.fr. So, the node list of a given site can be extracted from DNS, as the following Bash function does:
function nodelist {
  dig "${1}.grid5000.fr" axfr | grep -E '^node-[0-9]+\.' | cut -f 1 -d ' '
}
For example, if you want Lyon's node list:
nodelist lyon
Note: the node naming convention does not work for sites having more than one cluster.
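The filtering half of nodelist can be tried without any DNS access. The sketch below is a variant, not the function from this FAQ: it assumes that dig may separate fields with tabs as well as spaces and that it prints names with a trailing dot, so it uses awk instead of cut and strips the dot. The name filter_nodes and the sample records are made up for illustration.

```shell
# Read zone-transfer lines on stdin and print one node name per line.
filter_nodes() {
    # Keep only node-NN records, then print the first whitespace-separated
    # field with its trailing dot (if any) removed.
    grep -E '^node-[0-9]+\.' | awk '{ sub(/\.$/, "", $1); print $1 }'
}

# Hypothetical sample resembling the output of "dig lyon.grid5000.fr axfr".
printf 'node-1.lyon.grid5000.fr.\t86400\tIN\tA\t192.0.2.1\n' > /tmp/axfr_sample.$$
printf 'frontale.lyon.grid5000.fr. 86400 IN A 192.0.2.2\n' >> /tmp/axfr_sample.$$
printf 'node-2.lyon.grid5000.fr. 86400 IN A 192.0.2.3\n' >> /tmp/axfr_sample.$$

filter_nodes < /tmp/axfr_sample.$$
rm -f /tmp/axfr_sample.$$
```

With real data, the filter would sit after dig in the pipeline in place of the grep and cut stages.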
How to quickly check for nodes health
You can check node health, based on ICMP requests, with the nmap command if it is available on a site:
nodelist site | nmap -iL - -sP
Or with the fping command, if it is available:
nodelist site | fping -a 2> /dev/null
How to kill all my processes?
- On the currently connected host (warning: this will disconnect you):
kill -KILL -1
- From the frontend, for every node of a site:
for node in `nodelist site`; do ssh -o StrictHostKeyChecking=no $node kill -KILL -1; done
Please have a look at "How to get a site list of nodes?" for information on the nodelist command.
What is the so called "best-effort" mode of OAR?
The best-effort mode was implemented to back-fill the cluster with jobs considered less important, without blocking "regular" jobs. To submit jobs under that policy, simply select the besteffort queue with the -t flag of oarsub:
oarsub -t besteffort script_to_launch
Jobs submitted that way are only scheduled on processors when no other job uses them (any regular job overtakes besteffort jobs in the waiting queue, regardless of submission times). Moreover, these jobs are killed (as if oardel were called) when a recently submitted regular job reclaims the nodes they use.
No checkpointing or automatic restart of besteffort jobs is provided: they are simply killed. That is why this mode is best used with a tool that can detect killed jobs and resubmit them. CIGRI and APST are examples of such tools.
How to pass arguments to my script
When you do a passive submission through oarsub, you must specify a script. This can be a simple script name or a more complex command line with arguments.
To pass arguments, you have to quote the whole command line, as in the following example:
oarsub -l nodes=4,walltime=2 "/path/to/myscript arg1 arg2 arg3"
Note: to avoid random code injection, oarsub allows only alphanumeric characters ([a-zA-Z0-9_]), whitespace characters ([ \t\n\r\f\v]) and a few others ([/.-]) inside its command line argument.
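The character restriction described in the note can be checked locally before submitting. The sketch below is not part of oarsub: the helper name oarsub_safe is hypothetical, and it only mirrors the three character classes listed above.

```shell
# Return 0 if $1 contains only characters from the classes listed in the
# note above ([a-zA-Z0-9_], whitespace, and [/.-]); return 1 otherwise.
oarsub_safe() {
    # LC_ALL=C keeps the a-z and A-Z ranges limited to ASCII letters.
    if printf '%s' "$1" | LC_ALL=C grep -q '[^a-zA-Z0-9_[:space:]/.-]'; then
        return 1
    fi
    return 0
}

# Example: test a command line before handing it to oarsub.
cmd="/path/to/myscript arg1 arg2 arg3"
if oarsub_safe "$cmd"; then
    echo "accepted: $cmd"
else
    echo "rejected: $cmd"
fi
```

A command line containing characters such as ; or $ would be reported as rejected by this check.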