This tutorial will guide you through your first steps on Grid'5000. Before proceeding, make sure you have a Grid'5000 account (if not, follow this procedure), and an SSH client.

Getting support

The Support page describes how to get help during your Grid'5000 usage. There's also a FAQ and a cheat sheet with the most common commands.

Note.png Note

This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.

Connecting for the first time

The primary way to move around Grid'5000 is using SSH. If you are not familiar with SSH, please consider the specific SSH and Grid'5000 tutorial. A reference page for SSH is also maintained, with advanced configuration options that heavy users will find useful.

As described in the figure below, when using Grid'5000, you will typically:

  1. connect, using SSH, to an access machine
  2. connect from this access machine to a site frontend
  3. on this site frontend, reserve resources (nodes), and connect to those nodes
[Figure: Grid5000 Access]

Connect to a Grid'5000 access machine

The address points to two actual machines: access-south in Sophia and access-north in Lille. Those machines provide SSH access to Grid'5000 from the Internet.

Terminal.png outside:

You will get authenticated using the SSH public key you provided in the account creation form.

Note.png Note

If you modified your key in the account management interface, please allow some time (about 5 minutes) for the change to be applied.

Note.png Note

If you prefer, you might also be able to connect directly to your local Grid'5000 site, but per-site access restrictions are applied, so using the global access machines is usually a safer choice. See External_access for details about local access machines.

Note.png Note

A VPN service is also available to directly connect to Grid'5000 hosts. See VPN for more information.

Connecting to a Grid'5000 site

Grid'5000 is structured in sites (Grenoble, Rennes, Nancy, ...). Each site hosts one or more clusters.

You should be able to connect to a particular site, either directly (blue arrow labelled SSH in the figure above) or after a hop through another site (red arrows labelled SSH).

Terminal.png access:
ssh site

Password-less access within Grid'5000

Accounts created since May 12th, 2014 come with a default configuration that includes a password-less key for moving between sites. This key can be used once you are connected to an access machine with the key whose public part you provided in the account management interface.

Home directories

You have a different home directory on each Grid'5000 site, so you will usually use rsync or scp to move data around. On access machines, you have direct access to each of those home directories through NFS mounts. Please note, however, that using this feature to transfer large volumes of data is inefficient.

Home directories on Grid'5000 are not backed up: it is your responsibility to save important data outside Grid'5000 (or to copy data to several Grid'5000 sites in order to increase redundancy). Quotas are applied -- by default, you get 25 GB per Grid'5000 site. If your usage of Grid'5000 requires more disk space, it is possible to request quota extensions in the account management interface, or to use other storage solutions (see Storage5k).

Discovering, visualizing and reserving Grid'5000 resources

At this point, you should be connected to a site frontend, as indicated by your shell prompt (login@fsite:~$). This machine will be used to reserve and manipulate resources on this site, using the OAR software suite.

Discovering and visualizing resources

There are several ways to learn about the site's resources and their status:

  • The site's MOTD (message of the day) lists all clusters and their features. Additionally, it gives the list of current or future downtimes due to maintenance, which is also available from
  • Site pages on the wiki (e.g. Nancy:Home) contain a detailed description of the site's hardware and network:
  • The Status page links to the resource status on each site, with two different visualizations available: Monika (see Nancy's current status) and Gantt (see Nancy's current status).
  • The Grid'5000 API (we'll look at that later on) provides a machine-readable description of Grid'5000 and machine-readable status information. This web UI can be used to discover resources. (Note that due to a bug, this interface does not currently work with the Chrome web browser. There is also another bug affecting deployment through the interface.)

Reserving resources with OAR: the basics

To reserve one resource (= one CPU core, on one node) in interactive mode, run:

Terminal.png fnancy:
oarsub -I

As soon as a resource becomes available, you will be directly connected to the reserved resource with an interactive shell, as indicated by the shell prompt.

To terminate your reservation and return to the frontend, run:

Terminal.png graphene-42:
exit

To avoid premature termination of your jobs in case of errors (for example, a terminal closed by mistake), you can reserve and connect in two steps using the job id associated with your reservation. First, reserve a node, and ask it to sleep:

Terminal.png fnancy:
oarsub "sleep 10d"

(10d stands for 10 days -- the command will be killed when the job terminates anyway)

Then connect to the job using the job id printed by oarsub, and try a few commands on the node:


Terminal.png fnancy:
oarsub -C job_id
Terminal.png graphene-42:
hostname && ps -ef | grep sleep

java -version
mpirun --version

env | grep OAR # discover environment variables set by OAR
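
The two-step approach above (submit a sleeping job, then connect to it with oarsub -C) can be scripted. The following is a minimal sketch, assuming that oarsub prints a line of the form OAR_JOB_ID=<number> when the job is accepted. The helper name extract_job_id is hypothetical and is not part of OAR.

```shell
# Print the job id found in oarsub's output (read from standard input).
# Assumption: the output contains a line of the form OAR_JOB_ID=<number>.
extract_job_id() {
    sed -n 's/^OAR_JOB_ID=\([0-9][0-9]*\)$/\1/p'
}

# Intended use on a frontend (shown as comments, not executed here):
#   job_id=$(oarsub "sleep 10d" | extract_job_id)
#   oarsub -C "$job_id"
```

Keeping the job id in a variable also makes it easy to delete the job later with oardel.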

Of course, you will probably want to use more than one node on a given site, and you might want them for a duration other than the default of one hour. The -l switch allows you to pass a comma-separated list of parameters specifying the resources needed for the job.

Terminal.png fnancy:
oarsub -I -l nodes=2,walltime=0:30

The walltime format is [hour:min:sec|hour:min|hour] (walltime=5 => 5 hours, walltime=1:22 => 1 hour 22 minutes, walltime=0:3:30 => 3 minutes, 30 seconds).
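
As a rough illustration of the walltime format, the sketch below converts a walltime string to seconds. The function name walltime_to_seconds is hypothetical and is not part of OAR; it only mirrors the three forms listed above, treating missing minute and second fields as zero.

```shell
# Convert a walltime given as hour, hour:min, or hour:min:sec to seconds.
# Hypothetical helper for illustration; OAR does not provide it.
walltime_to_seconds() {
    # awk splits on ':' and treats missing fields as 0.
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

walltime_to_seconds 5        # 5 hours
walltime_to_seconds 1:22     # 1 hour 22 minutes
walltime_to_seconds 0:3:30   # 3 minutes 30 seconds
```

The three calls print 18000, 4920 and 210 respectively.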

By default, you can only connect to nodes in your reservation, and only using the oarsh connector to go from one node to the other. The connector supports the same options as the classical ssh, so it can be used as a replacement for software expecting ssh.

Terminal.png griffon-49:

uniq $OAR_NODEFILE # list of resources of your reservation
oarsh griffon-1 # use a node not in the file (will fail)
oarsh griffon-54 # use the other node of your reservation

ssh griffon-54 # will fail

oarsh is a wrapper around ssh that enables the tracking of user jobs inside compute nodes (for example, to enforce the correct sharing of resources when two different jobs share a compute node). If your application does not support choosing a different connector, you can use plain ssh instead of oarsh by adding the allow_classic_ssh job type, as in:

Terminal.png fnancy:
oarsub -I -l nodes=2,walltime=0:30:0 -t allow_classic_ssh
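
A common next step is to run the same command on every node of a reservation. The sketch below reads a node file, removes duplicate entries (the node file has one line per reserved core), and calls a connector for each node. The function name for_each_node and the CONNECTOR variable are hypothetical names chosen for this example; the connector defaults to oarsh.

```shell
# Run a command on every distinct node listed in a node file.
# Usage: for_each_node NODEFILE COMMAND [ARGS...]
# CONNECTOR (hypothetical variable) selects the remote shell, oarsh by default.
for_each_node() {
    nodefile=$1
    shift
    # Deduplicate first: each node appears once per reserved core.
    sort -u "$nodefile" | while read -r node; do
        # Redirect stdin so the connector does not consume the node list.
        ${CONNECTOR:-oarsh} "$node" "$@" < /dev/null
    done
}

# Intended use inside a job (shown as a comment, not executed here):
#   for_each_node "$OAR_NODEFILE" hostname
```

In a job submitted with the allow_classic_ssh type, setting CONNECTOR=ssh would use plain ssh instead.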

Reservations in advance, job management, and selection of resources

Reservations in advance

By default, oarsub will give you resources as soon as possible. You can also reserve resources at a specific time in the future, with the -r parameter:

Terminal.png fnancy:
oarsub -l nodes=3,walltime=3 -r '2012-12-23 16:30:00'

Job management

To list the jobs currently submitted, use the oarstat command (use the -u option to see only your jobs). A job can be deleted with:

Terminal.png fnancy:
oardel 12345

Remember that all your resource reservations must comply with the Grid5000:UserCharter.

Selection of resources using OAR properties

The OAR nodes database contains a set of properties for each node, that can be used to request specific resources:

  • Nodes from a given cluster
Terminal.png fluxembourg:
oarsub -p "cluster='granduc'" -l nodes=5,walltime=2 -I
  • Nodes with Infiniband DDR interfaces
Terminal.png fgrenoble:
oarsub -p "ib20g='YES'" -l nodes=5,walltime=2 -I # (use ib40g for QDR)
  • Nodes with power sensors and GPUs
Terminal.png flyon:
oarsub -p "wattmeter='YES' and gpu='YES'" -l nodes=2,walltime=2 -I
  • Since -p accepts SQL expressions, you can also exclude specific nodes:
Terminal.png fnancy:
oarsub -p "wattmeter='YES' and network_address not in ('', '')" -l nodes=5,walltime=2 -I

The OAR properties available on each site are listed on the Monika pages linked from Status (example page for Nancy).
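
When a property expression has to exclude several nodes, it can be easier to build the network_address clause from a list of host names. The following is a small sketch; exclude_hosts is a hypothetical helper, and the host names in the usage comment are placeholders rather than real Grid'5000 nodes.

```shell
# Build an SQL clause for oarsub -p that excludes the given hosts.
# Hypothetical helper for illustration; host names are passed as arguments.
exclude_hosts() {
    # Refuse to build an empty list, which would not be a valid expression.
    [ "$#" -gt 0 ] || return 1
    list=""
    for host in "$@"; do
        # Append each host in single quotes, separated by commas.
        list="${list:+$list, }'$host'"
    done
    printf "network_address not in (%s)\n" "$list"
}

# Intended use on a frontend (shown as a comment, not executed here):
#   oarsub -p "wattmeter='YES' and $(exclude_hosts node-a node-b)" -l nodes=5,walltime=2 -I
```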

Monitoring your nodes

If you point your browser to the Ganglia web interface, you will see all the metrics collected on Grid'5000. If you navigate first to the site, then to the node you have reserved, you will see the metrics collected for that node.

Data sent to Ganglia loses precision as it ages, because it is stored in a Round Robin Database in which past values are aggregated as new values are saved. In the default configuration, a value is kept every 15 s for the last hour, and older values are aggregated into one average value per 5 minutes. Furthermore, the web interface only displays graphs and does not give access to the raw values that users might need to draw their own graphs.

Therefore, the Grid'5000 API includes a metrology API to access monitoring information. You can learn more about it in the dedicated tutorial, but here are simple commands to retrieve CPU usage information for your nodes:

As the format is non-trivial to manipulate using bash scripting for any reasonable usage of this information, you should use higher level tools to manipulate metrology information for your usage of Grid'5000. Do not worry if you see null values as we will explain this further along.
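
To make the aggregation described above more concrete (15-second samples averaged into 5-minute values), the sketch below averages consecutive groups of samples read from standard input. It only illustrates the idea of averaging consolidation and is not how Ganglia or its Round Robin Database is actually implemented; the function name average_by is hypothetical.

```shell
# Average consecutive groups of N numeric samples, one value per input line.
# With one sample every 15 s, N=20 yields one mean per 5 minutes.
# Hypothetical helper; a trailing incomplete group is averaged as well.
average_by() {
    awk -v n="$1" '
        { sum += $1; count++ }
        count == n { print sum / n; sum = 0; count = 0 }
        END { if (count > 0) print sum / count }
    '
}

# Example: five samples averaged in pairs.
printf '1\n3\n5\n7\n9\n' | average_by 2
```

The example prints 2, 6 and 9, one value per line. Averaging this way hides short spikes, which is why raw values from the metrology API can be preferable for detailed analysis.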

Deploying your nodes to get root access and create your own experimental environment

Using oarsub without specific options gives you access to resources configured in their default (production) environment, with a set of software selected by the Grid'5000 team. You can use such an environment to run Java or MPI programs, or even to boot virtual machines with KVM, but you have no administrative privileges (root access) should your experiment require changing the software environment in one way or another.

Most Grid'5000 users use resources in a different, much more powerful way: they use Kadeploy to re-install the nodes with their software environment for the duration of their experiment, using Grid'5000 as a Hardware-as-a-Service Cloud. This enables them to use a different Debian version, another Linux distribution, or even Windows, and get root access to install the software stack they need.

Deploying nodes with Kadeploy

Reserve one node (the deploy job type is required to allow deployment with Kadeploy):

Terminal.png fnancy:
oarsub -I -l nodes=1,walltime=1:45 -t deploy

Start a deployment of the wheezy-x64-base image on that node (this takes 5 to 10 minutes):

Terminal.png fnancy:
kadeploy3 -f $OAR_NODE_FILE -e wheezy-x64-base -k

The -f parameter specifies a file containing the list of nodes to deploy. Alternatively, you can use -m to specify a single node. The -k parameter asks Kadeploy to copy your SSH key to the node's root account after deployment, so that you can connect without a password. If you don't specify it, you can still connect, but SSH will ask you for a password. The root password for all Grid'5000-provided images is grid5000.

Reference images are named debian version-architecture-type. The debian version can be squeeze (Debian 6.0, released in 02/2011), wheezy (Debian 7.0, released in 05/2013), or jessie (Debian 8, released in 04/2015). The architecture is x64 (in the past, 32-bit images were also provided). The type can be:

  • min: a minimalistic image with no Grid'5000-specific customizations (the default configuration provided by Debian is used)
  • base: min + various Grid'5000-specific tuning for performance (TCP buffers, number of open file descriptors, drivers for Infiniband and Myrinet networks, etc.). Those could incur an experimental bias.
  • nfs: base + support for mounting your NFS home, and using your Grid'5000 user account on deployed nodes
  • big: nfs + packages for development, system tools, editors, shells.
  • xen: min + Xen hypervisor Dom0 + minimal DomU.
  • prod: big + integration with OAR. The wheezy-x64-prod environment is used on nodes when not using deployment.

As a result, the environments you are most likely to use are wheezy-x64-min, wheezy-x64-xen, wheezy-x64-base, wheezy-x64-nfs, wheezy-x64-big, and their jessie counterparts. The list of all provided environments is available on the Category:Portal:Environment page, or using kaenv3 -l. Note that environments are versioned, and old versions of reference environments are available in /grid5000/images/ on each frontend. This can be used to reproduce experiments months or even years later, still using the same software environment.
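
Because reference environment names follow the version-architecture-type pattern, a script can split a name into its parts. The sketch below does this with shell parameter expansion; parse_env_name is a hypothetical helper and not a Kadeploy command.

```shell
# Split a reference environment name such as wheezy-x64-base into its parts.
# Hypothetical helper for illustration; assumes exactly the
# version-architecture-type naming described above.
parse_env_name() {
    name=$1
    version=${name%%-*}   # text before the first '-'
    kind=${name##*-}      # text after the last '-'
    arch=${name#*-}       # drop the version and its '-'...
    arch=${arch%-*}       # ...then drop the '-' and the type
    echo "version=$version arch=$arch type=$kind"
}

parse_env_name wheezy-x64-base
```

The call prints version=wheezy arch=x64 type=base.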

Customizing nodes and accessing the Internet

Now that your nodes are deployed, the next step is usually to copy data (usually using scp or rsync) and install software.

First, connect to the node as root:

Terminal.png fnancy:
ssh root@griffon-42

To access websites outside Grid'5000, you need to use one of the Grid'5000 HTTP proxies. Using the name proxy will connect to the local proxy. For example, to fetch the Linux kernel sources:

Not all websites are authorized. The full list of authorized websites is given on the Access to the Internet page, as well as the procedure to request the addition of a website. For services that don't use HTTP, a NAT service is also available.

For Debian APT repositories, the proxy is already configured in /etc/apt/apt.conf.d/, so you do not need to set the http_proxy variable. Let's install stress (a simple load generator) on the node:

Terminal.png griffon-42:
apt-get install stress

Installing all the software needed for your experiment can be quite time-consuming. There are three approaches to avoid spending time at the beginning of each of your Grid'5000 sessions:

  • Automate the installation of your software environment after the image has been deployed. You can use a simple bash script, or more advanced tools for configuration management such as Puppet or Chef.
  • Register a new environment with your modifications. More details are provided in the Advanced Kadeploy tutorial.
  • Use a tool to generate your environment from a set of rules, such as Puppet (which is also used by the Grid'5000 technical team to generate all Grid'5000 environments with a clean and reproducible process) or Kameleon.

All those approaches have different pros and cons. We recommend that you start by scripting software installation, and that you move to other approaches when this proves too limited.

Monitoring deployed nodes

To keep experiment artefacts to a minimum, monitoring is not activated by default on reference environments. On a *-min image, you will first need to install the ganglia-monitor package:

Terminal.png griffon-45:
apt-get install ganglia-monitor

You need to start the ganglia-monitor service to get meaningful results from the metrology API. You can do this using the following:

Terminal.png griffon-45:
service ganglia-monitor start

Controlling nodes (rebooting, accessing the serial console)

Grid'5000 provides you with out-of-band control of your nodes. You can access the node's serial console, trigger a reboot, or even power off the node. This is very useful in case your node loses network connectivity, or simply crashes.

Using another terminal, connect again to the frontend (fnancy), and then connect to your node's serial console, using:

Terminal.png fnancy:
kaconsole3 -m

As a reminder, the root password for all Grid'5000-provided images is grid5000. At the end of this tutorial, when you need to exit that console, use the key sequence described in the Kaconsole page. On all clusters, press '&', then '.'.

Using yet another terminal, connect again to the frontend. Now, shut down the node, and watch it go down in the console:

Terminal.png fnancy:
kapower3 --off -m

After it has been shut down, check its status, and turn it on again:

Terminal.png fnancy:
kapower3 --status -m
kapower3 --on -m

Alternatively, you could have rebooted the node, using:

Terminal.png fnancy:
kareboot3 --reboot-kind simple -m

Checking nodes with g5k-checks

The Grid'5000 team puts a strong focus on ensuring that nodes meet their advertised capabilities. A detailed description of each node is stored in the Reference API, and the node is frequently checked against this description in order to detect hardware failures or misconfigurations.

On deployed nodes, it is also possible to run this check manually (e.g. at the beginning of your experiments). For installation and execution see G5k-checks.

Going further

In this tutorial, you learned the basics of Grid'5000:

  • The general structure of Grid'5000, and how to move between sites
  • How to manage your data (one NFS server per site; remember: it is not backed up)
  • How to find and reserve resources using OAR and the oarsub command
  • How to get root access on nodes using Kadeploy and the kadeploy3 command

You should now be ready to use Grid'5000.

Additional tutorials

There are many more tutorials available on the Users Home. These tutorials cover more advanced aspects of Grid'5000, such as:

  • using KaVLAN to isolate your experiments at the networking level
  • using the Grid'5000 REST API to script your experiment by automating resources selection, reservation and deployment
  • accessing data from various monitoring systems (power consumption, network, etc.)
  • performing experiments using large amounts of data with Storage5k (storage reservation) and DFS5k (storage cluster deployment)
  • more advanced usage of OAR and Kadeploy

Recommended tips and tricks

There are also several recommended tips and tricks for SSH and related tools, explained in the SSH page:

  • Configure SSH aliases using the ProxyCommand option
  • Use rsync instead of scp (better performance with multiple files)
  • Access your data from your laptop using SSHFS
  • Edit files over SSH with your favorite text editor, with e.g. vim scp://nancy.g5k/my_file.c

There are more in this talk from Grid'5000 School 2010, and this talk more focused on SSH.