Production:Getting Started
Note: This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.
Getting support
The Support page describes how to get help during your Production usage (support is common to Production and regular Grid'5000 usage).
There is also a Production:FAQ page with the most common questions related to Production usage.
Connecting for the first time
The primary way to move around Grid'5000 is using SSH. A reference page for SSH is also maintained with advanced configuration options that frequent users will find useful.
As described in the figure below, when using Grid'5000, you will typically:
- connect, using SSH, to an access machine
- connect from this access machine to a site frontend
- on this site frontend, reserve resources (nodes), and connect to those nodes
Connect to a Grid'5000 access machine
To enter the Grid'5000 network from the Internet, one must use an access machine: access.grid5000.fr (note that access.grid5000.fr is a round-robin alias to either access-north, which is currently hosted in Lille, or access-south, currently hosted in Sophia-Antipolis).
For all connections, you must use the login that was provided to you when you created your Grid'5000 account.
You will get authenticated using the SSH public key you provided in the account creation form. Password authentication is disabled.
Connecting to a Grid'5000 site
Grid'5000 is structured in sites (Grenoble, Rennes, Nancy, ...). Each site hosts one or more clusters (homogeneous sets of machines, usually bought at the same time).
To connect to a particular site, do the following (blue and red arrow labeled SSH in the figure above).
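For example, something along these lines (nancy is used as the example site; login is your Grid'5000 username):

laptop:~$ ssh login@access.grid5000.fr
access:~$ ssh nancy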
- Home directories
You have a different home directory on each Grid'5000 site, so you will usually use Rsync or scp to move data around.
On access machines, you have direct access to each of those home directories through NFS mounts (but using that feature to transfer very large volumes of data is inefficient). Typically, to copy a file to your home directory on the Nancy site, you can use:
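For example (file and directory names are illustrative):

laptop:~$ scp myfile.dat login@access.grid5000.fr:nancy/targetdirectory/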
Grid'5000 does NOT have a BACKUP service for users' home directories: it is your responsibility to save important data somewhere outside Grid'5000 (or at least to copy data to several Grid'5000 sites in order to increase redundancy).
Quotas are applied on home directories -- by default, you get 25 GB per Grid'5000 site. If your usage of Grid'5000 requires more disk space, it is possible to request quota extensions in the account management interface, or to use other storage solutions (see Storage).
Alternative Connections
SSH connection through a web interface
If you want an out-of-the-box solution which does not require you to set up SSH, you can connect through a web interface. The interface is available at https://intranet.grid5000.fr/shell/SITE/. For example, to access the Nancy site, use: https://intranet.grid5000.fr/shell/nancy/. To connect, you will have to type in your credentials twice (first for the HTTP proxy, then for the SSH connection).
This solution is probably suitable for following this tutorial, but is unlikely to be suitable for real Grid'5000 usage, so you should probably read the next sections about how to set up and use SSH at some point.
VPN
A VPN service is also available, allowing you to connect directly to any Grid'5000 machine (bypassing the access machines). See the VPN page for more information.
HTTP reverse proxies
If you only require HTTP/HTTPS access to a node, a reverse HTTP proxy is also available, see the HTTP/HTTPs_access page.
Recommended tips and tricks for efficient use of Grid'5000
SSH configuration
- Configure SSH aliases using the ProxyCommand option. Using this, you can avoid the two-hop connection (access machine, then frontend) and establish connections directly to frontends. This requires OpenSSH, which is the SSH software available on all GNU/Linux systems, macOS, and also recent versions of Microsoft Windows.
Host g5k
  User login
  Hostname access.grid5000.fr
  ForwardAgent no

Host *.g5k
  User login
  ProxyCommand ssh g5k -W "$(basename %h .g5k):%p"
  ForwardAgent no
Reminder: login is your Grid'5000 username.
Once done, you can establish connections to any machine (first of all: frontends) inside Grid'5000 directly, by suffixing .g5k to its hostname (instead of first having to connect to an access machine). E.g.:
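For instance, to reach the Nancy frontend directly from your workstation (this relies on the SSH configuration shown above):

laptop:~$ ssh nancy.g5k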
Bash prompt
It is possible to modify your bash prompt to display useful information related to your current job, such as its job id, the reserved nodes and the remaining time.
You will find here all the information you need to set up such a prompt, if you are interested.
Miscellaneous
- Use rsync instead of scp for better performance with multiple files.
- For better bandwidth or latency, you may also be able to connect directly via the local access machine of one of the Grid'5000 sites. Local accesses use access.site.grid5000.fr instead of access.grid5000.fr. However, mind that per-site access restrictions are applied: see External access for details about local access machines.
- Access your data from your laptop using SSHFS.
- Edit files over SSH with your favorite text editor, with e.g.:
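For instance, with Vim's built-in scp support and the .g5k aliases configured above (the file name is illustrative):

laptop:~$ vim scp://nancy.g5k/my_file.c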
Platform usage
Submitting jobs with OAR
- Interactive usage
To reserve a single host (one node) for one hour, in interactive mode, do:
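A typical command is:

fnancy:~$ oarsub -I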
As soon as the resource becomes available, you will be directly connected to the reserved resource with an interactive shell, as indicated by the shell prompt, and you can run commands on the node:
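For instance (the node name shown in the prompt will depend on the machine you get):

node:~$ hostname
node:~$ nproc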
- Reserving only part of a node
To reserve only one CPU core in interactive mode, run:
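For example, a request along these lines (core is the OAR resource name for a CPU core):

fnancy:~$ oarsub -l core=1 -I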
- Non-interactive usage (scripts)
You can also simply launch your experiment along with your reservation:
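For example, something like the following (the script name is illustrative):

fnancy:~$ oarsub -l host=1,walltime=0:30 "./my_experiment.sh"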
Your program will be executed as soon as the requested resources are available. As this type of job is not interactive, you will have to check for its termination using the oarstat command.
Submission queues
Note: OAR is the resource and job management system (a.k.a. batch manager) used in Grid'5000, just like in traditional HPC centers. However, the OAR settings and rules configured in Grid'5000 differ slightly from traditional batch manager setups in HPC centers, in order to match the requirements of an experimentation testbed. Please remember to read the Grid'5000 Usage Policy again to understand the expected usage.
There are several submission queues:
- default: used if no queue is specified
- exotic: to use resources with specific hardware (e.g., PMEM, arm, power, ...)
- production: only available at Nancy, for computation usage
- besteffort: to start best-effort jobs
Production queue in Nancy
- Job submission
- Resources restriction per walltime
To make sure that someone requesting only a few nodes for a small amount of time can get them soon enough, the nodes are split into categories. The split depends on each cluster and is visible in the Gantt chart or on the Nancy Hardware page. An example of split is:
- 30% -- 24h (1 day)
- 30% -- 48h (2 days)
- 40% -- 168h (one week)
Note that best-effort jobs are excluded from those limitations.
Another OAR feature that could impact the scheduling of your jobs is OAR's scheduling with fair-sharing, which is based on the notion of karma: this feature assigns a dynamic priority to submissions based on the history of submissions by a specific user. With this feature, jobs from users who rarely submit jobs will generally be scheduled earlier than jobs from heavy users.
Discovering and visualizing resources
There are several ways to learn about the site's resources and their status.
- Site's resources
- The site's MOTD (message of the day) lists all clusters and their features. Additionally, it gives the list of current or future downtimes due to maintenance.
- Hardware pages contain a detailed description of the site's hardware
- Site pages on the wiki (e.g. Nancy:Home) contain a detailed description of the site's hardware and network
- Site's status
- Current or future downtimes due to maintenance are available from https://www.grid5000.fr/status/.
- The Status page links to the resource status on each site, with two different visualizations available:
- Monika, which provides the current status of nodes (see Nancy's current status)
- Gantt, which provides current and planned resource reservations (see Nancy's current status; example in the figure below).
Change default job specifications
In Grid'5000, the smallest unit of resource managed by OAR is the core (CPU core), but by default an OAR job reserves a host (a physical computer including all its CPUs and cores, and possibly GPUs). Hence, what OAR calls nodes are hosts (physical machines). In the oarsub resource request (-l arguments), nodes is an alias for host, so both are equivalent. But prefer using host, for consistency with other arguments and with other tools that expose host rather than nodes.
- Other types of resources
To reserve only one GPU (with the associated CPU cores and share of memory) in interactive mode, run:
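For example (here using the production queue, as in the GPU Singularity example later on this page):

fnancy:~$ oarsub -q production -l gpu=1 -I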
To reserve several GPUs and ensure they are located in a single node, make sure to specify host=1:
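For example, to get two GPUs on the same node (a sketch using OAR's hierarchical resource syntax):

fnancy:~$ oarsub -q production -l host=1/gpu=2 -I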
- Choosing the job duration
Of course, you might want to run a job for a different duration than one hour. The -l option allows you to pass a comma-separated list of parameters specifying the needed resources for the job, and walltime is a special resource defining the duration of your job:
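For example, to get a node for 2 hours and 30 minutes (an illustrative duration):

fnancy:~$ oarsub -l host=1,walltime=2:30 -I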
The walltime is the duration you expect your work to take. Its format is [hour:min:sec|hour:min|hour]. For instance:
- walltime=5 => 5 hours
- walltime=1:22 => 1 hour and 22 minutes
- walltime=0:03:30 => 3 minutes and 30 seconds
- Working with more than one node
You will probably want to use more than one node on a given site.
To reserve two hosts (two nodes), in interactive mode, do:
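For example:

fnancy:~$ oarsub -l host=2 -I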
or equivalently (nodes is an alias for host):
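That is:

fnancy:~$ oarsub -l nodes=2 -I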
You will obtain a shell on the first node of the reservation. It is up to you to connect to the other nodes and distribute work among them.
By default, you can only connect to nodes that are part of your reservation. If you completely own the nodes within one job (or with one job per complete node), you will be able to connect to them using ssh. In the case of nodes that are not completely owned within a job (if you have reserved only part of the nodes, or if you have multiple jobs on the same nodes), you will have to use the oarsh connector to go from one node to the other. The connector supports the same options as the classical ssh command, so it can be used as a replacement for software expecting ssh.
Note: To take advantage of several nodes and distribute work between them, a good option is GNU_Parallel.
oarsh is a wrapper around ssh that enables the tracking of user jobs inside compute nodes (for example, to enforce the correct sharing of resources when two different jobs share a compute node). If your application does not support choosing a different connector, be sure to reserve nodes entirely (which is the default with oarsub) to be able to use ssh.
- Selecting nodes from a specific cluster or cluster type
- Reserve nodes from a specific cluster
- Reserve nodes in the production queue
- Reserve nodes from an exotic cluster type
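Examples for each of these three cases, in order (the cluster name gros is illustrative; use a -p property for the cluster, -q for the queue, and -t for the job type):

fnancy:~$ oarsub -p "cluster='gros'" -l host=2 -I
fnancy:~$ oarsub -q production -l host=2 -I
fnancy:~$ oarsub -t exotic -l host=1 -I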
Clusters with the exotic type either have a non-x86 architecture, or are specific enough to warrant this type. Resources with an exotic type are never selected by default by OAR. Using -t exotic is required to obtain such resources.
The type of a cluster can be identified on the Hardware pages, see for instance Lyon:Hardware.
Warning: When using the …
- Selecting specific nodes
If you know the exact node you want to reserve, you can specify the hostname of the node you require:
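For example (the node name is illustrative):

fnancy:~$ oarsub -p "host='gros-42.nancy.grid5000.fr'" -I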
If you want several specific nodes, you can use a list:
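For example:

fnancy:~$ oarsub -p "host IN ('gros-42.nancy.grid5000.fr', 'gros-43.nancy.grid5000.fr')" -I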
- Using OAR properties
The OAR nodes database contains a set of properties for each node, and the -p option actually filters based on these properties (examples are given after this list):
- Nodes with Infiniband FDR interfaces:
- Nodes with power sensors and GPUs:
- Nodes with 2 GPUs:
- Nodes with a specific CPU model:
- Since -p accepts SQL, you can write advanced queries:
fnancy:~$ oarsub -p "wattmeter=YES AND host NOT IN (graffiti-41, graffiti-42)" -l host=5,walltime=2 -I
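Possible commands for the property filters listed above (the property names such as ib, wattmeter, gpu_count and cputype, and their values, are indicative; check the site's Monika page for the exact list):

fnancy:~$ oarsub -p "ib='FDR'" -l host=1 -I
fnancy:~$ oarsub -p "wattmeter='YES' AND gpu_count > 0" -l host=1 -I
fnancy:~$ oarsub -p "gpu_count = 2" -l host=1 -I
fnancy:~$ oarsub -p "cputype = 'Intel Xeon Gold 6130'" -l host=1 -I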
The OAR properties available on each site are listed on the Monika pages linked from Status (example page for Nancy). The full list of OAR properties is available on this page.
Note: Since this uses SQL syntax, quoting is important! Use double quotes to enclose the whole query, and single quotes to write strings within the query.
Advanced job management topics (specific to Grid'5000)
- Reservations in advance
By default, oarsub will give you resources as soon as possible: once submitted, your request enters a queue. This is good for non-interactive work (when you do not care when exactly it will be scheduled), or when you know that the resources are available immediately.
You can also reserve resources at a specific time in the future, typically to perform large reservations over nights and week-ends, with the -r parameter:
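For example, to reserve 3 hosts for 8 hours starting at a given date (all values are illustrative):

fnancy:~$ oarsub -l host=3,walltime=8 -r '2024-03-29 19:00:00'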
Note: Remember that all your resource reservations must comply with the Usage Policy. You can verify your reservations' compliance with the Policy with …
- Extending the duration of a reservation
Provided that the resources are still available after your job, you can extend its duration (walltime) using e.g.:
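For example, a request of this form:

fnancy:~$ oarwalltime 12345 +1:30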
This will request to add one hour and a half to job 12345.
For more details, see the oarwalltime section of the Advanced OAR tutorial.
Using nodes in the default environment
When you run oarsub, you gain access to physical nodes with a default (standard) software environment. This is a Debian-based system that is regularly updated by the technical team. It comes with many pre-installed software packages.
Environment module
- `module avail` : list all available modules
- `module load module_name` : load `module_name` and its dependencies
- `module list` : list all loaded modules
- `module purge` : unload all loaded modules
fgrenoble:~$ oarsub -I
node:~$ python --version
node:~$ module load python/3.8.12_gcc-10.2.0
node:~$ python --version
See https://www.grid5000.fr/w/Modules
Guix
Guix is available on frontends and nodes (using the standard environment, or the deployed -nfs and -big environments) directly through the guix command.
For example:
$ guix install hello
See https://www.grid5000.fr/w/Guix
Docker
- To use docker:
fgrenoble:~$ oarsub -I
node:~$ g5k-setup-docker
node:~$ docker run hello-world
- To use a GPU from a Docker container with the NVIDIA Container Toolkit, use `g5k-setup-nvidia-docker` instead of `g5k-setup-docker`.
Singularity
- To use Singularity:
fgrenoble:~$ oarsub -I
node:~$ singularity run docker://hello-world
- To use GPU with Singularity:
fnancy:~$ oarsub -q production -I -l gpu=1
node:~$ singularity run --nv docker://tensorflow/tensorflow:latest-gpu
Conda
- Load conda module and activate bash completion
- Create an environment
- Load this environment
- Install tensorflow package
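A minimal sketch of these steps (the environment and package names are illustrative, the exact module name may differ, and the bash completion setup is omitted):

fnancy:~$ module load conda
fnancy:~$ conda create -y -n myenv python=3
fnancy:~$ conda activate myenv
fnancy:~$ conda install -y tensorflow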
See https://www.grid5000.fr/w/Conda
Becoming root with sudo-g5k
On HPC clusters, users typically don't have root access. However, Grid'5000 allows more flexibility: if you need to install additional system packages or to customize the system, it is possible to become root. The tool to do this is called sudo-g5k.
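For example, assuming sudo-g5k accepts a command to run as root in the same way as sudo (the package name is illustrative):

fnancy:~$ oarsub -I
node:~$ sudo-g5k apt-get install -y htop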
Note: Using sudo-g5k has a cost for the platform: at the end of your job, the node needs to be completely reinstalled so that it is clean for the next user. It is best to avoid running sudo-g5k in very short jobs.
Additional storage
Storage Resource | Backups? | Protocol used | Persistence period | Provisioning mechanism | Network connectivity |
---|---|---|---|---|---|
/home | No | NFS | long-term | Quota + User Acct mgmt | Variable (1Gb/s - 10 Gb/s) |
Group Storage | No | NFS | long-term | Manual | Variable (1Gb/s - 10 Gb/s) |
On node local disks reservation | No | - | medium-term | OAR | - |
/tmp | No | - | short-term (job) | OAR job | - |
Local disks | No | - | short-term (job) | OAR job | - |
More information on storage is available here.
Home directory
The home directory is a network filesystem (NFS): data in your home directory is not actually stored on the node itself, it is stored on a storage server managed by the Grid'5000 team. In particular, it means that all reserved nodes share the same home directory, and it is also shared with the site frontend. For example, you can compile or install software in your home (possibly using pip, virtualenv), and it will be usable on all your nodes.
Note: The home directory is only shared within a site. Two nodes from different sites will not have access to the same home.
Group storage
This service provides large storage spaces, possibly shared between multiple Grid'5000 users. Those storage spaces are accessible on Grid'5000 over NFS.
For instance:
- Production queue in Nancy
The data needed for experiments of the production teams is stored on:
- talc-data (3 volumes - talc, talc2 and talc3 - respectively providing 58T + 58T + 71T = 187T of storage space) is a storage server dedicated to the multispeech research team, but compatible with the Group Storage mechanisms.
- talc-data2 (213T of storage space) is a regular Group Storage server, and talc-data
Please remember that these data are hosted on an NFS server, which is not recommended for compute usage.
Local storage
Some nodes have additional local disks, see Hardware#Storage for a list of available disks for each cluster.
There are two ways to access these local disks:
- On some clusters, local disks need to be reserved to be accessible. See Disk reservation for a list of these clusters and for documentation on the reservation process.
- On other clusters, local disks can be used directly. In this case, jump directly to Using local disks.
In both cases, the disks are simply provided as raw devices, and it is the responsibility of the user to partition them and create a filesystem. Note that there may still be partitions and filesystems present from a previous job.
/tmp
The /tmp/ directory is stored on a local disk of the node. Use this directory if you need to access data locally.
The size of /tmp differs from node to node. It is equal to the total size of the (first) local disk minus 75 GB (which is reserved for the operating system).
Deploying your nodes to get root access and create your own experimental environment
Note: There is a tool, called sudo-g5k (described above), that gives you root access on the standard environment without deploying your own system.
Deploying a system on nodes with Kadeploy
Reserve one node (the deploy job type is required to allow deployment with Kadeploy):
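For example:

fnancy:~$ oarsub -I -t deploy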
Start a deployment of the debian11-base environment on that node (this takes 5 to 10 minutes):
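A typical invocation looks like this (depending on the Kadeploy version installed, you may instead need to pass the environment name with the -e option):

fnancy:~$ kadeploy3 debian11-base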
By default, all the nodes of the reservation are deployed. Alternatively, you can use -m to specify a node (such as -m gros-42.nancy.grid5000.fr).
Kadeploy copies your SSH key from ~/.ssh/authorized_keys to the node's root account after deployment, so that you can connect without a password. You may want to use another SSH key with -k (such as -k ~/custom_authorized_keys).
Finally, connect to the node as root:
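For example (reusing the illustrative node name from above):

fnancy:~$ ssh root@gros-42.nancy.grid5000.fr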
On Grid'5000 reference environments
Grid'5000 reference environments are named according to the following scheme: OSversion-architecture-variant.
- OSversion is the OS distribution name and version, for instance debian11 (Debian 11 "Bullseye", released on 08/2021), ubuntu2004 (Ubuntu 20.04 "Focal", released on 04/2020), or centos8 (CentOS 8, a clone of RHEL 8, released on 09/2019).
- variant defines the set of features included in the environment, as follows:
Variants differ in the Grid'5000-specific tuning for performance (e.g., TCP buffers for 10 GbE), the installed tools (standard system utilities*, common utilities**, scientific software available via module, packages available via Guix, advanced packages***), the accessible network storage, and hypervisor support:

Variant | OS available | Notes
---|---|---
min | Debian; Ubuntu, CentOS, etc. |
base | Debian |
nfs | Debian; Ubuntu, CentOS, etc. | Support for mounting your home and group storage, and for using your Grid'5000 user account on the node
big | Debian |
xen | Debian | Xen hypervisor Dom0 + minimal DomU
* Including SSH server and network drivers.
** Including among others: Python, Ruby, curl, git, vim, etc.
*** Packages for development, system tools, editors and shells.
The list of all supported environments is available by running kaenv3 on any frontend. Note that environments are versioned: old versions can be listed using the kaenv3 -l -s command, and a former version can be retrieved and used by adding the --env-version YYYYMMDDHH option to the kaenv3 or kadeploy3 commands (also see the man pages). This can be useful to reproduce experiments months or years later, using a previous version of an environment. On some sites, environments exist for different architectures (x86_64, ppc64le and aarch64). The full list can be found on the Advanced Kadeploy page.
The Grid'5000 reference environments are built with the kameleon tool, from recipes detailing the whole construction process, and are updated on a regular basis (see versions). See the Environment creation page for details.
Using Grid'5000 efficiently
Until now, you have been logging in and submitting jobs to Grid'5000 manually. This way of working is convenient for learning, prototyping, and exploring ideas, but it may quickly become tedious when it comes to performing a set of experiments on a daily basis. In order to be more efficient and user-friendly, Grid'5000 also supports more convenient ways of submitting jobs, such as API requests and computational notebooks.
Notebooks
Grid'5000 also supports Jupyter notebooks and Jupyter Lab servers. Jupyter Lab servers provide you with a simple web interface to submit jobs on Grid'5000 and run Python notebooks. Using notebooks allows you to track the evolution of your experiments during the exploratory phase while scripting part of your process.
You can find more information about Jupyter Lab and python notebooks on the Notebooks page.
Scripting libraries
Several scripting libraries built on top of the Grid'5000 API are available:
- Execo offers a Python API for asynchronous control of local or remote, standalone or parallel, unix processes. It is especially well suited for quickly and easily scripting workflows of parallel/distributed operations on local or remote hosts: automate a scientific workflow, conduct computer science experiments, perform automated tests, etc. The core python package is execo. The execo_g5k package provides a set of tools and extensions for the Grid5000 testbed. The execo_engine package provides tools to ease the development of computer science experiments.
- Ruby-Cute is a set of Commonly Used Tools for Experiments, or Critically Useful Tools for Experiments, depending on who you ask. It is a library aggregating various Ruby snippets useful in the context of (but not limited to) development of experiment software on distributed systems testbeds such as Grid'5000. Ruby-Cute is structured in different modules. The G5K module allows you to communicate with Grid'5000. The Net::SSH::Multi module allows the parallel execution of commands on several remote machines using the SSH protocol. The TakTuk module is a wrapper around the taktuk parallel command executor.
- Funk helps you to find resources for your experiments, by:
- giving you the number of nodes available at a given date and for a given walltime
- finding the slots for a combination of resources and a walltime
- finding the slot with the maximum number of nodes for a period and a walltime
- managing the reservation of the resources
- EnOSlib is a Python library that factors out common experiment practices, especially when dealing with distributed application deployments. EnOSlib uses different providers to get resources from an infrastructure. For instance, on Grid'5000 one can easily get a physical environment (non-deploy job/deploy job, with or without multiple NICs configured) or a virtualized environment (e.g. based on KVM virtual machines). Resources are configured using safe parallel actions (based on Ansible Modules) or using off-the-shelf packaged applications (e.g. a monitoring stack, a distributed network packet sniffer).
- Dask-jobqueue is a Python library which makes it easy to deploy Dask on common job queuing systems typically found in high performance supercomputers, academic research institutions, and other clusters. It can be used to facilitate moving between different resource managers, for example those based on OAR and Slurm.
Going further
In this tutorial, you learned the basics of Grid'5000:
- The general structure of Grid'5000, and how to move between sites
- How to manage your data (one NFS server per site; remember: it is not backed up)
- How to find and reserve resources using OAR and the oarsub command
- How to get root access on nodes using Kadeploy and the kadeploy3 command
You should now be ready to use Grid'5000.
Additional tutorials
There are many more tutorials available on the Users Home page. Please have a look at the page to continue learning how to use Grid'5000.