Grid5000:Home: Difference between revisions

Revision as of 10:29, 26 October 2023

Grid'5000

Grid'5000 is a large-scale and flexible testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing, including Cloud, HPC, Big Data and AI.

Key features:

provides access to a large pool of resources: 15000 cores and 800 compute nodes grouped in homogeneous clusters, featuring various technologies: PMEM, GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path
highly reconfigurable and controllable: researchers can experiment with a fully customized software stack thanks to bare-metal deployment features, and can isolate their experiment at the networking layer
advanced monitoring and measurement features for collecting network and power consumption traces, providing a deep understanding of experiments
designed to support Open Science and reproducible research, with full traceability of infrastructure and software changes on the testbed
a vibrant community of 500+ users supported by a solid technical team

Read more about our teams, our publications, and the usage policy of the testbed. Then get an account, and learn how to use the testbed with our Getting Started tutorial and the rest of our Users portal.

Grid'5000 is merging with FIT to build the SILECS Infrastructure for Large-scale Experimental Computer Science. Read an Introduction to SILECS (April 2018)

Recently published documents and presentations:

Presentation of Grid'5000 (April 2019)
Report from the Grid'5000 Science Advisory Board (2018)

Older documents:

Grid'5000 is supported by a scientific interest group (GIS) hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations. Inria has been supporting Grid'5000 through ADT ALADDIN-G5K (2007-2013), ADT LAPLACE (2014-2016), and IPL HEMERA (2010-2014).

Current status (at 2025-07-02 01:14): 1 current event, 7 planned (details)

Random pick of publications

Five random publications that benefited from Grid'5000 (at least 2777 overall):

Lucian Trestioreanu, Flaviene Scheidt, Wazen M. Shbair, Jerome Francois, Damien Magoni, et al.. To Squelch or not to Squelch: Enabling Improved Message Dissemination on the XRP Ledger. 37th IEEE/IFIP Network Operations and Management Symposium (NOMS 2024), May 2024, Séoul, South Korea. hal-04621124 view on HAL pdf
Rahma Hellali, Zaineb Chelly Dagdia, Karine Zeitouni. A Multi-Objective Multi-Agent Interactive Deep Reinforcement Learning Approach for Feature Selection. International conference on neural information processing, Dec 2024, Auckland, New Zealand. pp.15. hal-04723314 view on HAL pdf
Danilo Carastan-Santos, Georges da Costa, Igor Fontana de Nardin, Millian Poquet, Krzysztof Rzadca, et al.. Scheduling with lightweight predictions in power-constrained HPC platforms. 2024. hal-04747713 view on HAL pdf
Eva Giboulot, Teddy Furon. WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off. NeurIPS 2024 - 38th Conference on Neural Information Processing Systems, Dec 2024, Vancouver, Canada. pp.1-34. hal-04766606 view on HAL pdf
Fatmir Asllanaj, Sylvain Contassot-Vivier, Guilherme C Fraga, Francis H.R. França, Roberta J.C. da Fonseca. New gas radiation model of high accuracy based on the principle of weighted sum of gray gases. Journal of Quantitative Spectroscopy and Radiative Transfer, 2024, 315, pp.108887. 10.1016/j.jqsrt.2023.108887. hal-04375649 view on HAL pdf

Latest news

Cluster "vianden" is now in the default queue in Luxembourg

We are pleased to announce that the vianden[1] cluster of Luxembourg is now available in the default queue.

Vianden is a cluster of a single node with 8 MI300X AMD GPUs.

The node features:

2x CPU Intel Xeon Platinum 8470 (52 cores/CPU, 2 threads/core)

8x AMD Instinct MI300X (192 GiB each)

2 TiB RAM

1x 3.2 TB NVMe disk

The AMD MI300X GPUs are not supported by the default Grid'5000 system (Debian 11). However, full GPU functionality can be unlocked by deploying the ubuntu2404-rocm environment:

fluxembourg$ oarsub -t exotic -t deploy -p vianden -I

fluxembourg$ kadeploy3 -m vianden-1 ubuntu2404-rocm

More information is available on the Exotic page.

This cluster was funded by the University of Luxembourg.

[1] https://www.grid5000.fr/w/Luxembourg:Hardware#vianden

-- Grid'5000 Team 11:30, 27 June 2025 (CEST)

Cluster "hydra" is now in the default queue in Lyon

We are pleased to announce that the hydra[1] cluster of Lyon is now available in the default queue.

As a reminder, Hydra is a cluster composed of 4 NVIDIA Grace-Hopper servers[2].

Each node features:

1 Nvidia Grace ARM64 CPU with 72 cores (Neoverse-V2)

1 Nvidia Hopper GPU

512GB LPDDR5 memory

96GB HBM memory

1x 1 TB NVMe SSD + 1x 1.92 TB SCSI disk

Due to its bleeding-edge hardware, the usual Grid'5000 environments are not supported by default on this cluster: Hydra requires system environments featuring a Linux kernel >= 6.6.

The default system on the hydra nodes is based on Debian 11, but it does not provide functional GPUs. However, users may deploy the ubuntugh2404-arm64-big environment, which is similar to the official Nvidia image provided for this machine and provides GPU support.

To submit a job on this cluster, the following command may be used:

oarsub -t exotic -p hydra
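To get GPU support on hydra, the two-step pattern shown for vianden (reserve with the deploy job type, then deploy an environment) would presumably apply as well. The following is a minimal dry-run sketch under that assumption: it only prints the commands, so it runs without OAR or Kadeploy installed. The helper name g5k_gpu_env_cmds is hypothetical, and the node name hydra-1 is an assumption modelled on vianden-1; use the node actually assigned to your job.

```shell
#!/bin/sh
# Dry-run sketch: prints the reservation and deployment commands for an
# exotic cluster instead of executing them.
# g5k_gpu_env_cmds is a hypothetical helper, not a Grid'5000 tool.

g5k_gpu_env_cmds() {
    cluster="$1"   # cluster name, e.g. hydra
    node="$2"      # reserved node (hydra-1 below is an assumed name)
    env_name="$3"  # environment to deploy on that node

    # Interactive job of type "deploy" on an exotic cluster,
    # mirroring the vianden example.
    printf 'oarsub -t exotic -t deploy -p %s -I\n' "$cluster"

    # Deployment of the chosen environment onto the reserved node.
    printf 'kadeploy3 -m %s %s\n' "$node" "$env_name"
}

g5k_gpu_env_cmds hydra hydra-1 ubuntugh2404-arm64-big
```

The printed commands are meant to be run from the Lyon frontend, the second one from inside the interactive job opened by the first. Calling the helper with vianden, vianden-1 and ubuntu2404-rocm prints the commands from the vianden announcement.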

This cluster is funded by Inria and by the Laboratoire de l'Informatique du Parallélisme, with support from ENS Lyon.

[1] Hydra is the largest of the modern constellations according to Wikipedia: https://en.wikipedia.org/wiki/Hydra_(constellation)

[2] https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/

-- Grid'5000 Team 16:42, 12 June 2025 (CEST)

Cluster "estats" (Jetson nodes in Toulouse) is now kavlan capable

The network topology of the estats Jetson nodes can now be configured, just like for other clusters.

More info in the Network reconfiguration tutorial.

-- Grid'5000 Team 18:25, 21 May 2025 (CEST)

Cluster "chirop" is now in the default queue of Lille with energy monitoring

Dear users,

We are pleased to announce that the Chirop[1] cluster of Lille is now available in the default queue.

This cluster consists of 5 HPE DL360 Gen10+ nodes with:

2 CPU Intel Xeon Platinum 8358 (32 cores per CPU)

512 GiB memory

1x 1.92 TB NVMe SSD + 2x 3.84 TB SSD

2x 25 Gbps Ethernet interfaces

Energy monitoring[2] is also available for this cluster[3], provided by newly installed wattmeters (similar to those already available at Lyon).

This cluster was funded by CPER CornelIA.

[1] https://www.grid5000.fr/w/Lille:Hardware#chirop

[2] https://www.grid5000.fr/w/Energy_consumption_monitoring_tutorial

[3] https://www.grid5000.fr/w/Monitoring_Using_Kwollect#Metrics_available_in_Grid.275000

-- Grid'5000 Team 16:25, 05 May 2025 (CEST)

Grid'5000 sites

Current funding

Since June 2008, Inria has been the main contributor to Grid'5000 funding.

INRIA

CNRS

Universities

IMT Atlantique
Université Grenoble Alpes, Grenoble INP
Université Rennes 1, Rennes
Institut National Polytechnique de Toulouse / INSA / FERIA / Université Paul Sabatier, Toulouse
Université Bordeaux 1, Bordeaux
Université Lille 1, Lille
École Normale Supérieure, Lyon

Regional councils

Aquitaine
Auvergne-Rhône-Alpes
Bretagne
Champagne-Ardenne
Provence Alpes Côte d'Azur
Hauts de France
Lorraine

@@ Line 39: / Line 39: @@
 ==Latest news==
-<rss max=4 item-max-length="2000">https://www.grid5000.fr/w?title=News&action=feed&feed=atom</rss>
+<rss max=4 item-max-length="2000">https://www.grid5000.fr/rss/G5KNews.php</rss>
 ----
 [[News|Read more news]]