Sophia:Hardware: Difference between revisions
No edit summary |
Lpouilloux (talk | contribs) No edit summary |
||
Line 12: | Line 12: | ||
* 8 SSDs and 12 HDDs on nodes (total: 50.77 TB) | * 8 SSDs and 12 HDDs on nodes (total: 50.77 TB) | ||
* 45.9 TFLOPS (excluding GPUs) | * 45.9 TFLOPS (excluding GPUs) | ||
{|width="100%" border="0" | |||
|- | |||
| width="50%" valign="top" | | |||
{| width="100%" | |||
=== Default queue resources === | |||
* 1 cluster | |||
* 4 nodes | |||
* 48 CPU cores | |||
* 384 GiB RAM | |||
* 4 HDDs on nodes (total: 1.0 TB) | |||
* 0.6 TFLOPS (excluding GPUs) | |||
|} | |||
| width="50%" valign="top" | | |||
{| width="100%" | |||
=== Production queue resources === | |||
* 1 cluster | |||
* 6 nodes | |||
* 288 CPU cores | |||
* 12 GPUs | |||
* 175104 GPU cores | |||
* 3.0 TiB RAM | |||
* 6 SSDs and 0 HDDs on nodes (total: 38.41 TB) | |||
* 40.1 TFLOPS (excluding GPUs) | |||
|} | |||
|} | |||
= Clusters summary = | = Clusters summary = | ||
Line 35: | Line 60: | ||
|[[#uvb|uvb]]||||2011-01-04||2011-01-04||4||2||Intel Xeon X5670||6 cores/CPU||x86_64||data-sort-value="103079215104"|96 GiB||data-sort-value="232"|<b>250 GB HDD</b>||data-sort-value="41000"|1 Gbps (SR‑IOV) + 40 Gbps InfiniBand|| | |[[#uvb|uvb]]||||2011-01-04||2011-01-04||4||2||Intel Xeon X5670||6 cores/CPU||x86_64||data-sort-value="103079215104"|96 GiB||data-sort-value="232"|<b>250 GB HDD</b>||data-sort-value="41000"|1 Gbps (SR‑IOV) + 40 Gbps InfiniBand|| | ||
|- | |||
|} | |||
''**: crossed GPUs are not supported by Grid'5000 default environments'' | |||
== Production queue resources == | |||
{|class="wikitable sortable" | |||
|- | |||
!rowspan=2|Cluster | |||
!rowspan=2|Access Condition | |||
!rowspan=2|Date of arrival | |||
!rowspan=2|Manufacturing date | |||
!data-sort-type="number" rowspan=2|Nodes | |||
!colspan=4|CPU | |||
!data-sort-type="number" rowspan=2|Memory | |||
!data-sort-type="number" rowspan=2|Storage | |||
!data-sort-type="number" rowspan=2|Network | |||
!rowspan=2|Accelerators | |||
|- | |||
!data-sort-type="number"|# | |||
!|Name | |||
!data-sort-type="number"|Cores | |||
!|Architecture | |||
|- | |||
|[[#musa|musa]]||<b>[[Grid5000:UsagePolicy#Rules_for_the_production_queue|production]]</b> queue||2025-01-16||2024-12-09||6||2||AMD EPYC 9254||24 cores/CPU||x86_64||data-sort-value="549755813888"|512 GiB||data-sort-value="5961"|<b>6.4 TB SSD</b>||data-sort-value="25000"|25 Gbps ||2 x Nvidia Tesla H100 (94 GiB) | |||
|- | |- | ||
|} | |} | ||
Line 58: | Line 107: | ||
|- | |- | ||
|[[#esterel41|esterel41]]||<b>testing</b> queue||2025-01-25||2024-03-01||1||2||Intel Xeon Gold 6426Y||16 cores/CPU||x86_64||data-sort-value="549755813888"|512 GiB||data-sort-value="3127"|<b>479 GB SSD</b> + 2.88 TB SSD||data-sort-value=" | |[[#esterel41|esterel41]]||<b>testing</b> queue||2025-01-25||2024-03-01||1||2||Intel Xeon Gold 6426Y||16 cores/CPU||x86_64||data-sort-value="549755813888"|512 GiB||data-sort-value="3127"|<b>479 GB SSD</b> + 2.88 TB SSD||data-sort-value="57000"|1 Gbps + 56 Gbps InfiniBand||2 x Nvidia L40 (45 GiB) | ||
|- | |- | ||
|[[#mercantour2|mercantour2]]||<b>testing</b> queue||2025-01-16||2015-09-01||8||2||Intel Xeon E5-2650 v2||8 cores/CPU||x86_64||data-sort-value="274877906944"|256 GiB||data-sort-value="931"|<b>1.0 TB HDD</b>||data-sort-value="41000"|1 Gbps (SR‑IOV) + 40 Gbps InfiniBand|| | |[[#mercantour2|mercantour2]]||<b>testing</b> queue||2025-01-16||2015-09-01||8||2||Intel Xeon E5-2650 v2||8 cores/CPU||x86_64||data-sort-value="274877906944"|256 GiB||data-sort-value="931"|<b>1.0 TB HDD</b>||data-sort-value="41000"|1 Gbps (SR‑IOV) + 40 Gbps InfiniBand|| | ||
|- | |- | ||
|} | |} | ||
Line 103: | Line 150: | ||
* ib0, InfiniBand, configured rate: 40 Gbps, model: Mellanox Technologies MT25408A0-FCC-QI ConnectX, Dual Port 40Gb/s InfiniBand / 10GigE Adapter IC with PCIe 2.0 x8 5.0GT/s Interface, driver: mlx4_core<br /> | * ib0, InfiniBand, configured rate: 40 Gbps, model: Mellanox Technologies MT25408A0-FCC-QI ConnectX, Dual Port 40Gb/s InfiniBand / 10GigE Adapter IC with PCIe 2.0 x8 5.0GT/s Interface, driver: mlx4_core<br /> | ||
* <span style="color:grey">ib1, InfiniBand, model: Mellanox Technologies MT25408A0-FCC-QI ConnectX, Dual Port 40Gb/s InfiniBand / 10GigE Adapter IC with PCIe 2.0 x8 5.0GT/s Interface, driver: mlx4_core - unavailable for experiment</span><br/> | * <span style="color:grey">ib1, InfiniBand, model: Mellanox Technologies MT25408A0-FCC-QI ConnectX, Dual Port 40Gb/s InfiniBand / 10GigE Adapter IC with PCIe 2.0 x8 5.0GT/s Interface, driver: mlx4_core - unavailable for experiment</span><br/> | ||
|- | |||
|} | |||
= Clusters in the [https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg-prod/ production queue] = | |||
== [https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg-prod/?filter=musa%20only musa] == | |||
'''6 nodes, 12 cpus, 288 cores''' ([https://public-api.grid5000.fr/stable/sites/sophia/clusters/musa/nodes.json?pretty=1 json]) | |||
'''Reservation example:''' | |||
{{Term|location=fsophia|cmd=<code class="command">oarsub</code> <code class="replace">-q production</code> <code class="env">-p musa</code> <code>-I</code>}} | |||
'''Max walltime per node:''' | |||
* musa-[1-6]: 168h | |||
{| | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Access condition:''' | |||
| production queue<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Model:''' | |||
| ProLiant DL385 Gen11<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Manufacturing date:''' | |||
| 2024-12-09<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Date of arrival:''' | |||
| 2025-01-16<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''CPU:''' | |||
| AMD EPYC 9254 (Zen 4), x86_64, 2 CPUs/node, 24 cores/CPU<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Memory:''' | |||
| 512 GiB<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Storage:''' | |||
| disk0, 6.4 TB SSD NVME Samsung MO006400KYDND (dev: <code class="file">/dev/disk0</code>) (primary disk)<br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''Network:''' | |||
| | |||
* eth0/enp1s0f0np0, Ethernet, configured rate: 25 Gbps, model: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller, driver: bnxt_en<br /> | |||
* <span style="color:grey">eth1/ens22f1np1, Ethernet, model: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller, driver: bnxt_en - unavailable for experiment</span><br/> | |||
|- | |||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''GPU:''' | |||
| 2 x Nvidia H100 NVL (94 GiB)<br>Compute capability: 9.0<br/> | |||
|- | |- | ||
|} | |} | ||
Line 147: | Line 240: | ||
* <span style="color:grey">eth2/ens15f2, Ethernet, model: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe, driver: tg3 - unavailable for experiment</span><br /> | * <span style="color:grey">eth2/ens15f2, Ethernet, model: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe, driver: tg3 - unavailable for experiment</span><br /> | ||
* <span style="color:grey">eth3/ens15f3, Ethernet, model: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe, driver: tg3 - unavailable for experiment</span><br /> | * <span style="color:grey">eth3/ens15f3, Ethernet, model: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe, driver: tg3 - unavailable for experiment</span><br /> | ||
* ibs3, InfiniBand, configured rate: | * ibs3, InfiniBand, configured rate: 56 Gbps, model: Mellanox Technologies MT28908 Family [ConnectX-6], driver: mlx5_core<br/> | ||
|- | |- | ||
| valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''GPU:''' | | valign="top" style="background-color: #f9f9f9; padding: 0px 10px 0px 3px;" |'''GPU:''' | ||
Line 227: | Line 320: | ||
* <span style="color:grey">ib1, InfiniBand, model: Mellanox Technologies MT25408A0-FCC-QI ConnectX, Dual Port 40Gb/s InfiniBand / 10GigE Adapter IC with PCIe 2.0 x8 5.0GT/s Interface, driver: mlx4_core - unavailable for experiment</span><br/> | * <span style="color:grey">ib1, InfiniBand, model: Mellanox Technologies MT25408A0-FCC-QI ConnectX, Dual Port 40Gb/s InfiniBand / 10GigE Adapter IC with PCIe 2.0 x8 5.0GT/s Interface, driver: mlx4_core - unavailable for experiment</span><br/> | ||
|- | |- | ||
|}''<small>Last generated from the Grid'5000 Reference API on 2025-02-10 ([https://gitlab.inria.fr/grid5000/reference-repository/commit/361f8f4419 commit 361f8f4419])</small>'' | |||
|}''<small>Last generated from the Grid'5000 Reference API on 2025- |
Revision as of 16:37, 10 February 2025
Hardware: Global | Grenoble | Lille | Louvain | Luxembourg | Lyon | Nancy | Nantes | Rennes | Sophia | Strasbourg | Toulouse
See also: Network topology for Sophia
Summary
- 4 clusters
- 19 nodes
- 496 CPU cores
- 14 GPUs
- 211456 GPU cores
- 5.88 TiB RAM
- 8 SSDs and 12 HDDs on nodes (total: 50.77 TB)
- 45.9 TFLOPS (excluding GPUs)
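The node, core, GPU and RAM totals above can be cross-checked against the per-cluster figures listed in the cluster tables on this page. The following is a minimal sketch, not the generator used by the Reference API; the `Cluster` class and the `site_totals` function are hypothetical names introduced only for illustration, and the per-cluster values are copied from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Per-cluster figures copied from the tables on this page."""
    name: str
    nodes: int
    cpus_per_node: int
    cores_per_cpu: int
    ram_gib_per_node: int
    gpus_per_node: int


CLUSTERS = [
    Cluster("uvb", 4, 2, 6, 96, 0),
    Cluster("musa", 6, 2, 24, 512, 2),
    Cluster("esterel41", 1, 2, 16, 512, 2),
    Cluster("mercantour2", 8, 2, 8, 256, 0),
]


def site_totals(clusters):
    """Aggregate per-cluster figures into site-wide totals."""
    ram_gib = sum(c.nodes * c.ram_gib_per_node for c in clusters)
    return {
        "clusters": len(clusters),
        "nodes": sum(c.nodes for c in clusters),
        "cores": sum(c.nodes * c.cpus_per_node * c.cores_per_cpu for c in clusters),
        "gpus": sum(c.nodes * c.gpus_per_node for c in clusters),
        "ram_gib": ram_gib,
        "ram_tib": ram_gib / 1024,  # 1 TiB = 1024 GiB
    }


if __name__ == "__main__":
    for key, value in site_totals(CLUSTERS).items():
        print(f"{key}: {value}")
```

Under these figures the sketch yields 4 clusters, 19 nodes, 496 CPU cores, 14 GPUs and 6016 GiB of RAM (5.875 TiB, shown as 5.88 TiB in the summary).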
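Default queue resources
- 1 cluster
- 4 nodes
- 48 CPU cores
- 384 GiB RAM
- 4 HDDs on nodes (total: 1.0 TB)
- 0.6 TFLOPS (excluding GPUs)
Production queue resources
- 1 cluster
- 6 nodes
- 288 CPU cores
- 12 GPUs
- 175104 GPU cores
- 3.0 TiB RAM
- 6 SSDs and 0 HDDs on nodes (total: 38.41 TB)
- 40.1 TFLOPS (excluding GPUs)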
Clusters summary
Default queue resources
Cluster | Access Condition | Date of arrival | Manufacturing date | Nodes | CPU # | CPU Name | CPU Cores | CPU Architecture | Memory | Storage | Network | Accelerators |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uvb | | 2011-01-04 | 2011-01-04 | 4 | 2 | Intel Xeon X5670 | 6 cores/CPU | x86_64 | 96 GiB | 250 GB HDD | 1 Gbps (SR‑IOV) + 40 Gbps InfiniBand | |
**: crossed GPUs are not supported by Grid'5000 default environments
Production queue resources
Cluster | Access Condition | Date of arrival | Manufacturing date | Nodes | CPU # | CPU Name | CPU Cores | CPU Architecture | Memory | Storage | Network | Accelerators |
---|---|---|---|---|---|---|---|---|---|---|---|---|
musa | production queue | 2025-01-16 | 2024-12-09 | 6 | 2 | AMD EPYC 9254 | 24 cores/CPU | x86_64 | 512 GiB | 6.4 TB SSD | 25 Gbps | 2 x Nvidia Tesla H100 (94 GiB) |
**: crossed GPUs are not supported by Grid'5000 default environments
Testing queue resources
Cluster | Access Condition | Date of arrival | Manufacturing date | Nodes | CPU # | CPU Name | CPU Cores | CPU Architecture | Memory | Storage | Network | Accelerators |
---|---|---|---|---|---|---|---|---|---|---|---|---|
esterel41 | testing queue | 2025-01-25 | 2024-03-01 | 1 | 2 | Intel Xeon Gold 6426Y | 16 cores/CPU | x86_64 | 512 GiB | 479 GB SSD + 2.88 TB SSD | 1 Gbps + 56 Gbps InfiniBand | 2 x Nvidia L40 (45 GiB) |
mercantour2 | testing queue | 2025-01-16 | 2015-09-01 | 8 | 2 | Intel Xeon E5-2650 v2 | 8 cores/CPU | x86_64 | 256 GiB | 1.0 TB HDD | 1 Gbps (SR‑IOV) + 40 Gbps InfiniBand | |
**: crossed GPUs are not supported by Grid'5000 default environments
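In the wikitext of these tables, the Memory and Network cells carry a `data-sort-value` attribute so that the sortable columns order numerically. The listed values are consistent with memory expressed in bytes and network rates summed in Mbps. The sketch below reproduces those values under that reading; it is an inference from the numbers on this page, not the documented behaviour of the generator, and both function names are hypothetical.

```python
GIB = 2 ** 30  # bytes in one GiB


def memory_sort_value(gib):
    """Memory size in bytes, e.g. 512 GiB -> 549755813888."""
    return gib * GIB


def network_sort_value(rates_gbps):
    """Sum of the interface rates in Mbps, e.g. 1 Gbps + 40 Gbps -> 41000."""
    return sum(int(rate * 1000) for rate in rates_gbps)


if __name__ == "__main__":
    print(memory_sort_value(96))         # uvb memory
    print(memory_sort_value(512))        # musa and esterel41 memory
    print(network_sort_value([1, 40]))   # uvb and mercantour2 network
    print(network_sort_value([1, 56]))   # esterel41 network
```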
Clusters in the default queue
uvb
4 nodes, 8 cpus, 48 cores (json)
Reservation example:
Model: | Dell PowerEdge C6100 |
Manufacturing date: | 2011-01-04 |
Date of arrival: | 2011-01-04 |
CPU: | Intel Xeon X5670 (Westmere), x86_64, 2.93GHz, 2 CPUs/node, 6 cores/CPU |
Memory: | 96 GiB |
Storage: | disk0, 250 GB HDD SATA Western Digital WDC WD2502ABYS-1 (dev: /dev/disk0) (primary disk) |
Network: |
|
Clusters in the production queue
musa
6 nodes, 12 cpus, 288 cores (json)
Reservation example: fsophia: oarsub -q production -p musa -I
Max walltime per node:
- musa-[1-6]: 168h
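The reservation example for musa in this page's wikitext is `oarsub -q production -p musa -I`, and the maximum walltime is 168h. The sketch below builds such a command line and rejects a walltime above that limit before submission. The helper `oarsub_command` and the `MAX_WALLTIME_HOURS` table are hypothetical, and the `-l walltime=H:MM:SS` option is an assumption about how the walltime would be passed; it does not appear on this page.

```python
import shlex

# Limits copied from this page; only musa lists a maximum walltime.
MAX_WALLTIME_HOURS = {"musa": 168}


def oarsub_command(cluster, queue, walltime_hours, interactive=True):
    """Return an oarsub command line for one cluster, as a shell-safe string."""
    limit = MAX_WALLTIME_HOURS.get(cluster)
    if limit is not None and walltime_hours > limit:
        raise ValueError(
            f"walltime {walltime_hours}h exceeds the {limit}h limit for {cluster}"
        )
    argv = [
        "oarsub",
        "-q", queue,
        "-p", cluster,
        # Assumption: walltime is requested with -l walltime=H:MM:SS.
        "-l", f"walltime={walltime_hours}:00:00",
    ]
    if interactive:
        argv.append("-I")
    return shlex.join(argv)


if __name__ == "__main__":
    print(oarsub_command("musa", "production", 2))
```

Checking the limit locally only saves a round trip; the scheduler remains the authority on what is accepted.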
Access condition: | production queue |
Model: | ProLiant DL385 Gen11 |
Manufacturing date: | 2024-12-09 |
Date of arrival: | 2025-01-16 |
CPU: | AMD EPYC 9254 (Zen 4), x86_64, 2 CPUs/node, 24 cores/CPU |
Memory: | 512 GiB |
Storage: | disk0, 6.4 TB SSD NVME Samsung MO006400KYDND (dev: /dev/disk0) (primary disk) |
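Network: |
- eth0/enp1s0f0np0, Ethernet, configured rate: 25 Gbps, model: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller, driver: bnxt_en
- eth1/ens22f1np1, Ethernet, model: Broadcom Inc. and subsidiaries BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller, driver: bnxt_en - unavailable for experiment
|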
GPU: | 2 x Nvidia H100 NVL (94 GiB) Compute capability: 9.0 |
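The (json) link in the musa heading points to https://public-api.grid5000.fr/stable/sites/sophia/clusters/musa/nodes.json?pretty=1. The sketch below builds that URL for any site and cluster and, optionally, downloads the document. The function names are hypothetical; the structure of the returned JSON is not described on this page, so the sketch does not interpret it, and whether the API can be reached without authentication from a given network is left as an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://public-api.grid5000.fr/stable"


def nodes_url(site, cluster, pretty=True):
    """Build the nodes.json URL, following the pattern of the link on this page."""
    url = f"{API_ROOT}/sites/{site}/clusters/{cluster}/nodes.json"
    if pretty:
        url += "?" + urlencode({"pretty": 1})
    return url


def fetch_nodes(site, cluster, timeout=10.0):
    """Download and decode the node description (requires network access)."""
    with urlopen(nodes_url(site, cluster, pretty=False), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(nodes_url("sophia", "musa"))
```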
Clusters in the testing queue
esterel41
1 node, 2 cpus, 32 cores (json)
Reservation example:
Access condition: | testing queue |
Model: | DL380 Gen11 |
Manufacturing date: | 2024-03-01 |
Date of arrival: | 2025-01-25 |
CPU: | Intel Xeon Gold 6426Y (Sapphire Rapids), x86_64, 2 CPUs/node, 16 cores/CPU |
Memory: | 512 GiB |
Storage: |
|
Network: |
|
GPU: | 2 x Nvidia L40 (45 GiB) Compute capability: 8.9 |
mercantour2
8 nodes, 16 cpus, 128 cores, split as follows due to differences between nodes (json)
Reservation example:
- mercantour2-[1-6,8] (7 nodes, 14 cpus, 112 cores)
Access condition: | testing queue |
Model: | Dell PowerEdge C6220 II |
Manufacturing date: | 2015-09-01 |
Date of arrival: | 2025-01-16 |
CPU: | Intel Xeon E5-2650 v2 (Ivy Bridge), x86_64, 2.60GHz, 2 CPUs/node, 8 cores/CPU |
Memory: | 256 GiB |
Storage: | disk0, 1.0 TB HDD SATA Seagate ST1000NM0033-9ZM (dev: /dev/disk0) (primary disk) |
Network: |
|
- mercantour2-7 (1 node, 2 cpus, 16 cores)
Access condition: | testing queue |
Model: | Dell PowerEdge C6220 II |
Manufacturing date: | 2015-09-01 |
Date of arrival: | 2025-01-16 |
CPU: | Intel Xeon E5-2650 v2 (Ivy Bridge), x86_64, 2.60GHz, 2 CPUs/node, 8 cores/CPU |
Memory: | 256 GiB |
Storage: | disk0, 1.0 TB HDD SATA Toshiba TOSHIBA MG03ACA1 (dev: /dev/disk0) (primary disk) |
Network: |
|
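Node groups on this page use a bracketed range notation such as mercantour2-[1-6,8] or musa-[1-6]. The sketch below expands that notation into individual node names, which is convenient when checking the per-group node counts. `expand_nodes` is a hypothetical helper written for this page's notation only; it is not part of any Grid'5000 tool.

```python
import re

# Matches "<cluster>-[<ranges>]", e.g. "mercantour2-[1-6,8]".
_RANGE_SPEC = re.compile(r"([A-Za-z0-9_]+)-\[([0-9,\-]+)\]")


def expand_nodes(spec):
    """Expand 'name-[a-b,c]' into ['name-a', ..., 'name-b', 'name-c']."""
    match = _RANGE_SPEC.fullmatch(spec)
    if match is None:
        # A plain node name such as "mercantour2-7" is returned unchanged.
        return [spec]
    prefix, body = match.groups()
    names = []
    for part in body.split(","):
        low, _, high = part.partition("-")
        # A single index like "8" has no upper bound, so reuse the lower one.
        for index in range(int(low), int(high or low) + 1):
            names.append(f"{prefix}-{index}")
    return names


if __name__ == "__main__":
    print(expand_nodes("mercantour2-[1-6,8]"))
    print(expand_nodes("mercantour2-7"))
```

With the groups listed above, the first call gives 7 node names and the second gives 1, matching the 7 + 1 split of the 8 mercantour2 nodes.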
Last generated from the Grid'5000 Reference API on 2025-02-10 (commit 361f8f4419)