Sophia:Network
{{Template:Site link|Network}}
{{Portal|Network}}
{{Portal|User}}


[[Image:Sophia-network.png|right|thumb|Physical view of the compute network|470px]]
'''See also:''' [[Sophia:Hardware|Hardware description for Sophia]]
= Overview of Ethernet network topology =


[[File:SophiaNetwork.svg|600px]]

{{:Sophia:GeneratedNetwork}}

Network device models:
* gw: Foundry
* swx1nef: Juniper EX2500-24F-FB
* swy1nef: Dell PowerConnect 6248

More details (including address ranges) are available from the [[Grid5000:Network]] page.

= Old information (possibly outdated) =

[[Image:Sophia_modified.jpg|center|thumb|Physical view of the compute network|600px]]

== Gigabit Ethernet ==

=== Azur Cluster (IBM e325) ===

All the nodes are connected to 4 Cisco 3750 gigabit Ethernet switches. The switches are interconnected through a 32Gbps stack. The nodes are connected to the following switches:

* '''cisco1''' (master): azur-1 to azur-15 (15 nodes), 10Gbps uplink to the FastIron
* cisco2: azur-16 to azur-36 (21 nodes)
* cisco3: azur-37 to azur-55 (19 nodes)
* cisco4: azur-56 to azur-72 (17 nodes)

The switches have been dedicated to Grid5000 since January 2007; before that, they were shared with the local cluster. OS used on the switches: <code>Cisco IOS Software, C3750 Software (C3750-IPBASE-M), Version 12.2(25)SEB4</code>

Network usage is available through cricket at: https://helpdesk.grid5000.fr/cricket/grapher.cgi?target=%2Fsophia%2Fcisco

You can choose the switch for your reservation with OAR, using the switch name:
<code class="command">oarsub -l nodes=XX -p "switch='ciscoX'"</code>
If you want XX nodes on a single switch, use:
<code class="command">oarsub -l /switch=1/nodes=XX</code>
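For example, a hypothetical reservation that groups 10 azur nodes on a single switch and restricts it to cisco2 (the script name is a placeholder):

 # a sketch: 10 nodes, all on the same switch, restricted to cisco2
 oarsub -l /switch=1/nodes=10 -p "switch='cisco2'" ./my_experiment.sh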


=== Helios Cluster (Sun X4100) ===

All the nodes are connected in gigabit Ethernet to 2 Foundry EdgeIron switches (one EdgeIron 48GS and one EdgeIron 24GS). The switches are interconnected using a 2x10Gbps stack (10Gbps in each direction). The nodes are connected to the following switches:

* edgeiron48gs: helios01 to helios48 (48 nodes)
* edgeiron24gs: helios49 to helios56 (8 nodes)

You can choose the switch for your reservation with OAR, using the switch name: <code class="command">oarsub -l nodes=15 -p "switch='XXX'"</code>

OS used on the switches: <code>2.2.6.35</code>

=== Suno Cluster (Dell R410) ===

All the nodes are connected (in a non-blocking way) to the main Foundry FastIron Super X switch.

=== UVB Cluster ===

All the nodes are connected to two stacked Dell PowerConnect 6248 switches. The PowerConnect stack is connected to the Foundry FastIron Super X through a 10Gbps link.


=== Sol Cluster (Sun X2200 M2) ===

All the nodes are connected (in a non-blocking way) to the main Foundry FastIron Super X switch. 47 nodes (sol-1 to sol-47) have a second Ethernet interface connected to 2 stacked Cisco 3750 switches.

=== Topology ===

The main switch is a Foundry FastIron Super X. It has 2 dual 10Gbps modules, 4 modules with 24 gigabit ports, and 12 gigabit ports on the management module (so 60 gigabit ports are available). 4 slots are currently free.

The Cisco 3750 stack (Azur) is connected to the FastIron switch with a 10Gbps link; the EdgeIron stack (Helios) is connected to the FastIron with 2 aggregated 10Gbps links. Each frontend is also connected to the FastIron switch.

Network usage is available through cricket at: https://helpdesk.grid5000.fr/cricket/grapher.cgi?target=%2Fsophia%2Fgw-sophia.sophia.grid5000.fr

== High Performance networks ==

=== Infiniband 40G on uvb ===

All uvb cluster nodes are connected to 40G Infiniband switches. Since this cluster is shared with the Nef production cluster at INRIA Sophia, Infiniband partitions are used to isolate the nodes from Nef while they are available in Grid5000. The partition dedicated to Grid5000 is '''0x8100''', so the IPoIB interfaces on the nodes are named ib0.8100 instead of ib0.

To use the native openib driver of '''openmpi''', you must set <code>btl_openib_pkey = 0x8100</code>.
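For instance, here is a minimal sketch of setting this parameter, either in the per-user Open MPI configuration file or directly on the <code class="command">mpirun</code> command line (the hostfile and binary names are placeholders, and the parameter name is taken as given above):

 # check that the partitioned IPoIB interface is up on a uvb node
 ip addr show ib0.8100
 # option 1: set the partition key in the per-user MCA parameters file
 echo "btl_openib_pkey = 0x8100" >> ~/.openmpi/mca-params.conf
 # option 2: pass it directly on the mpirun command line
 mpirun --mca btl_openib_pkey 0x8100 -np 8 --hostfile ib_hosts ./my_mpi_app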


==== Nodes ====

* <code class="host">uvb-1</code> to <code class="host">uvb-44</code> have one QDR Infiniband card.
* Card model: Mellanox Technologies MT26428 [ConnectX IB QDR, PCIe 2.0 5GT/s]
* Driver: <code class="dir">mlx4_ib</code>
* OAR property: ib_rate=40 (see the sketch after this list)
* IP over IB addressing:
** <code class="host">uvb-[1..44]-ib0</code>.sophia.grid5000.fr (172.18.132.[1..44])
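As a minimal sketch, the OAR property and the IP-over-IB names above can be used as follows (the node number and job size are illustrative):

 # reserve 4 nodes with the 40G Infiniband cards (interactive job)
 oarsub -I -l nodes=4 -p "ib_rate=40"
 # from one reserved node, reach another over IP over IB
 ping -c 3 uvb-2-ib0.sophia.grid5000.fr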


==== Switch ====


* three Mellanox IS50xx QDR Infiniband switches
* Topology available here: https://wiki.inria.fr/ClustersSophia/Network (the uvb nodes are nef084-nef140 on the Nef production cluster)

==== Interconnection ====

The Infiniband network is physically isolated from the Ethernet networks. Therefore, the Ethernet network emulated over Infiniband is isolated as well: there is no interconnection at either the data link layer or the network layer.

== Myrinet ==

72 azur nodes and 56 helios nodes are connected to the 128-port Myrinet 2000 switch.

You can submit jobs with OAR on Myrinet nodes using the 'myrinet' property: <code class="command">oarsub -l nodes=10 -p "myrinet='yes'"</code>

Myrinet network usage is available through cricket at: https://helpdesk.grid5000.fr/cricket/

== Architecture ==

[[Image:Sophia-cluster_modified.png|center|thumb|View of the rack|600px]]
