Advanced KaVLAN: Difference between revisions
{{Portal|Network}}
{{Status|In_production}}
{{Pages|KaVLAN}}
{{TutorialHeader}}
__FORCETOC__
= Overview =
The goal of KaVLAN is to provide network isolation for Grid'5000 users. KaVLAN allows users to manage VLANs on their Grid'5000 nodes. The benefit is complete layer 2 isolation. It can be used together with OAR and Kadeploy to run experiments on the platform.
The first step is to read the [[KaVLAN]] introduction to understand what kind of VLANs you can configure.
If you want a more concrete example of what you can do with VLANs on Grid'5000, you can go through the [[Network_reconfiguration_tutorial]].
= Reserve VLANs and deploy nodes inside =
In the first part of the tutorial, we will use Kadeploy and KaVLAN together on a single site, with a routed VLAN (we could also use a local VLAN).
Once connected to a frontend, in order to obtain nodes and a VLAN you must reserve a kavlan resource with <code class="command">oarsub</code>. There are 3 kinds of resources: '''kavlan''', '''kavlan-local''', '''kavlan-global'''. Here, we will use 3 nodes and a routed VLAN, let's say in Sophia on cluster Suno:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l {"type='kavlan'"}/vlan=1,{"cluster='<code class="replace">suno</code>'"}/nodes=3 -I }}
A shell is now opened on the frontend (like any regular deploy job).
You can get the id of your VLAN using the <code class="command">kavlan</code> command:
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V}}
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V -j <code class="replace">JOBID</code>}}
You should get an integer in the <4-9> range for this routed VLAN (the range for local VLANs is <1-3>, and there is one global VLAN per OAR server, i.e. one per site).
For our example, let's say we got suno-2, suno-30 and suno-31, and kavlan #4.
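The id ranges quoted above can be summarised in a small helper. This is only a sketch: <code>kavlan_type</code> is a hypothetical name, and the 10..21 range for global VLANs is an assumption borrowed from the range probed by the <code>gen_dhcpd_conf.rb</code> script later on this page, so it may not match the real platform.

```ruby
# Classify a KaVLAN id using the ranges quoted in this tutorial.
# Assumption: global VLAN ids fall in 10..21, the range probed by
# gen_dhcpd_conf.rb; the real platform may use other values.
def kavlan_type(vlan_id)
  case vlan_id
  when 1..3   then :local
  when 4..9   then :routed
  when 10..21 then :global
  else raise ArgumentError, "unexpected VLAN id: #{vlan_id}"
  end
end

puts kavlan_type(4) # the VLAN id used in this example
```

With the id 4 obtained in this example, the helper reports a routed VLAN.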
You can get all the options of the command using --help:
Once you have a kavlan reservation running, you can put your nodes in your VLAN (and back into the default VLAN) at any time during the lifetime of your job; we will not use this for now.
Instead we will change the VLAN with kadeploy directly. The next step is to deploy the nodes with an environment image, for instance debian11-x64-big.
== Deploy nodes and change VLAN in one step ==
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f $OAR_NODEFILE -k -e debian11-big --vlan `kavlan -V`}}
Once the deployment is done, you will be able to connect to your nodes. They are now inside the VLAN, therefore they are not reachable with their default IP:
{{Term|location=frontend|cmd=<code class="command">ping</code> <code class="replace">suno-30</code> -c1}}
<pre>
PING suno-30.sophia.grid5000.fr (172.16.130.30) 56(84) bytes of data.
From fsophia.sophia.grid5000.fr (172.16.143.106) icmp_seq=1 Destination Host Unreachable
--- suno-30.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
</pre>
You can get the list of new hostnames of your nodes in the VLAN with <code class='command'>kavlan -l</code>. For the next part of this tutorial, let's create a nodefile and copy it to the first node:
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -l | <code class="command">tee</code> nodefile}}
<pre>
suno-2-kavlan-4.sophia.grid5000.fr
suno-30-kavlan-4.sophia.grid5000.fr
suno-31-kavlan-4.sophia.grid5000.fr
</pre>
{{Term|location=frontend|cmd=<code class="command">scp</code> nodefile root@`head -1 < nodefile`:/tmp}}
The password for user root on Grid'5000 environments is "grid5000".
You can ping these new hostnames:
{{Term|location=frontend|cmd=<code class="command">ping</code> <code class="replace">suno-30-kavlan-4</code> -c1}}
<pre>
PING suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30) 56(84) bytes of data.
64 bytes from suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30): icmp_seq=1 ttl=63 time=0.151 ms
--- suno-30-eth0-kavlan-4.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
</pre>
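The naming scheme visible in the <code>kavlan -l</code> output (the node name gets a <code>-kavlan-VLANID</code> suffix before the domain) can be reproduced offline. The sketch below is illustrative only: <code>kavlan_hostname</code> is a hypothetical helper, and <code>kavlan -l</code> remains the authoritative source for the names.

```ruby
# Build the in-VLAN hostname by appending "-kavlan-<id>" to the first
# DNS label, as seen in the `kavlan -l` output above.
# kavlan_hostname is a hypothetical helper, not part of the kavlan tools.
def kavlan_hostname(hostname, vlan_id)
  hostname.sub(/\A([^.]+)/) { "#{$1}-kavlan-#{vlan_id}" }
end

puts kavlan_hostname('suno-30.sophia.grid5000.fr', 4)
```

It works on both fully qualified names and short names such as <code>suno-30</code>.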
= Setup a DHCP server on your nodes =
== Configure DHCP ==
If you need to run your own DHCP server (for example to run a cluster distribution inside a kavlan, or to test kadeploy), you can use a script to generate the configuration file.
Go back to the frontend and create this file (gen_dhcpd_conf.rb), which will generate your DHCP configuration:
 #!/usr/bin/ruby
 # Author: Nicolas Niclausse
 # Copyright 2010-2011: INRIA
 # script specific to grid5000:
 # generate dhcpd config files for kavlan
 require 'rubygems'
 require 'restfully' # gem install restfully --source http://gemcutter.org
 require 'ip' # gem install ruby-ip
 require 'yaml' # YAML.load_file is used below
 require 'socket' # IPSocket is used for the DNS lookups
 require 'optparse'
 require 'ostruct'
 # common header written at the top of every generated dhcpd file
 headers = "ddns-update-style none;
 option space pxelinux;
 option pxelinux.magic code 208 = string;
 option pxelinux.configfile code 209 = text;
 option pxelinux.pathprefix code 210 = text;
 option pxelinux.reboottime code 211 = unsigned integer 32;
 option vendorinfo code 43 = string;
 "
 conf = File.expand_path('~/.restfully/api.grid5000.fr.yaml')
 options = if File.exist?(conf) then YAML.load_file(conf) else {} end
 options[:base_uri] = 'https://api.grid5000.fr/stable/grid5000'
 # parse the command line (-s SITE, -i VLAN, -q, -v, -h)
 def parseopts(args)
   options = OpenStruct.new
   options.debug = false
   options.verbose = false
   options.quiet = false
   options.nodes = []
   opts = OptionParser.new do |opts|
     opts.banner = "Usage: gen_dhcpd_conf.rb [options]"
     opts.separator ""
     opts.separator "Specific options:"
     opts.on("-s","--site SITE", "generate only DHCP conf for site SITE") do |site|
       options.site = site
     end
     opts.on("-i","--vlan-id N", Integer , "generate only DHCP conf for vlan N") do |vlan|
       options.vlan = vlan
     end
     opts.on("-q", "--[no-]quiet", "Run quietly") do |q|
       options.quiet = q
     end
     opts.on("-v", "--[no-]verbose", "Run verbosely") do |v|
       options.verbose = v
     end
     opts.on_tail("-h", "--help", "Show this message") do
       puts opts
       exit
     end
   end
   opts.parse!(args)
   options
 end
 $opts = parseopts(ARGV)
 Restfully::Session.new(options) do |root, session|
   options = {:query => {:version => root['version']}}
   root.sites(options).each do |site|
     mysite=site['uid']
     next if not $opts.site.nil? and mysite != $opts.site
     # optionally, read mac address from external yaml file
     ref = if File.exist?(mysite+".yaml") then
       YAML.load_file(mysite+".yaml")
     else
       puts mysite +": no yaml file for macs" unless $opts.quiet
       {}
     end
     if $opts.vlan.nil? then
       vlans = (1..9).to_a
       # try to guess global vlan assigned to current site
       (10..21).each do |gvlan|
         begin
           IPSocket::getaddress("gw-kavlan-"+gvlan.to_s+"."+mysite+".grid5000.fr")
           puts "global vlan found for site %s: " % mysite unless $opts.quiet
           vlans.push(gvlan)
         rescue
           next
         end
       end
     else
       vlans = [$opts.vlan]
     end
     # one dhcpd configuration file per (vlan, site) pair
     vlans.each do |vlan|
       filename = "dhcpd-kavlan-"+vlan.to_s+"-"+mysite+".conf"
       File.open(filename, 'w') do |f|
         puts "generating "+filename unless $opts.quiet
         f.puts headers
         begin
           gateway = IPSocket::getaddress("gw-kavlan-"+vlan.to_s+"."+mysite+".grid5000.fr")
         rescue
           puts "WARN: Get address error: probably no kavlan DNS setup for site " + mysite + " , skip" if $opts.verbose;
           next
         end
         # /20 for local vlans (1..3) and /18 for routed vlan (4..9)
         if vlan < 4
           ip = IP.new(gateway+"/20")
           ns = gateway
           ntp = gateway
           tftp = gateway
         else
           ip = IP.new(gateway+"/18")
           ntp = IPSocket::getaddress("ntp."+mysite+".grid5000.fr")
           ns = IPSocket::getaddress("dns."+mysite+".grid5000.fr")
           tftp = IPSocket::getaddress("kadeploy-server."+mysite+".grid5000.fr")
         end
         netmask = ip.netmask.to_addr
         broadcast = ip.broadcast.to_addr
         network = ip.network.to_addr
         f.puts "subnet %s netmask %s {" % [network , netmask]
         f.puts " default-lease-time 86400;
  max-lease-time 604800;"
         #f.puts " option domain-name \"%s.grid5000.fr\"; " % mysite
         f.puts " option domain-name-servers %s;" % ns
         f.puts " option ntp-servers %s; " % ntp
         f.puts " option routers %s;" % gateway
         f.puts " option subnet-mask %s; " % netmask
         f.puts " option broadcast-address %s;" % broadcast
         f.puts " filename \"pxelinux.0\";"
         f.puts " next-server %s;" % tftp
         # a global vlan (id > 9) spans all sites, other vlans only the current one
         sites_for_vlan = if vlan < 10
           [ site ]
         else
           root.sites(options)
         end
         sites_for_vlan.each do |currentsite|
           currentsite.clusters(options).each do |cluster|
             cluster.nodes(options).each do |node|
               sitename=currentsite['uid']
               device = node['network_adapters'].find{|s| s['network_address'] =~ /^\w+-\d+\.\w+\.grid5000\.fr/}
               next if device.nil?
               hostname = device['network_address']
               next if hostname.nil?
               hostname_vlan = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1+"-kavlan-"+vlan.to_s+$2}
               shortname_vlan = hostname_vlan.gsub(/^(\w+-\d+-\w+-\d+)(\..*)$/){$1}
               shortname = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1}
               realsite = hostname.split(".")[1]
               begin
                 vlan_ip = IPSocket::getaddress(hostname_vlan)
               rescue
                 puts "WARN: Get address error: probably no DNS setup for vlan " +vlan.to_s+" on site " + sitename + " , skip" if $opts.verbose;
                 next
               end
               if device['mac'].nil? then
                 if ref[shortname].nil? then
                   puts "WARN: mac undefined for host %s, skip" % hostname unless $opts.quiet
                   next
                 else
                   mac = ref[shortname]['mac_eth0']
                 end
               else
                 mac = device['mac']
               end
               f.puts " host %s {" % hostname_vlan
               f.puts " hardware ethernet %s; " % mac
               f.puts " option host-name \"%s\";" % shortname_vlan
               f.puts " option domain-name \"%s\.grid5000.fr\";" % realsite
               f.puts " fixed-address %s;" % vlan_ip
               if vlan > 9
                 # for global vlan, we need the local tftp server
                 currenttftp = IPSocket::getaddress("kadeploy-server."+sitename+".grid5000.fr")
                 f.puts " next-server %s;" % currenttftp
               end
               f.puts " }"
             end
           end
         end
         f.puts "}"
       end
     end
   end
 end
(this script uses the <code>restfully</code> and <code>ruby-ip</code> gems)
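The subnet arithmetic performed by the script (a /20 for local VLANs, a /18 otherwise) can be checked without Grid'5000 access. The following sketch uses Ruby's standard <code>ipaddr</code> library instead of the <code>ruby-ip</code> gem; <code>subnet_for</code> is a hypothetical helper, and the address passed to it is the node address from the ping example above, not a real gateway lookup.

```ruby
require 'ipaddr'

# Mirror the prefix choice of gen_dhcpd_conf.rb: /20 when the VLAN id is
# below 4 (local VLANs), /18 otherwise. subnet_for is a hypothetical
# helper; the address is only an example taken from this page.
def subnet_for(address, vlan_id)
  prefix = vlan_id < 4 ? 20 : 18
  net = IPAddr.new("#{address}/#{prefix}") # masked to the network address
  {
    network:   net.to_s,
    netmask:   IPAddr.new('255.255.255.255').mask(prefix).to_s,
    broadcast: net.to_range.last.to_s
  }
end

p subnet_for('10.32.3.30', 4)
```

For VLAN 4 and the address 10.32.3.30, this gives the network 10.32.0.0 with netmask 255.255.192.0, which is what the <code>subnet</code> line of the generated file should contain in that case.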
Then, generate the configuration (replace <code class="replace">VLANID</code> and <code class="replace">SITE</code> by your current site and VLAN id), and copy it to the node:
{{Term|location=frontend|cmd=<code class="command">chmod</code> +x ./gen_dhcpd_conf.rb}}
{{Term|location=frontend|cmd=<code class="command">gem</code> install ruby-ip restfully --no-ri --no-rdoc --user-install}}
{{Term|location=frontend|cmd=<code class="command">./gen_dhcpd_conf.rb</code> --site <code class="replace">SITE</code> --vlan-id <code class="replace">VLANID</code>}}
{{Term|location=frontend|cmd=<code class="command">scp</code> dhcpd-kavlan-<code class="replace">VLANID</code>-<code class="replace">SITE</code>.conf root@`head -1 < nodefile`:}}
As a regular user, you cannot install gems such as "ruby-ip" in the default system-wide gem directory, so you need to point GEM_HOME to a directory of your own:
{{Term|location=frontend|cmd=<code class="command">export</code> GEM_HOME=~/.gem/ruby/2.3.0/}}
You have to disable the default DHCP server of the VLAN, from the frontend:
{{Term|location=frontend|cmd=<code class="command">kavlan -d</code>}}
Now you have to install a DHCP server on the node (we assume the node is not yet in the job VLAN, or the VLAN is routed and has access to the Internet for apt):
{{Term|location=node|cmd=<code class="command">apt-get</code> install isc-dhcp-server}}
There may be an error after the installation: this is normal, because you still need to tell the DHCP server which interface to listen on for DHCP requests (replace "eno1" with the name of the interface on which the server should listen):
{{Term|location=node|cmd=<code class="command">sed</code> -i s/INTERFACESv4=\"\"/INTERFACESv4=\"eno1\"/g /etc/default/isc-dhcp-server}}
You can now copy the generated configuration file and start the DHCP server:
{{Term|location=node|cmd=<code class="command">cp</code> /root/dhcpd-kavlan-<code class="replace">VLANID</code>-<code class="replace">SITE</code>.conf /etc/dhcp/dhcpd.conf}}
{{Term|location=node|cmd=/etc/init.d/isc-dhcp-server restart}}
Then, in another shell, connect as root on a second node (or use kaconsole):
And restart the network configuration:
{{Term|location=node-dhcp-client|cmd=<code class='command'>systemctl restart</code> networking}}
{{Term|location=node-dhcp-client|cmd=<code class='command'>systemctl status</code> networking}}
 ...
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPDISCOVER on eno1 to 255.255.255.255 port 67 interval 10
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPACK of 10.32.3.7 from 10.32.3.6
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPACK of 10.32.3.7 from 10.32.3.6
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: bound to 10.32.3.7 -- renewal in 34620 seconds.
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: Sending network state change signal to nslcd...done.
 Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr systemd[1]: Started Raise network interfaces.
On the DHCP server, check the logs:
{{Term|location=node-dhcp-server|cmd=<code class='command'>tail</code> /var/log/daemon.log}}
 ...
 Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPDISCOVER from 00:26:b9:3f:43:a1 via eno1
 Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPOFFER on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
 Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPREQUEST for 10.32.3.7 (10.32.3.6) from 00:26:b9:3f:43:a1 via eno1
 Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPACK on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
In the last four lines, you can see that your own DHCP server has given an address to the other node.
For your information, if you need to do a PXE boot, you must change the TFTP server in the generated dhcpd configuration file:
{{Term|location=node|cmd=IP=`hostname -i`}}
{{Term|location=node|cmd=<code class='command'>perl</code> -i -pe "s/next-server .*/next-server $IP;/" /etc/dhcp/dhcpd.conf}}
(if there is no <code>next-server</code> configured, you must edit the file by hand and add a line like this:
 next-server XX.XX.XX.XX ;
First, we will use taktuk to install <code class='command'>at</code> on all nodes, then the taktuk command will simply launch the network reconfiguration in one minute. Finally, we set the VLAN of all our nodes.
As we will change the network configuration of the nodes, we will use an isolated kavlan (a.k.a. [[KaVLAN#1:_Isolated_VLAN|kavlan-local]]) so as not to interfere with the rest of the Grid'5000 network.
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l {"type='kavlan-local'"}/vlan=1,walltime=2 -I}}
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V | <code class="command">tee</code> myvlan}}
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -l nodes=2 -I}}
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -e debian11-x64-base -k -f $OAR_FILE_NODES}}
{{Term|location=frontend|cmd=<code class="command">taktuk</code> -s -l root -f $OAR_FILE_NODES broadcast exec [ "apt-get update; apt-get --yes install at" ]}}
{{Term|location=frontend|cmd=<code class="command">taktuk</code> -s -l root -f $OAR_FILE_NODES broadcast exec [ "echo '/etc/init.d/networking restart'| at now + 1 minute " ]}}
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -i `cat myvlan` -s -f $OAR_FILE_NODES}}
 All nodes are configured in the vlan 2.
In one minute, your nodes will renegotiate their IP addresses and will be available inside the VLAN. You can connect to each of them using kaconsole or ssh (as we use a kavlan-local, you must connect to the gateway of that kavlan first):
{{Term|location=frontend|cmd=<code class="command">ssh</code> kavlan-`cat myvlan`}}
{{Term|location=kavlan-VLANID|cmd=<code class="command">ssh</code> root@<code class="replace">suno-30</code>-kavlan-`cat myvlan`}}
You can use the <code class='command'>ip neigh</code> command to see the known hosts in your LAN; you should only see IPs in the subnet of your VLAN (192.168.208.0/20 in the example below):
{{Term|location=node|cmd=<code class="command">ip</code> neigh}}
 192.168.223.254 dev eno1 lladdr 02:00:00:00:01:02 REACHABLE
You should be able to ping another of your hosts inside your VLAN:
{{Term|location=node|cmd=<code class="command">ping</code> -c 3 suno-42-kavlan-2}}
 64 bytes from 192.168.211.42: icmp_req=1 ttl=64 time=0.141 ms
 64 bytes from 192.168.211.42: icmp_req=2 ttl=64 time=0.166 ms
 64 bytes from 192.168.211.42: icmp_req=3 ttl=64 time=0.165 ms
 --- suno-42-kavlan-2.sophia.grid5000.fr ping statistics ---
 3 packets transmitted, 3 received, 0% packet loss, time 1998ms
 rtt min/avg/max/mdev = 0.141/0.157/0.166/0.015 ms
== Put your nodes back into the default VLAN ==
First, put the list of your node names, with the VLAN suffix, in a file:
{{Term|location=frontend|cmd=<code class='command'>uniq </code> $OAR_NODEFILE > mynodes}}
{{Term|location=frontend|cmd=<code class='command'>sed</code> "s/\([^.]*\)\(.*\)/\1-kavlan-`cat myvlan`\2/" mynodes > mynodes-vlan}}
Don't forget to first schedule the network restart command; this time, we need to run ssh from the kavlan gateway, not from the frontend:
{{Term|location=frontend|cmd=<code class='command'>ssh</code> kavlan-<code class="replace">VLANID</code>}}
{{Term|location=kavlan-VLANID|cmd=for NODE in $(cat mynodes-vlan); do <code class='command'>ssh</code> root@$NODE "echo '/etc/init.d/networking restart' | at now + 1 minute "; done;}}
Then you can put your nodes back in the default VLAN:
You should be able to ping your nodes:
{{Term|location=frontend|cmd=<code class="command">for</code> i in `uniq $OAR_NODEFILE`; do ping -c 1 $i; done}}
Another way to put nodes back into the default VLAN is to change the VLAN and then kareboot the nodes:
{{Term|location=frontend|cmd=<code class='command'>kavlan</code> -s -i DEFAULT -f $OAR_NODEFILE}}
{{Term|location=frontend|cmd=<code class='command'>kareboot3</code> -f $OAR_NODEFILE -r simple}}
= KaVLAN VPN =
[[Image:G5K_kavlanvpn.png|400px|thumb]]
This feature allows users to build a Virtual Private Network (VPN) between a [[KaVLAN]] network and the outside world. Hence, it is possible to interconnect Grid'5000 nodes with any external network (from the user's laptop to the Internet), bypassing Grid'5000 network isolation.
{{Warning|text=This is an advanced feature. It requires a good understanding of [[KaVLAN]], VPNs and networking in Linux}}
Some information:
* This service currently uses an SSH VPN at Layer 2 (Ethernet level). Since that VPN is built on top of TCP, '''you should not expect high network performance'''.
* The VPN requires two endpoints (or gateways) to be interconnected. On the Grid'5000 side, VPN gateways are installed on the kavlan-{1,2,3}.<site>.grid5000.fr servers.
* On the user's side (outside of Grid'5000), a GNU/Linux system with root privileges is required, to act as the user's gateway.
* On Grid'5000, the user must reserve a '''non-routed local''' kavlan network (the VPN only works with this kind of kavlan). Grid'5000 nodes must be switched into that kavlan to be accessible through the VPN.
The VPN is initiated from the user's gateway machine using an SSH connection to the appropriate kavlan-X server (which depends on the kavlan network previously reserved). To enable the VPN, the SSH "-w" option must be used to connect to the remote tap0 on the kavlan-X server, with the VPN tunnel configured in Ethernet mode. See the ssh and ssh_config manpages for more information about those options.
Example, with KaVLAN network "1" at lyon:
* As root, create a virtual tap device that will be connected to your kavlan using the SSH VPN (replace $USERNAME with your user name).
 <code class="host">laptop: </code>sudo ip tuntap add dev tap<code class="replace">0</code> mode tap user <code class="replace">$USERNAME</code>
* Assign an IP address to this interface.
 <code class="host">laptop: </code>sudo ifconfig tap<code class="replace">0</code> 192.168.207.253/20
{{Warning|text=The IP address you choose must be inside the kavlan network, which depends on the kavlan number you are using. See [[Grid5000:Network#KaVLAN_networks]]}}
* Start the SSH VPN.
 <code class="host">laptop: </code>ssh -o Tunnel=ethernet -w <code class="replace">0</code>:0 -N kavlan-1.lyon.g5k
If the command runs correctly, it should not output anything.
* Options description:
** ''-o Tunnel=ethernet'': Use an Ethernet (layer 2) VPN
** ''-w <code class="replace">0</code>:0'': Use interface tap<code class="replace">0</code> on the client side (first <code class="replace">0</code>) and tap0 on the server side (second 0, mandatory)
** ''-N'': Do not execute a remote command
** ''kavlan-1.lyon.g5k'': Connect to the lyon kavlan-1 gateway. The trailing .g5k assumes that you have configured your ssh_config to connect to Grid'5000 nodes using the .g5k extension
* The client's tap<code class="replace">0</code> interface is now connected to the kavlan network. You should be able to ping other nodes inside this network.
 <code class="host">laptop: </code>ping 192.168.192.83
 PING 192.168.192.83 (192.168.192.83) 56(84) bytes of data.
 64 bytes from 192.168.192.83: icmp_req=1 ttl=64 time=82.7 ms
 64 bytes from 192.168.192.83: icmp_req=2 ttl=64 time=39.9 ms
 ...
{{Warning|text=DNS hostname resolution cannot be used here, as DNS servers are inside the Grid'5000 network and this command is executed from your local workstation}}
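Before assigning an address to the tap interface, it can help to check that it really falls inside the kavlan network, as required by the warning above. A minimal sketch with Ruby's standard <code>ipaddr</code> library follows; <code>inside_kavlan?</code> is a hypothetical helper, and the network 192.168.192.0/20 is an assumption derived from the example address 192.168.207.253/20 (the authoritative list is on [[Grid5000:Network#KaVLAN_networks]]).

```ruby
require 'ipaddr'

# Return true when the candidate address belongs to the given network.
# inside_kavlan? is a hypothetical helper; 192.168.192.0/20 is derived
# from the example address of this section, not read from Grid'5000.
def inside_kavlan?(candidate, kavlan_cidr)
  IPAddr.new(kavlan_cidr).include?(IPAddr.new(candidate))
end

puts inside_kavlan?('192.168.207.253', '192.168.192.0/20') # address used above
puts inside_kavlan?('192.168.208.1', '192.168.192.0/20')   # first address after the /20
```

The first address is accepted and the second one is rejected, because a /20 starting at 192.168.192.0 ends at 192.168.207.255.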
=Other usage=
==Using the API==
KaVLAN is also available through the API. Using the job and deploy APIs, you can, as with the command line tools, reserve nodes with a VLAN and deploy nodes into a VLAN.
See [[API tutorial#Vlans_API|Vlans API tutorial]] and [https://api.grid5000.fr/doc/stable/#tag/vlan Vlans API specification]
== Use a global VLAN ==
First reserve a global VLAN on one site (here sophia) and 2 nodes on each of lille, sophia and lyon:
{{Term|location=frontend|cmd=<code class="command">funk</code> -y -k -w 02:00:00 -r <code class="replace">sophia</code>:2,<code class="replace">lille</code>:2,<code class="replace">lyon</code>:2 -o "-t deploy" --no-oargrid > funk.out }}
Get the node list using this script:
{{Term|location=frontend|cmd=( for s in <code class="replace">sophia</code> <code class="replace">lille</code> <code class="replace">lyon</code>;<br/> do<br/><code class="command">ssh</code> $s "<code class="command">oarstat</code> -J -u" | <code class="command">jq</code> '[.[] | select(.state == "Running") | .assigned_network_address[]]';<br/>done;<br/>) | <code class="command">jq</code> -s 'flatten | .[]' | <code class="command">tr</code> -d \" > ~/gridnodes<br>}}
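The <code>jq</code> filter above keeps the jobs in the "Running" state and collects their <code>assigned_network_address</code> entries. The same selection can be sketched in Ruby. This is only an illustration: <code>running_nodes</code> is a hypothetical helper, the sample JSON is made up, and it assumes that <code>oarstat -J</code> prints an object keyed by job id, which is what the <code>.[]</code> iteration suggests.

```ruby
require 'json'

# Collect the nodes of all running jobs from `oarstat -J -u` output.
# Assumption: the JSON is an object keyed by job id, each value holding
# "state" and "assigned_network_address" (the fields used by the jq filter).
def running_nodes(oarstat_json)
  JSON.parse(oarstat_json)
      .values
      .select { |job| job['state'] == 'Running' }
      .flat_map { |job| job['assigned_network_address'] }
end

# Made-up sample standing in for the output of one site.
sample = <<~JSON
  {
    "101": {"state": "Running",
            "assigned_network_address": ["suno-2.sophia.grid5000.fr", "suno-30.sophia.grid5000.fr"]},
    "102": {"state": "Waiting", "assigned_network_address": []}
  }
JSON

puts running_nodes(sample)
```

Running it once per site and concatenating the results gives the same kind of list as the <code>~/gridnodes</code> file.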
Then use kadeploy3 to deploy your image on all sites and change the VLAN: | Then use kadeploy3 to deploy your image on all sites and change the VLAN: | ||
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f gridnodes -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc -k --multi-server -o ~/nodes.deployed --vlan <code class="replace"> | {{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f gridnodes -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc -k --multi-server -o ~/nodes.deployed --vlan <code class="replace">18</code>}} | ||
If you want to manipulate directly VLAN of a node, you have to run the kavlan command on the site where the node is, e.g. if you have reserved the global vlan located at sophia and want to put some nodes of lille into this vlan, you have to run kavlan on lille site (or use the API with lille site in the URL). | If you want to manipulate directly VLAN of a node, you have to run the kavlan command on the site where the node is, e.g. if you have reserved the global vlan located at sophia and want to put some nodes of lille into this vlan, you have to run ''kavlan -m nodename -i VLAN_GLOBAL_ID -s'' on lille site (or use the API with lille site in the URL). | ||
== How to use a local VLAN == | == How to use a local VLAN == | ||
If you want to use a local VLAN, you first have to connect to the gateway of the VLAN. Once you have a running reservation on a local VLAN, you have ssh access to the gateway: | If you want to use a local VLAN, you first have to connect to the gateway of the VLAN. Once you have a running reservation on a local VLAN, you have ssh access to the gateway: | ||
{{Term|location=frontend|cmd=<code class="command">ssh</code> kavlan-<vlanid>}} | |||
Then you can reach your nodes inside the VLAN. Another option is to use the kaconsole command. | Then you can reach your nodes inside the VLAN. Another option is to use the kaconsole command. | ||
Then you can simply use ssh <cluster>-<nodeid>-kavlan-<vlanid> to access the node, for example: | Then you can simply use ssh <cluster>-<nodeid>-kavlan-<vlanid> to access the node, for example: | ||
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class='replace'>NODE</code>-kavlan-<code class='replace'>VLANID</code>}} | {{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class='replace'>NODE</code>-kavlan-<code class='replace'>VLANID</code>}} | ||
== A simple multi NICs example == | |||
We show here how to reserve and configure multiple Ethernet network interfaces. | |||
First we reserve a deploy job, with 2 nodes and 2 vlans: | |||
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l {"eth_count > 1 and cluster = '<code class="replace">cluster_name</code>'"}/nodes=2,{"type='kavlan'"}/vlan=2,walltime=02:00:00}} | |||
Then we deploy the wanted environment: | |||
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -f $OAR_NODEFILE -k -e debian11-x64-nfs}} | |||
See the cluster's section to find out which Ethernet interfaces can be used. For example, on paranoia (Rennes), eth1/enp3s0f1 and eth2/eno1 are cabled. Use <code class="replace">I</code>=1 and <code class="replace">J</code>=2 for paranoia. | |||
Get node name with interfaces: | |||
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_FILE_NODES | sed -e 's/\([^\.]*\)\(.*\)/\1-eth<code class="replace">I</code>\2/' > nodes_second_int}} | |||
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_FILE_NODES | sed -e 's/\([^\.]*\)\(.*\)/\1-eth<code class="replace">J</code>\2/' > nodes_third_int}} | |||
Show the VLAN numbers: | |||
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -V }} | |||
Put the interfaces into the two different VLANs: | |||
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -i <code class="replace">vlan1</code> -s -f nodes_second_int}} | |||
{{Term|location=frontend|cmd=<code class="command">kavlan</code> -i <code class="replace">vlan2</code> -s -f nodes_third_int}} | |||
Get an IP address on the second and third interfaces: | |||
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_NODEFILE | taktuk -d -1 -l root -f - broadcast exec [ 'dhclient <code class="replace">enp3s0f1</code>' ]}} | |||
{{Term|location=frontend|cmd=<code class="command">uniq</code> $OAR_NODEFILE | taktuk -d -1 -l root -f - broadcast exec [ 'dhclient <code class="replace">eno1</code>' ]}} | |||
At this point your node should have 3 IP addresses: | |||
{{Term|location=node|cmd=<code class="command">ip</code> a}} | |||
root@paranoia-8:~# ip a | |||
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1 | |||
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 | |||
inet 127.0.0.1/8 scope host lo | |||
valid_lft forever preferred_lft forever | |||
inet6 ::1/128 scope host | |||
valid_lft forever preferred_lft forever | |||
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 | |||
link/ether f0:4d:a2:73:ce:3d brd ff:ff:ff:ff:ff:ff | |||
inet <code class="replace">10.24.70.8/18</code> brd 10.24.127.255 scope global eno1 | |||
valid_lft forever preferred_lft forever | |||
inet6 fe80::f24d:a2ff:fe73:ce3d/64 scope link | |||
valid_lft forever preferred_lft forever | |||
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000 | |||
link/ether f0:4d:a2:73:ce:3e brd ff:ff:ff:ff:ff:ff | |||
4: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 | |||
link/ether a0:36:9f:28:a9:18 brd ff:ff:ff:ff:ff:ff | |||
inet <code class="replace">172.16.100.8/20</code> brd 172.16.111.255 scope global enp3s0f0 | |||
valid_lft forever preferred_lft forever | |||
inet6 fe80::a236:9fff:fe28:a918/64 scope link | |||
valid_lft forever preferred_lft forever | |||
5: enp3s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 | |||
link/ether a0:36:9f:28:a9:1a brd ff:ff:ff:ff:ff:ff | |||
inet <code class="replace">10.24.7.8/18</code> brd 10.24.63.255 scope global enp3s0f1 | |||
valid_lft forever preferred_lft forever | |||
inet6 fe80::a236:9fff:fe28:a91a/64 scope link | |||
valid_lft forever preferred_lft forever |
Latest revision as of 15:20, 5 August 2024
Note: This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.
Overview
The goal of KaVLAN is to provide network isolation for Grid'5000 users. KaVLAN allows users to manage VLANs on their Grid'5000 nodes. The benefit is complete layer 2 isolation. It can be used together with OAR and Kadeploy to run experiments on the platform.
The first step is to read the KaVLAN introduction to understand what kind of VLANs you can configure.
If you want a more concrete example of what you can do with VLANs on Grid'5000, you can go through the Network_reconfiguration_tutorial.
Reserve VLANs and deploy nodes inside
In the first part of the tutorial, we will use kadeploy and kavlan together on a single site, with a routed vlan (we could also use a local vlan).
Once connected to a frontend, in order to obtain nodes and a VLAN you must reserve a kavlan resource with oarsub. There are 3 kinds of resources: kavlan, kavlan-local, kavlan-global. Here, we will use 3 nodes and a routed VLAN, let's say in Sophia on cluster Suno:
A shell is now opened on the frontend (like any regular deploy job)
You can get the id of your VLAN with the kavlan command: kavlan -V
If you run this command outside the shell started by OAR for your reservation, you must add the oar JOBID.
You should get an integer in the <4-9> range for this routed VLAN (the range for local vlan is <1-3>, and there is one global VLAN per OAR server, i.e. one per site).
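The ranges above can be captured in a small helper. This is only a sketch: the function name vlan_kind is made up for illustration, and treating every id of 10 or more as a global VLAN is an assumption taken from the gen_dhcpd_conf.rb script on this page, which probes ids 10 to 21.

```shell
# Hypothetical helper: classify a KaVLAN id using the ranges quoted above.
# Assumption: ids of 10 and above are global VLANs.
vlan_kind() {
  case "$1" in
    DEFAULT) echo "default"; return 0 ;;
    ''|*[!0-9]*) echo "invalid"; return 1 ;;
  esac
  if [ "$1" -ge 1 ] && [ "$1" -le 3 ]; then
    echo "local"
  elif [ "$1" -ge 4 ] && [ "$1" -le 9 ]; then
    echo "routed"
  elif [ "$1" -ge 10 ]; then
    echo "global"
  else
    echo "invalid"; return 1
  fi
}

vlan_kind 4   # prints: routed
```

Inside a job, something like vlan_kind "$(kavlan -V)" would tell you which kind of VLAN you were given, provided kavlan -V prints a single id.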
For our example, let's say we got suno-2, suno-30 and suno-31, and kavlan #4.
You can get all the options of the command using --help:
# kavlan --help
Usage: kavlan [options]
Specific options:
    -i, --vlan-id N                  set VLAN ID (integer or DEFAULT)
    -C, --ca-cert CA                 CA certificate
    -c, --client-cert CERT           client certificate
    -k, --client-key KEY             client key
    -l, --get-nodelist               Show nodenames in the given vlan
    -e, --enable-dhcp                Start DHCP server
    -d, --disable-dhcp               Stop DHCP server
    -V, --show-vlan-id               Show vlan id of job (needs -j JOBID)
    -g, --get-vlan                   Show vlan of nodes
    -s, --set-vlan                   Set vlan of nodes
    -j, --oar-jobid JOBID            OAR job id
    -m, --machine NODE               set nodename (several -m are OK)
    -f, --filename NODEFILE          read nodes from a file
    -u, --user USERNAME              username
    -v, --[no-]verbose               Run verbosely
    -q, --[no-]quiet                 Run quietly
        --[no-]debug                 Run with debug output
    -h, --help                       Show this message
        --version                    Show version
Once you have a kavlan reservation running, you can put your nodes in your VLAN (and back into the default VLAN) at anytime during the lifetime of your job; we will not use this for now.
Instead we will change the VLAN with kadeploy directly. The next step is to deploy the nodes with an environment image, for instance debian11-x64-big.
Enable the dhcp server of the VLAN
Before deploying, if you don't install your own DHCP server, you should start the default DHCP server of the VLAN with kavlan -e (add -j JOBID if needed). You can disable the DHCP server again with kavlan -d.
Deploy nodes and change VLAN in one step
Once the deployment is done, you will be able to connect on your nodes. They are now inside the VLAN, therefore they are not reachable with their default IP:
PING suno-30.sophia.grid5000.fr (172.16.130.30) 56(84) bytes of data.
From fsophia.sophia.grid5000.fr (172.16.143.106) icmp_seq=1 Destination Host Unreachable
--- suno-30.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
You can get the list of new hostnames of your nodes in the VLAN with kavlan -l. For the next part of this tutorial, let's create a nodefile and copy it on the first node:
suno-2-kavlan-4.sophia.grid5000.fr
suno-30-kavlan-4.sophia.grid5000.fr
suno-31-kavlan-4.sophia.grid5000.fr
The password for user root on Grid'5000 environments is "grid5000".
You can see that you can ping these new hostnames:
PING suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30) 56(84) bytes of data.
64 bytes from suno-30-eth0-kavlan-4.sophia.grid5000.fr (10.32.3.30): icmp_seq=1 ttl=63 time=0.151 ms
--- suno-30-eth0-kavlan-4.sophia.grid5000.fr ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.151/0.151/0.151/0.000 ms
Setup a DHCP server on your nodes
Configure DHCP
If you need to run your own DHCP server (for example if you want to run a cluster distribution inside kavlan or test kadeploy), you can use a script to generate the configuration file:
Then, go back to the frontend, and download the script that will generate your DHCP configuration:
Create this file (gen_dhcpd_conf.rb) on the frontend:
#!/usr/bin/ruby
# Author: Nicolas Niclausse
# Copyright 2010-2011: INRIA
# script specific to grid5000:
# generate dhcpd config files for kavlan

require 'rubygems'
require 'restfully' # gem install restfully --source http://gemcutter.org
require 'ip' # gem install ruby-ip
require 'getoptlong'
require 'optparse'
require 'ostruct'
require 'socket'
require 'yaml'

headers = "ddns-update-style none;
option space pxelinux;
option pxelinux.magic code 208 = string;
option pxelinux.configfile code 209 = text;
option pxelinux.pathprefix code 210 = text;
option pxelinux.reboottime code 211 = unsigned integer 32;
option vendorinfo code 43 = string;
"

conf = File.expand_path('~/.restfully/api.grid5000.fr.yaml')
options = if File.exist?(conf) then
  YAML.load_file(conf)
else
  {}
end
options[:base_uri] = 'https://api.grid5000.fr/stable/grid5000'

def parseopts(args)
  options = OpenStruct.new
  options.debug = false
  options.verbose = false
  options.quiet = false
  options.nodes = []
  opts = OptionParser.new do |opts|
    opts.banner = "Usage: gen_dhcpd_conf.rb [options]"
    opts.separator ""
    opts.separator "Specific options:"
    opts.on("-s","--site SITE", "generate only DHCP conf for site SITE") do |site|
      options.site = site
    end
    opts.on("-i","--vlan-id N", Integer , "generate only DHCP conf for vlan N") do |vlan|
      options.vlan = vlan
    end
    opts.on("-q", "--[no-]quiet", "Run quietly") do |q|
      options.quiet = q
    end
    opts.on("-v", "--[no-]verbose", "Run verbosely") do |v|
      options.verbose = v
    end
    opts.on_tail("-h", "--help", "Show this message") do
      puts opts
      exit
    end
  end
  opts.parse!(args)
  options
end

$opts = parseopts(ARGV)

Restfully::Session.new(options) do |root, session|
  options = {:query => {:version => root['version']}}
  root.sites(options).each do |site|
    mysite=site['uid']
    next if not $opts.site.nil? and mysite != $opts.site
    # optionally, read mac address from external yaml file
    ref = if File.exist?(mysite+".yaml") then
      YAML.load_file(mysite+".yaml")
    else
      puts mysite +": no yaml file for macs" unless $opts.quiet
      {}
    end
    if $opts.vlan.nil? then
      vlans = (1..9).to_a
      # try to guess global vlan assigned to current site
      (10..21).each do |gvlan|
        begin
          IPSocket::getaddress("gw-kavlan-"+gvlan.to_s+"."+mysite+".grid5000.fr")
          puts "global vlan found for site %s: " % mysite unless $opts.quiet
          vlans.push(gvlan)
        rescue
          next
        end
      end
    else
      vlans = [$opts.vlan]
    end
    vlans.each do |vlan|
      filename = "dhcpd-kavlan-"+vlan.to_s+"-"+mysite+".conf"
      open(filename, 'w') do |f|
        puts "generating "+filename unless $opts.quiet
        f.puts headers
        begin
          gateway = IPSocket::getaddress("gw-kavlan-"+vlan.to_s+"."+mysite+".grid5000.fr")
        rescue
          puts "WARN: Get address error: probably no kavlan DNS setup for site " + mysite + " , skip" if $opts.verbose;
          next
        end
        # /20 for local vlans (1..3) and /18 for routed vlan (4..9)
        if vlan < 4
          ip = IP.new(gateway+"/20")
          ns = gateway
          ntp = gateway
          tftp = gateway
        else
          ip = IP.new(gateway+"/18")
          ntp = IPSocket::getaddress("ntp."+mysite+".grid5000.fr")
          ns = IPSocket::getaddress("dns."+mysite+".grid5000.fr")
          tftp = IPSocket::getaddress("kadeploy-server."+mysite+".grid5000.fr")
        end
        netmask = ip.netmask.to_addr
        broadcast = ip.broadcast.to_addr
        network = ip.network.to_addr
        f.puts "subnet %s netmask %s {" % [network , netmask]
        f.puts " default-lease-time 86400; max-lease-time 604800;"
        #f.puts " option domain-name \"%s.grid5000.fr\"; " % mysite
        f.puts " option domain-name-servers %s;" % ns
        f.puts " option ntp-servers %s; " % ntp
        f.puts " option routers %s;" % gateway
        f.puts " option subnet-mask %s; " % netmask
        f.puts " option broadcast-address %s;" % broadcast
        f.puts " filename \"pxelinux.0\";"
        f.puts " next-server %s;" % tftp
        sites_for_vlan = if vlan < 10
          [ site ]
        else
          root.sites(options)
        end
        sites_for_vlan.each do |currentsite|
          currentsite.clusters(options).each do |cluster|
            cluster.nodes(options).each do |node|
              sitename=currentsite['uid']
              device = node['network_adapters'].find{|s| s['network_address'] =~ /^\w+-\d+\.\w+\.grid5000\.fr/}
              next if device.nil?
              hostname = device['network_address']
              next if hostname.nil?
              hostname_vlan = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1+"-kavlan-"+vlan.to_s+$2}
              shortname_vlan = hostname_vlan.gsub(/^(\w+-\d+-\w+-\d+)(\..*)$/){$1}
              shortname = hostname.gsub(/^(\w+-\d+)(\..*)$/){$1}
              realsite = hostname.split(".")[1]
              begin
                vlan_ip = IPSocket::getaddress(hostname_vlan)
              rescue
                puts "WARN: Get address error: probably no DNS setup for vlan " +vlan.to_s+" on site " + sitename + " , skip" if $opts.verbose;
                next
              end
              if device['mac'].nil? then
                if ref[shortname].nil? then
                  puts "WARN: mac undefined for host %s, skip" % hostname unless $opts.quiet
                  next
                else
                  mac = ref[shortname]['mac_eth0']
                end
              else
                mac = device['mac']
              end
              f.puts " host %s {" % hostname_vlan
              f.puts " hardware ethernet %s; " % mac
              f.puts " option host-name \"%s\";" % shortname_vlan
              f.puts " option domain-name \"%s\.grid5000.fr\";" % realsite
              f.puts " fixed-address %s;" % vlan_ip
              if vlan > 9
                # for global vlan, we need the local tftp server
                currenttftp = IPSocket::getaddress("kadeploy-server."+sitename+".grid5000.fr")
                f.puts " next-server %s;" % currenttftp
              end
              f.puts " }"
            end
          end
        end
        f.puts "}"
      end
    end
  end
end
(this script uses the restfully and ruby-ip gems)
Then, generate the configuration (replace VLANID and SITE with your VLAN id and current site), and copy it on the node:
As a regular user, you cannot install the "ruby-ip" gem in the default gem directory, so you need to set GEM_HOME to a directory of your own. To do so, type:
You have to disable the default DHCP server of the VLAN:
On the frontend
Now you have to install a DHCP server on the node (we assume the node is not yet in the job VLAN, or that the VLAN is routed and has access to the Internet for apt):
The installation may end with an error. This is normal: you need to tell the DHCP server on which interface to listen for DHCP requests (replace "eno1" with the name of the interface on which the server should listen):
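As an illustration of the interface setting, here is a minimal sketch. It assumes the server is Debian's isc-dhcp-server package, which reads its listening interface from the INTERFACESv4 variable in /etc/default/isc-dhcp-server. Neither the package name nor the file path is stated on this page, and the function name set_dhcp_interface is made up for the example.

```shell
# Hypothetical helper: set the IPv4 listening interface in a defaults file.
# Assumption: the file follows the Debian isc-dhcp-server layout
# (a line starting with INTERFACESv4=).
# $1: defaults file, $2: interface name (for example eno1)
set_dhcp_interface() {
  tmp=$(mktemp) || return 1
  if sed -e "s/^INTERFACESv4=.*/INTERFACESv4=\"$2\"/" "$1" > "$tmp"; then
    cat "$tmp" > "$1"   # rewrite in place, keeping the file's permissions
  fi
  rm -f "$tmp"
}
```

On the node, this would be called as set_dhcp_interface /etc/default/isc-dhcp-server eno1, as root and before starting the server.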
You can now copy the generated configuration file and start the DHCP server :
Then, in another shell, connect as root on a second node (or use kaconsole):
And restart the network configuration:
...
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPDISCOVER on eno1 to 255.255.255.255 port 67 interval 10
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPREQUEST of 10.32.3.7 on eno1 to 255.255.255.255 port 67
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: DHCPACK of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPOFFER of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr dhclient[2633]: DHCPACK of 10.32.3.7 from 10.32.3.6
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: bound to 10.32.3.7 -- renewal in 34620 seconds.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr ifup[2618]: Sending network state change signal to nslcd...done.
Apr 12 17:05:34 suno-7-kavlan-4.sophia.grid5000.fr systemd[1]: Started Raise network interfaces.
On the DHCP server, check the logs:
...
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPDISCOVER from 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPOFFER on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPREQUEST for 10.32.3.7 (10.32.3.6) from 00:26:b9:3f:43:a1 via eno1
Apr 12 17:32:24 suno-6-kavlan-4 dhcpd: DHCPACK on 10.32.3.7 to 00:26:b9:3f:43:a1 via eno1
In the last four lines, you can see that your own DHCP server has given an address to the other node.
DHCP and PXE
For your information, if you need to do a PXE boot, you must change the tftp server in the generated dhcpd configuration file:
If there is no next-server configured, you must edit the file by hand and add a line like this:
next-server XX.XX.XX.XX ;
where XX.XX.XX.XX is the IP of your node (echo $IP).
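One way to change an existing next-server entry is a sed substitution over the generated file. The sketch below is not the command originally shown on this page. The function name set_next_server is hypothetical, and the file name in the usage note simply follows the dhcpd-kavlan-VLANID-SITE.conf pattern produced by gen_dhcpd_conf.rb.

```shell
# Hypothetical helper: point every "next-server" statement of a dhcpd
# configuration file at a new TFTP server.
# $1: configuration file, $2: IP address of the TFTP server (your node)
set_next_server() {
  tmp=$(mktemp) || return 1
  if sed -e "s/next-server [^;]*;/next-server $2;/" "$1" > "$tmp"; then
    cat "$tmp" > "$1"   # rewrite in place
  fi
  rm -f "$tmp"
}
```

For the example of this tutorial, where suno-6 (10.32.3.6) runs the DHCP server, the call would look like set_next_server dhcpd-kavlan-4-sophia.conf 10.32.3.6. Note that it rewrites every next-server line, including per-host ones in a global VLAN configuration.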
Change the VLAN of your nodes manually
Put your nodes into the reserved VLAN
If you really want to change the VLAN manually, you can, but it's much simpler to change the vlan with kadeploy.
In order to change the VLAN of the nodes manually, you must reconfigure the network after the VLAN has changed; but once the VLAN has changed, you can't connect to the node! An easy way to do this is to use the 'at' command (apt-get install at if it's not installed on your nodes).
We will use Taktuk to start remote commands on several nodes at once. In this example, we will use all the nodes. Since taktuk does not handle duplicate names in the nodefile, we must first remove duplicates.
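Removing the duplicates can be done with standard tools. The following is a minimal sketch. The function name unique_nodes and the output file ~/mynodes are illustrative only, and $OAR_FILE_NODES is the nodefile that OAR provides inside a job.

```shell
# Hypothetical helper: print each line of a nodefile once,
# keeping the order in which nodes first appear.
unique_nodes() {
  awk '!seen[$0]++' "$1"
}

# Example, inside an OAR job:
#   unique_nodes "$OAR_FILE_NODES" > ~/mynodes
#   taktuk -s -l root -f ~/mynodes broadcast exec [ "hostname" ]
```

Unlike uniq, this also removes duplicates that are not on adjacent lines, so the nodefile does not need to be sorted first.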
First, we will use taktuk to install at on all nodes, then the taktuk command will simply launch the network reconfiguration in one minute. Finally, we set the VLAN of all our nodes.
As we will change the network configuration of the nodes, we will use an isolated kavlan (a.k.a. kavlan-local) so as not to interfere with the rest of the Grid'5000 network.
frontend:
taktuk -s -l root -f $OAR_FILE_NODES broadcast exec [ "apt-get update; apt-get --yes install at" ]
frontend:
taktuk -s -l root -f $OAR_FILE_NODES broadcast exec [ "echo '/etc/init.d/networking restart' | at now + 1 minute" ]
All nodes are now configured in VLAN 2. In one minute, your nodes will renegotiate their IP addresses and become available inside the VLAN. You can then connect to each of them using kaconsole or ssh (as we use a kavlan-local, you must connect to the gateway of that kavlan first):
You can use the ip neigh command to see the known hosts in your LAN; you should only see IPs in the 192.168.66.0/24 subnet
192.168.223.254 dev eno1 lladdr 02:00:00:00:01:02 REACHABLE
You should be able to ping another of your hosts inside your VLAN
64 bytes from 192.168.211.42: icmp_req=1 ttl=64 time=0.141 ms
64 bytes from 192.168.211.42: icmp_req=2 ttl=64 time=0.166 ms
64 bytes from 192.168.211.42: icmp_req=3 ttl=64 time=0.165 ms
--- suno-42-kavlan-2.sophia.grid5000.fr ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.141/0.157/0.166/0.015 ms
Put your nodes back into the default VLAN
First, put the list of your node names inside the VLAN in a file:
Don't forget to first schedule the network restart command. This time, we need to run ssh from the kavlan gateway, not from the frontend:
kavlan-VLANID:
for NODE in $(cat mynodes-vlan); do ssh root@$NODE "echo '/etc/init.d/networking restart' | at now + 1 minute"; done
Then you can put your nodes back in the default VLAN:
You should be able to ping your nodes:
Another way to put back nodes into the default VLAN is to change the vlan and then kareboot the nodes.
KaVLAN VPN
This feature allows users to build a Virtual Private Network (VPN) between a KaVLAN network and the outside world. Hence, it is possible to interconnect Grid'5000 nodes with any external network (from the user's laptop to the Internet), bypassing Grid'5000 network isolation.
Warning: This is an advanced feature. It requires a good understanding of KaVLAN, VPNs and networking in Linux.
Some information:
- This service currently uses SSH VPN at Layer-2 (Ethernet level). Since that VPN is built on top of TCP protocol, you should not expect high network performance.
- The VPN requires two end points (or gateway) to be interconnected. On Grid'5000 side, VPN gateways are installed on kavlan-{1,2,3}.<site>.grid5000.fr servers.
- On the user's side (outside of Grid'5000), a GNU/Linux system with root privileges is required, to act as the user's gateway.
- On Grid'5000, the user must reserve a non-routed local kavlan network (the VPN only works with this kind of kavlan). Grid'5000 nodes must be switched into that kavlan to be accessible through the VPN.
The VPN is initiated from the user's gateway machine using a SSH connection to the appropriate kavlan-X server (which depends on kavlan network previously reserved). To enable VPN, SSH "-w" options must be used to connect to remote tap0 on kavlan-X server, with VPN tunnel configured in Ethernet mode. See ssh and ssh_config manpages for more information about those options.
Example, with KaVLAN network "1" at lyon :
- As root, create a virtual tap device that will be connected to your kavlan using SSH VPN. (Replace $USERNAME by your user name)
laptop:
sudo ip tuntap add dev tap0 mode tap user $USERNAME
- Assign an IP address to this interface.
laptop:
sudo ifconfig tap0 192.168.207.253/20
Warning: The IP address you choose must be inside the kavlan network, which depends on the kavlan number you are using. See Grid5000:Network#KaVLAN_networks
- Start the SSH VPN
laptop:
ssh -o Tunnel=ethernet -w 0:0 -N kavlan-1.lyon.g5k
If the command runs correctly, it should not output anything.
- Options description:
- -o Tunnel=ethernet: Use an Ethernet (layer 2) VPN
- -w 0:0: Use interface tap0 on the client side (first 0) and tap0 on the server side (second 0, mandatory)
- -N: Do not execute a remote command
- kavlan-1.lyon.g5k: Connect to the lyon kavlan-1 gateway. The trailing .g5k assumes that you have configured your ssh_config to connect to Grid'5000 nodes using the .g5k extension
- The client's tap0 interface is now connected to the kavlan network. You should be able to ping other nodes inside this network.
laptop:
ping 192.168.192.83
PING 192.168.192.83 (192.168.192.83) 56(84) bytes of data.
64 bytes from 192.168.192.83: icmp_req=1 ttl=64 time=82.7 ms
64 bytes from 192.168.192.83: icmp_req=2 ttl=64 time=39.9 ms
...
Warning: DNS hostname resolution cannot be used here, as DNS servers are inside the Grid'5000 network and this command is executed from your local workstation.
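To check that the address you assign to tap0 really lies inside the kavlan network, as the warning above requires, a small membership test can help. This is a sketch in plain POSIX shell. The function names ip_to_int and in_cidr are made up, and the network 192.168.192.0/20 is not quoted from the Grid'5000 network page but derived from the example address 192.168.207.253/20 used above.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1            # split the address on dots
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed if address $1 belongs to network $2 (CIDR notation).
in_cidr() (
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( addr & mask )) -eq $(( net & mask )) ]
)

# Address given to tap0 in the example, against the derived /20 network.
in_cidr 192.168.207.253 192.168.192.0/20 && echo "inside"
```

The node pinged in the example (192.168.192.83) passes the same test, whereas an address such as 192.168.208.1 falls outside the /20 and would not be reachable through the VPN.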
Other usage
Using the API
KaVLAN is also available through the API. Using the job and deploy APIs, you can, as with the command line tools, reserve nodes with a VLAN and deploy nodes into a VLAN.
See the Vlans API tutorial and the Vlans API specification
Use a global VLAN
With a global VLAN, you can put nodes from several sites in the same VLAN
First reserve a global VLAN on one site (here sophia) and 2 nodes each on lille, sophia and lyon:
Get the node list using this script:
frontend:
( for s in sophia lille lyon; do ssh $s "oarstat -J -u" | jq '[.[] | select(.state == "Running") | .assigned_network_address[]]'; done; ) | jq -s 'flatten | .[]' | tr -d \" > ~/gridnodes
Then use kadeploy3 to deploy your image on all sites and change the VLAN:
frontend:
kadeploy3 -f gridnodes -a http://public.sophia.grid5000.fr/~nniclausse/openmx.dsc -k --multi-server -o ~/nodes.deployed --vlan 18
If you want to change the VLAN of a node directly, you have to run the kavlan command on the site where the node is located. For example, if you have reserved the global VLAN located at sophia and want to put some lille nodes into this VLAN, you have to run kavlan -m nodename -i VLAN_GLOBAL_ID -s on the lille site (or use the API with the lille site in the URL).
How to use a local VLAN
In this section, we will describe the specificity of the local VLANs.
If you want to use a local VLAN, you first have to connect to the gateway of the VLAN. Once you have a running reservation on a local VLAN, you have ssh access to the gateway:
Then you can reach your nodes inside the VLAN. Another option is to use the kaconsole command.
(You can still use kadeploy to put your nodes in the VLAN in one step.)
Configure ssh to easily connect to nodes in a local VLAN
You can configure ssh to make the connection through the gateway transparent. To access isolated nodes (local VLAN) directly with ssh, add this to your .ssh/config file on the frontend:
Host *-*-kavlan-1 *-*-kavlan-1.*.grid5000.fr
    ProxyCommand ssh -a -x kavlan-1 nc -q 0 %h %p
Host *-*-kavlan-2 *-*-kavlan-2.*.grid5000.fr
    ProxyCommand ssh -a -x kavlan-2 nc -q 0 %h %p
Host *-*-kavlan-3 *-*-kavlan-3.*.grid5000.fr
    ProxyCommand ssh -a -x kavlan-3 nc -q 0 %h %p
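Since the three stanzas differ only by the VLAN id, they can also be generated rather than typed. This is a small sketch. The function name kavlan_ssh_config is made up, and the output reproduces the configuration above without adding any option.

```shell
# Hypothetical helper: print one ssh_config stanza per local VLAN id
# given as argument, using the same ProxyCommand as above.
kavlan_ssh_config() {
  for id in "$@"; do
    printf 'Host *-*-kavlan-%s *-*-kavlan-%s.*.grid5000.fr\n' "$id" "$id"
    printf '    ProxyCommand ssh -a -x kavlan-%s nc -q 0 %%h %%p\n' "$id"
  done
}

kavlan_ssh_config 1 2 3
```

Appending the output to ~/.ssh/config on the frontend (kavlan_ssh_config 1 2 3 >> ~/.ssh/config) gives the same result as editing the file by hand.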
Then you can simply use ssh <cluster>-<nodeid>-kavlan-<vlanid> to access the node, for example:
A simple multi NICs example
We show here how to reserve and configure multiple Ethernet network interfaces.
First we reserve a deploy job, with 2 nodes and 2 vlans:
frontend:
oarsub -I -t deploy -l {"eth_count > 1 and cluster = 'cluster_name'"}/nodes=2,{"type='kavlan'"}/vlan=2,walltime=02:00:00
Then we deploy the wanted environment:
See the cluster's section to find out which Ethernet interfaces can be used. For example, on paranoia (Rennes), eth1/enp3s0f1 and eth2/eno1 are cabled. Use I=1 and J=2 for paranoia.
Get node name with interfaces:
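The interface-specific node names are built by inserting -ethN after the short host name. The sketch below wraps the sed expression used for this step in a function. The name with_iface is made up, the file names nodes_second_int and nodes_third_int are those used in the following steps, and the values 1 and 2 are the ones given above for paranoia.

```shell
# Hypothetical helper: append "-eth<N>" to the short name of each host
# read on stdin, e.g. with N=1:
#   paranoia-8.rennes.grid5000.fr -> paranoia-8-eth1.rennes.grid5000.fr
with_iface() {
  sed -e "s/^\([^.]*\)\(.*\)\$/\1-eth$1\2/"
}

# Inside the job: one file per extra interface (I=1 and J=2 on paranoia).
if [ -n "${OAR_FILE_NODES:-}" ]; then
  uniq "$OAR_FILE_NODES" | with_iface 1 > nodes_second_int
  uniq "$OAR_FILE_NODES" | with_iface 2 > nodes_third_int
fi
```

Outside an OAR job the guarded part does nothing, so the function can be tried on a single name with echo paranoia-8.rennes.grid5000.fr | with_iface 1.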
Show the VLAN numbers:
Put the interfaces into the two different VLANs:
Get an IP address on the second and third interfaces:
At this point your node should have 3 IP addresses:
root@paranoia-8:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether f0:4d:a2:73:ce:3d brd ff:ff:ff:ff:ff:ff
    inet 10.24.70.8/18 brd 10.24.127.255 scope global eno1
       valid_lft forever preferred_lft forever
    inet6 fe80::f24d:a2ff:fe73:ce3d/64 scope link
       valid_lft forever preferred_lft forever
3: eno2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether f0:4d:a2:73:ce:3e brd ff:ff:ff:ff:ff:ff
4: enp3s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:36:9f:28:a9:18 brd ff:ff:ff:ff:ff:ff
    inet 172.16.100.8/20 brd 172.16.111.255 scope global enp3s0f0
       valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe28:a918/64 scope link
       valid_lft forever preferred_lft forever
5: enp3s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:36:9f:28:a9:1a brd ff:ff:ff:ff:ff:ff
    inet 10.24.7.8/18 brd 10.24.63.255 scope global enp3s0f1
       valid_lft forever preferred_lft forever
    inet6 fe80::a236:9fff:fe28:a91a/64 scope link
       valid_lft forever preferred_lft forever