Advanced Kadeploy: Difference between revisions
Lsarzyniec (talk | contribs) |
{{Portal|User}}
{{Portal|Tutorial}}
{{TutorialHeader}}
{{Warning|text=Please see the [[Environment creation]] guide, which gives automated mechanisms to build kadeploy environments}}
= What you need to know before starting =
The first thing to understand is that by using kadeploy3, you will be running a command that attempts to remotely reboot possibly many nodes at the same time, and boot them over the network using configuration files hosted on a server.
== What is an Environment? ==
''Where we describe what exactly <code class="env">image</code>, <code class="file">kernel</code>, <code class="file">initrd</code> and <code class="file">postinstall</code> are''
file describing a fully functional Operating System.
To be able to set up an Operating System, <code class="command">kadeploy3</code> needs at least 4 files in the most common cases:
# An '''image'''
#* An image is a file containing all the Operating System files. It is a compressed archive (compressed tarball).
# A '''kernel file'''
#* Specifies the kernel file to boot. The full file path in the target system is expected.
# An '''initrd file''' (optional)
#* For Linux systems, the <code class="file">initrd</code> or <code class="file">initramfs</code> file is the initial ramdisk used by the kernel as a preliminary system before the root filesystem is mounted. More information: [http://en.wikipedia.org/wiki/Initrd Initrd on Wikipedia]. The full file path in the target system is expected.
# A '''postinstall''' file (optional)
#* The postinstall file allows for adapting the deployed environment to the specificities of each site and cluster, as well as setting up some platform-specific services.
Once you have this set of files, you can describe your environment to <code class="command">kadeploy3</code>. This description represents an <code class="env">environment</code> in the <code class="command">kadeploy3</code> sense.
Note: the kadeploy software also supports deploying environments made of binary disk images (''dd'' format) but this is not supported on Grid'5000.
== How can I make my own environment? ==
To create our own environment, there are two main ways:
* One way is to deploy an existing environment, customize it and save it with <code class="command">tgz-g5k</code>, but that is not really a good ''reproducible'' way of working.
* The other way, which favors reproducibility, is to build the environment from a recipe, using <code class="command">kameleon</code>, just like the Grid'5000 supported reference environments. See the [[Environment creation]] tutorial for details.
== Disk partitioning ==
Environments are deployed on the (first) local disk of each node (each Grid'5000 node is equipped with at least one hard drive, HDD or SSD). The following partitioning scheme is used:
{|
|-
!style="background: lightgrey;"| Label
!style="background: lightgrey;"| Role
!style="background: lightgrey;"| Size
|-
| <code>SWAP</code>
| Linux swap
| 4 GB
|-
| <code>PROD</code>
| Standard environment (default environment installed on nodes)
| 31 GB
|-
| <code>DEPLOY</code>
| User's environment (when Kadeploy is used)
| 34 GB
|-
| <code>EFI</code>
| For UEFI systems: UEFI system partition (store boot loaders)
| 1 GB
|-
| <code>TMP</code>
| Made available in <code class="file">/tmp</code>
| Remaining disk space
|}
; Notes
* Some clusters have more than one local disk. On some clusters, those disks can be reserved apart from the node itself. See [[Disk reservation]] to find out how to use them.
* Kadeploy allows deployments on other partitions than <code>DEPLOY</code> or even creating a custom partitioning scheme. See below for details.
{{Warning|text=If your image is too big to be deployed on the <code>DEPLOY</code> partition (that is, the uncompressed size of your image is bigger than 34 GB), you will need to [[Advanced_Kadeploy#Deploy_on_partition_labeled_PROD_or_TMP|deploy on the <code>TMP</code> partition]] or [[Advanced_Kadeploy#Use_a_custom_partitioning_scheme|to make a custom partitioning scheme]]. }}
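The size check mentioned in the warning can be approximated before deploying. The following is a minimal sketch, not a Grid'5000 tool: the function names are hypothetical, "34 GB" is assumed to mean decimal gigabytes, and filesystem overhead on the partition is ignored. Python's <code>tarfile</code> module detects gzip, bzip2 and xz transparently; zstd-compressed images (<code>.tar.zst</code>) would need to be decompressed first on Python versions without zstd support.

```python
import tarfile

# Assumption: "34 GB" is read as decimal gigabytes; adjust if needed.
DEPLOY_LIMIT_BYTES = 34 * 10**9


def uncompressed_size(archive_path):
    """Sum the sizes of all regular files stored in a tar archive."""
    with tarfile.open(archive_path, "r:*") as archive:
        return sum(member.size for member in archive if member.isfile())


def fits_on_deploy(archive_path, limit_bytes=DEPLOY_LIMIT_BYTES):
    """Return True when the uncompressed content stays within the limit."""
    return uncompressed_size(archive_path) <= limit_bytes
```

If <code>fits_on_deploy</code> returns <code>False</code>, the image is a candidate for the <code>TMP</code> partition or a custom partitioning scheme, as the warning explains.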
= Search and deploy an existing environment =
== Search an environment ==
Grid'5000 maintains several reference environments directly available for deployment in all sites. These environments are based on the Debian, Ubuntu, or Centos Linux distributions.
For Debian, different variants of reference environments are offered. For Ubuntu and Centos, only environments with a minimal system are offered.
They are called ''reference'' environments because they can be used to generate customized environments and because they are provided and supported by the Grid'5000 technical team.
The full list of reference environments is:
{{:Generated/Environments}}
Kadeploy provides a registry of environments in each site, where reference environments are registered along with environments of users.
For reference environments, associated filesystem images are stored in the <code class="dir">/grid5000</code> directory of the <code class="host">frontend</code>.
To deploy a registered environment, you must know its name as registered in the Kadeploy registry. This tutorial uses the <code class="env">debian11-base</code> environment.
You can also list all available environments in a site by using the <code class="command">kaenv3</code> command:
{{term|location=frontend|cmd=<code class="command">kaenv3</code> <code>-l</code>}}
This command lists all public as well as your private environments.
We distinguish three levels of visibility for an environment:
* ''public'': Only administrators can register public environments. They are shown and used by default by the <code class=command>kaenv3</code> and <code class=command>kadeploy3</code> commands unless a user is specified (with <code class=command>-u</code> <code class=replace>user</code>).
* ''shared'': Shared user environments. They are shown and used by the <code class=command>kaenv3</code> and <code class=command>kadeploy3</code> commands when a user is specified with <code class=command>-u</code> <code class=replace>user</code>.
* ''private'': The environment is only shown to and usable by the user the environment belongs to.
For example, a shared environment added by user <code class="replace">user</code> is listed this way:
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -l -u <code class="replace">user</code>}}
You can also look for a specific version with the <code class=command>--env-version</code> <code class=replace>version</code> option. Most versions of the reference environment image files are available in <code>/grid5000/images</code>. The version number is the last part of the image file name.
For instance: <code class=file>debian11-min-2021092316.tar.zst</code> is the image file of the <code class=replace>debian11-min</code> reference environment version <code class=replace>2021092316</code>.
Whenever you want to deploy this specific image on some nodes, use:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> <code class=replace>debian11-min</code> --env-version <code class=replace>2021092316</code>}}
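The naming rule above (version number as the last part of the image file name) can be turned into a small helper. This is only a sketch under an assumed pattern, <code>name-version.tar.suffix</code>; the function names are hypothetical, and the name part of a file is not guaranteed to match the name registered in Kadeploy, so check it with <code class="command">kaenv3</code> before relying on it.

```python
import re

# Assumed naming pattern: <name>-<version>.tar.<compression suffix>
IMAGE_NAME = re.compile(r"^(?P<name>.+)-(?P<version>\d+)\.tar\.\w+$")


def split_image_name(filename):
    """Return (name part, version) parsed from an image file name."""
    match = IMAGE_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected image file name: {filename!r}")
    return match.group("name"), match.group("version")


def deploy_command(env_name, version):
    """Build the kadeploy3 argument list for one environment version."""
    return ["kadeploy3", env_name, "--env-version", version]


name, version = split_image_name("debian11-min-2021092316.tar.zst")
# prints: kadeploy3 debian11-min --env-version 2021092316
print(" ".join(deploy_command(name, version)))
```

Returning the command as a list rather than a string keeps it ready for <code>subprocess.run</code> without shell quoting issues.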
Being able to reproduce a past experiment is a desirable feature. Therefore, you should always try to control as much as possible the environment the experiment is done in.
To that end, we will check that the environment that was chosen in the environment directory is the one available on a given cluster. On the cluster you would like to deploy, type the following command to print information about an environment:
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> <code class="env">debian11-base</code> <code>-u deploy</code>}}
Adding <code>-u deploy</code> makes sure we get the description of the environment supported by the Grid'5000 staff, as ''deploy'' is the user owning them (one might have registered their own environment with the same name, which would show up if no user is specified).
In theory, you should also check the post-install script. A post-install script adapts an environment to the site it is deployed on.
If everything seems ok, please proceed to the next step.
== Make a job on a deployable node ==
By default, Grid'5000 nodes are running on the ''production'' environment, which already contains most of the important features and can be used to run experiments. This environment however imposes choices and has some limitations due to its general-purpose target. While you can gain root privileges on it thanks to the <code class=command>sudo-g5k</code> command, you cannot, for instance, reboot, as this ends your reservation. Deploying, on the contrary, gives you a dedicated environment and full control over the machine (reboot, serial console, ...).
For this part of the tutorial, the job will be interactive (<code>-I</code>), of the deploy type (<code>-t deploy</code>), on only one machine (<code>-l nodes=1</code>) to do environment customization (we will give ourselves 3 hours with <code>-l walltime=3</code>). This gives us the following command, which will open a new shell session on the frontend node:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -I -t deploy -l nodes=1,walltime=3}}
Indeed, when you submit a job of the deploy type, a new shell is opened on the frontend node and not on the first machine of the job as for standard jobs. When you exit from this shell, the job ends. The shell is populated with <code class="env">OAR_*</code> environment variables. You should look at the list of available variables to get an idea of the information you can use to script deployment later. As usual, if the job is successful, you will get the name of the machine allocated to your job with:
{{Term|location=frontend|cmd=<code class="command">cat</code> <code class="env">$OAR_FILE_NODES</code>}}
{{Warning|text=At the end of a reservation with the <code>-t deploy</code> option, the reserved nodes will be rebooted on the standard environment to make them available for another job, possibly of another user. Rebooting may take up to 15 minutes, so please mind your commands as they have an implicit cost.}}
== Deploy a reference environment ==
By default, Kadeploy uses all the nodes of the reservation. It does so by using the file whose name is given by the <code class="env">$OAR_FILE_NODES</code> environment variable (or <code class="env">$OAR_NODE_FILE</code>, or <code class="env">$OAR_NODEFILE</code>).
So, to start the deployment on all the nodes, run the following command:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-base}}
You can also just provide the nodes to deploy on, with the -m option:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-base -m <code class="replace">node.site</code>.grid5000.fr}}
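When scripting deployments, the node list can be read from the file named by <code class="env">$OAR_FILE_NODES</code> and passed explicitly. The sketch below rests on two assumptions that are not stated on this page: that the node file may list the same node on several lines, and that <code>-m</code> may be repeated once per node. The function names are hypothetical.

```python
import os


def unique_nodes(node_file_path):
    """Return node names from a node file, in order, without duplicates."""
    with open(node_file_path) as node_file:
        names = [line.strip() for line in node_file if line.strip()]
    return list(dict.fromkeys(names))  # a dict keeps first-seen order


def kadeploy_command(env_name, nodes):
    """Build a kadeploy3 argument list, one -m per node (assumed repeatable)."""
    command = ["kadeploy3", env_name]
    for node in nodes:
        command += ["-m", node]
    return command


if __name__ == "__main__":
    # Only prints something inside an OAR job shell, where the variable is set.
    path = os.environ.get("OAR_FILE_NODES")
    if path:
        print(" ".join(kadeploy_command("debian11-base", unique_nodes(path))))
```

The script only prints the command, so it can be inspected before anything is actually deployed.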
By default, Kadeploy will copy your <code class=file>~/.ssh/authorized_keys</code> and replace the <code class=file>/root/.ssh/authorized_keys</code> file on the deployed nodes.
Alternatively, you can use the <code>-k</code> option in two ways:
* You can either specify the public key that will be copied in <code class=file>/root/.ssh/authorized_keys</code> on the deployed nodes:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-base -k ~/.ssh/my_special_key.pub}}
* Or you can choose not to copy any key on the deployed node with <code>-k none</code>. You will then need to provide a password to connect. However, SSH is often configured to disallow root login using password. The root password for all reference environments (i.e. provided by the Grid'5000 technical team) is <code>grid5000</code>.
In our case, the node file contains only 1 node.
Once the kadeploy command has executed successfully, the deployed node runs the <code>debian11-base</code> environment as its operating system. It will then be possible to tune this environment according to your needs.
== Connect to the deployed environment and customize it ==
;1. Connection
On reference environments managed by the Grid'5000 technical team, you can use the <code>root</code> account to log in with <code>ssh</code> (kadeploy checks that sshd is running before declaring a deployment successful). To connect to the node, type:
{{Term|location=frontend|cmd=<code class="command">ssh</code> root@<code class="replace">node.site</code>.grid5000.fr}}
In case this doesn't work, please take a look at the [[FAQ#Deployment related issues|kadeploy section]] of the [[FAQ|Sidebar > FAQ]]
;2. Adding software to an environment
You can alter your environment (to add missing libraries that you need, or to remove packages that you don't need in order to reduce the size of the image and speed up the deployment process, etc.) using commands such as:
{{Term|location=node|cmd=<code class="command">apt-get</code> update <br> <code class="command">apt-get</code> upgrade <br> <code class="command">apt-get</code> install <code class="replace">list of desired packages and libraries</code> <br> <code class="command">apt-get</code> --purge remove <code class="replace">list of unwanted packages</code> <br> <code class="command">apt-get</code> clean}}
== Create a new environment from a node's customized operating system ==
We now need to save this customized environment, where you have a user account, to be able to use this account again each time you deploy it.<br>
The first step to create an environment is to create an archive of the node's operating system you just customized. You can use <code class=command>tgz-g5k</code> to extract a Grid'5000 environment tarball from a running node. Its usage is described in the '''[[Environment creation]]''' tutorial page.
= Advanced deployment options =
== Multisite deployment ==
In order to achieve a deployment on nodes from different sites, you can use the multiserver mode of kadeploy, with the <code class="command">-M</code> option.
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> '''-M''' -f <code class="file">file_with_all_nodes</code> <code class="replace">debian11-big</code>}}
== Deploy on other partition of disk(s) ==
Kadeploy's default handling of partitions may be too limited for some usages. One may need to use disks differently. Kadeploy offers several options to deploy on another existing partition of the primary disk, or if required to repartition disks entirely and/or use several disks (on nodes with many disks).
=== Deploy on partition labeled PROD or TMP ===
As seen earlier, kadeploy labels partitions to identify them more easily, e.g. PROD, DEPLOY, or TMP. Kadeploy can be instructed to deploy on the PROD or TMP partition instead of the DEPLOY partition.
Because this kind of deployment will break some standard node operations after the end of your job, you must add the <code class="command">-t destructive</code> option to the <code class=command>oarsub</code> command that creates the job. This will cause the node to be completely reinstalled after your job.
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=1 -p hercule -I}}
Then you can deploy on PROD or TMP with the <code class="command">-p PROD</code> or <code class="command">-p TMP</code> option:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-nfs -p <code class="replace">PROD</code>}}
=== Deploy on secondary disks ===
Kadeploy identifies disks by their id (disk0, disk1...). You can find them in the Reference API or in the Hardware pages:
{{Site link|Hardware}}
Because this kind of deployment will break some standard node operations after the end of your job, you must add the <code class="command">-t destructive</code> option to the <code class=command>oarsub</code> command that creates the job. This will cause the node to be completely reinstalled after your job.
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=1 -p hercule -I}}
Then you can deploy on a secondary disk such as <code class="file">disk1</code> with the <code class="command">-b disk1</code> option:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-min -b <code class="replace">disk1</code>}}
{{Warning|text=This does not work for reservable disks}}
Disks can also be handled differently by modifying the deployment automata, see below.
== About the kernel and bootloader ==
As a reminder, a deployed environment may be booted either with '''kexec''' (e.g. at the end of the deployment) or with '''grub''' (e.g. after the deployment, calling <code class=command>kareboot</code>, or the <code class=command>reboot</code> command from the node itself, or whenever kexec is deactivated/not supported).
By default, the bootloader installation step of kadeploy will install and configure grub using the grub commands provided by your environment. The grub configuration file will be generated with <code class=command>grub-mkconfig</code>, which will boot by default the most recent kernel version available in the <code class=file>/boot</code> directory of your environment. If that version does not match the kernel version provided in the environment description, an error will be reported (''Kernel file mismatch between grub and the environment description''). This is a sanity check.
If the grub commands are not available in the deployed environment image, as a fallback the grub commands provided in the kadeploy MiniOS (system running while deploying) will be used for the installation. In that case, a very simple grub configuration will be generated (unlike when <code class=command>grub-mkconfig</code> is used) that just uses the ''kernel'', ''initrd'' and ''kernel params'' from your environment description.
Note that having grub installed and configured with the commands that are provided in the deployed environment is the preferred way because it will be isofunctional whenever the deployed system later has upgrades that impact the kernel or grub (or if the bootloader configuration command <code class=command>grub-mkconfig</code> or <code class=command>update-grub</code> is called).
Whenever relevant, by using the ''--custom-variable'' option of kadeploy or by adding custom variables to the environment description, you can change the behavior of the bootloader installation step as follows:
* <code class=replace>BOOTLOADER_SKIP=1</code>, do nothing (grub is neither installed nor configured)
* <code class=replace>BOOTLOADER_NO_CONFIG=1</code>, do not configure grub
* <code class=replace>BOOTLOADER_NO_INSTALL=1</code>, do not install grub
* <code class=replace>BOOTLOADER_NO_GRUB_FROM_DEST=1</code>, do not configure nor install using grub from your deployed env but from the kadeploy MiniOS (setting the 2 next variables is equivalent)
* <code class=replace>BOOTLOADER_NO_GRUB_MKCONFIG_FROM_DEST=1</code>, do not configure using grub mkconfig from your deployed environment, but just create a basic grub configuration file using the environment description information (kernel, initrd and commandline).
* <code class=replace>BOOTLOADER_NO_GRUB_INSTALL_FROM_DEST=1</code>, do not install using grub from your deployed env but from the kadeploy MiniOS
* <code class=replace>BOOTLOADER_NO_UUID=1</code>, do not use filesystem UUID to identify the root partition
* <code class=replace>BOOTLOADER_SHOW_MENU=1</code>, enable the grub menu on the console
For example, use
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> my_env --custom-variable <code class=replace>BOOTLOADER_NO_INSTALL=1</code>}} to not install grub.
Or likewise, it can be added to the environment description (yaml file):
 custom_variables:
   <code class=replace>BOOTLOADER_NO_INSTALL: '1'</code>
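Both ways of setting a bootloader variable can be generated from the same input, which helps avoid typos in the variable names. The helper below is a hypothetical sketch: the set of names is copied from the list above, and the functions only build text, they do not call kadeploy.

```python
# Variable names copied from the list in this section.
KNOWN_BOOTLOADER_VARIABLES = {
    "BOOTLOADER_SKIP",
    "BOOTLOADER_NO_CONFIG",
    "BOOTLOADER_NO_INSTALL",
    "BOOTLOADER_NO_GRUB_FROM_DEST",
    "BOOTLOADER_NO_GRUB_MKCONFIG_FROM_DEST",
    "BOOTLOADER_NO_GRUB_INSTALL_FROM_DEST",
    "BOOTLOADER_NO_UUID",
    "BOOTLOADER_SHOW_MENU",
}


def check_name(name):
    """Reject names that are not in the documented list."""
    if name not in KNOWN_BOOTLOADER_VARIABLES:
        raise ValueError(f"unknown bootloader variable: {name}")


def custom_variable_args(name, value="1"):
    """Return the command-line form: --custom-variable NAME=VALUE."""
    check_name(name)
    return ["--custom-variable", f"{name}={value}"]


def description_lines(name, value="1"):
    """Return the lines to add to the environment description (yaml)."""
    check_name(name)
    return ["custom_variables:", f"  {name}: '{value}'"]
```

For instance, <code>custom_variable_args("BOOTLOADER_NO_INSTALL")</code> gives the two arguments used in the example command above.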
= Customizing the postinstalls =
In Kadeploy3, postinstalls are scripts that are executed after the copy of the image file in order to customize site-specific or cluster-specific aspects.
Since the beginning of 2018, on Grid'5000 the same postinstall script (called g5k-postinstall) is used for all reference environments (and is thus compatible with all supported Debian versions and distributions). That script takes parameters in order to define its behaviour (for example, to choose the style of network configuration to use).
== Using g5k-postinstall ==
The source code for g5k-postinstall is [https://gitlab.inria.fr/grid5000/g5k-postinstall/tree/master/g5k-postinstall available on gitlab.inria.fr]. Its parameters at the time of writing are:
<syntaxhighlight lang="text">
Usage: g5k-postinstall [options]

Options:
    -d, --debug                      Run in debug mode, with output to terminal
    -v, --version                    Print g5k-postinstall version
    -n, --net n1,n2,n3               Network configuration specification
    -f, --fstab f1,f2,f3             Filesystems configuration specification
    -r, --restrict-user MODE         User restriction mode
        --inittab PATTERN            Configure console in inittab
        --bootloader b1,b2,b3        Set the kadeploy bootloader step setup options (deprecated)
        --no-ref-api                 Do not use the Reference API
        --disable-hacks h1,h2,h3     Hacks to disable
        --no-guix                    Disable guix
        --disk-aliases               Enable disk aliases (e.g. /dev/disk0p1)

Valid tags for network specification:
  debian                   write config in /etc/network/interfaces
  debian-bridged           write config in /etc/network/interfaces, with a bridge
                           setup (for the std env)
  netplan                  write config in /etc/netplan/01-netcfg.yaml
                           (https://wiki.ubuntu.com/Netplan)
  redhat                   write config in /etc/NetworkManager/system-connections/*
  traditional-names        use traditional NIC naming (e.g. eth0) instead of
                           predictable
  predictable_kernel_name  use predictable name determined by the kernel
  force-ref-api-names      force the use of the name provided in the reference API
                           (used by default)
  hpc                      add support for HPC (eg InfiniBand) interfaces
Example: --net debian-bridged,traditional-names,hpc

Valid tags for filesystems configuration in fstab:
  nfs        include generic NFS mounts (/home with autofs, /grid5000)
  no-autofs  do not use autofs, just mount the user's NFS directory
  no-uuid    do not use filesystem UUID to identify the partitions, use the
             block device filenames
  custom     include custom mounts for custom partitionning. Need fstab
             file included in postinstall archive.
Example: --fstab nfs

Valid modes for user restriction:
  std         if deployed on production partition, restrict to root,oar. else,
              restrict to the current user (see below)
  current     restrict to root and the user having currently reserved the node
  login:jdoe  restrict to a specific login (e.g. jdoe)
  none        no restriction (DEFAULT)

Inittab option:
  Needed for non-systemd systems.
  Example of pattern: s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100
  Where SPEED and TTYSX are replaced by g5k-postinstall using information
  retieved from the parameters which are passed in /proc/cmdline.

No reference API option:
  Do not use the Reference API. This is useful during initial configuration of
  new clusters.

Disable hacks option:
  g5k-postinstall includes hacks that can optionally be disabled.
  Current hacks are:
  - oot-i40e        install the i40e driver on chifflot and chiclet (lille) and
                    grappe (nancy) on debian9 environment.
  - force-net-name  on several clusters, the predictable network interface name
                    can change depending on udev version. This hack forces the
                    predictable network interface name in such cases.
  - beegfs-gr520    configure beegfs shares on grcinq and grvingt (nancy).
  Example: --disable-hacks oot-i40e,force-net-name

Disk aliases option:
  g5k-postinstall can optionally install udev rules to provide predictable disk aliases.
  Example of disk aliases automatically created by these udev rules:
    /dev/disk0 -> /dev/sdb
    /dev/disk0p2 -> /dev/sdb2
    /dev/disk1 -> /dev/nvme0n1
    /dev/disk1p4 -> /dev/nvme0n1p4
  This is useful because, starting from Linux 5.3, traditional block device names
  are non-deterministic: /dev/sda might not refer to the same disk on every boot.
</syntaxhighlight>
An example environment description using g5k-postinstall is:
<syntaxhighlight lang="yaml" highlight='14-17'>
---
name: debian11-min
version: 2021092316
arch: x86_64
description: debian 11 (bullseye) for x64 - min
author: pierre.neyron@imag.fr
visibility: private
destructive: false
os: linux
image:
  file: server:///grid5000/images/debian11-x64-min-2021092316.tar.zst
  kind: tar
  compression: zstd
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian --disk-aliases
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
  kernel_params: ''
filesystem: ext4
partition_type: 131
multipart: false
</syntaxhighlight>
Things that you can do from there:
* Use different parameters to change the behaviour of the postinstall. Example parameters for various situations are:
** Debian min environment with traditional NIC naming: <tt><code class=command>g5k-postinstall</code> --net debian --net traditional-names</tt>
** Debian min environment with predictable NIC naming: <tt><code class=command>g5k-postinstall</code> --net debian</tt>
** Debian min environment with predictable disk aliases: <tt><code class=command>g5k-postinstall</code> --net debian --disk-aliases</tt>
** Debian NFS environment (mount /home, setup LDAP, restrict login to user who reserved the node): <tt><code class=command>g5k-postinstall</code> --net debian --fstab nfs --restrict-user current</tt>
** Debian big environment (NFS + setup HPC networks and mount site-specific directories): <tt><code class=command>g5k-postinstall</code> --net debian --net traditional-names --net hpc --fstab nfs --fstab site-specific</tt>
** Use GPT label (e.g. /dev/disk/by-partlabel/KDPL_DEPLOY_disk0) instead of UUID to identify the root partition in the kernel command line: <tt><code class=command>g5k-postinstall</code> --bootloader no-uuid</tt>
** Use GPT label (e.g. /dev/disk/by-partlabel/KDPL_DEPLOY_disk0) instead of UUID to identify the partitions in the fstab: <tt><code class=command>g5k-postinstall</code> --fstab no-uuid</tt>
** RHEL/Centos style for network configuration: <tt><code class=command>g5k-postinstall</code> --net redhat --net traditional-names</tt>
** Ubuntu 1710 or later: NetPlan for network configuration: <tt><code class=command>g5k-postinstall</code> --net netplan</tt>
** Do not do any network configuration (useful for Gentoo), but force serial console settings: <tt><code class=command>g5k-postinstall</code> --inittab='s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100'</tt>
* Use a customized version of g5k-postinstall: after building a modified g5k-postinstall, just point the postinstalls/archive/ field to the new tar archive. See [https://gitlab.inria.fr/grid5000/g5k-postinstall/blob/master/g5k-postinstall/README.md README.md] and [[TechTeam:Postinstalls]] for details on g5k-postinstall internals.
* Add an additional postinstall to execute after g5k-postinstall. That way, the additional postinstall can be written in any language (e.g. just a shell script) and complete what is already done by g5k-postinstall. See below.
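The parameter combinations listed above can be assembled programmatically when several environment descriptions are generated. This is a hypothetical helper, not part of g5k-postinstall: it only builds the text of the <code>script</code> field, and it validates tags against the usage text shown earlier, so a tag that appears only in the examples (such as <code>site-specific</code>) would be rejected unless the sets are extended.

```python
# Tag and mode lists copied from the g5k-postinstall usage text.
NET_TAGS = {
    "debian", "debian-bridged", "netplan", "redhat", "traditional-names",
    "predictable_kernel_name", "force-ref-api-names", "hpc",
}
FSTAB_TAGS = {"nfs", "no-autofs", "no-uuid", "custom"}


def check_tags(option, tags, allowed):
    """Raise ValueError for any tag missing from the allowed set."""
    for tag in tags:
        if tag not in allowed:
            raise ValueError(f"unknown tag for {option}: {tag}")


def postinstall_script(net=(), fstab=(), restrict_user=None, disk_aliases=False):
    """Return the 'script' line, repeating --net/--fstab once per tag."""
    check_tags("--net", net, NET_TAGS)
    check_tags("--fstab", fstab, FSTAB_TAGS)
    parts = ["g5k-postinstall"]
    for tag in net:
        parts += ["--net", tag]
    for tag in fstab:
        parts += ["--fstab", tag]
    if restrict_user is not None:
        parts += ["--restrict-user", restrict_user]
    if disk_aliases:
        parts.append("--disk-aliases")
    return " ".join(parts)


# prints: g5k-postinstall --net debian --fstab nfs --restrict-user current
print(postinstall_script(net=["debian"], fstab=["nfs"], restrict_user="current"))
```

The repeated-option form follows the examples in the list; the usage text also documents a comma-separated form (<code>--net n1,n2,n3</code>).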
== Adding an extra postinstall ==
Modifying <code class="command">g5k-postinstall</code> may not be relevant for some purposes (or may be overkill), for instance if the extra actions are independent and can be executed after <code class="command">g5k-postinstall</code> has run. In such a case, it is very easy to provide an additional postinstall, which may for instance just be a shell script.
Assuming <code class="file">additional-postinstall.sh</code> is a script located at the root of the <code class="file">/home/jdoe/public/debiantesting-x64-additional-postinstall.tar.gz</code> archive, we just have to declare the additional postinstall in the environment description as follows:
<syntaxhighlight lang="yaml" highlight="22-24"> | |||
--- | --- | ||
author: John Doe | |||
boot: | |||
initrd: /initrd.img | |||
kernel: /vmlinuz | |||
description: debian testing with some customizations | |||
destructive: false | destructive: false | ||
filesystem: ext4 | |||
image: | |||
compression: gzip | |||
file: local:///home/jdoe/public/debiantesting-x64-custom.tar.gz | |||
kind: tar | |||
multipart: false | |||
name: debiantesting-custom | |||
arch: x86_64 | |||
os: linux | os: linux | ||
partition_type: 131 | |||
postinstalls: | |||
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz | |||
compression: gzip | compression: gzip | ||
script: g5k-postinstall --net debian --fstab nfs --restrict-user current | |||
- archive: local:///home/jdoe/public//debiantesting-x64-additional-postinstall.tar.gz | |||
compression: gzip | compression: gzip | ||
boot: | script: additional-postinstall.sh | ||
version: 2020071009 | |||
visibility: shared | |||
</syntaxhighlight> | |||
Both postinstalls will execute during the deployment, the one after the other. | |||
</ | |||
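The archive for such an additional postinstall can be built with a few shell commands. The following is only a minimal sketch: the working directory, the content of the script and the use of the <code>KADEPLOY_ENV_EXTRACTION_DIR</code> variable (assumed here to point to the mounted deployment partition; check the variable list printed by Kadeploy before relying on it) are illustrative assumptions, not something prescribed by this page.

<syntaxhighlight lang="bash">
#!/bin/sh
set -e

# Hypothetical working directory; any directory you own will do
workdir="$(mktemp -d)"

# A minimal additional postinstall. KADEPLOY_ENV_EXTRACTION_DIR is an assumption:
# it is expected to be the directory where the deployed system is mounted.
cat > "$workdir/additional-postinstall.sh" <<'EOF'
#!/bin/sh
set -e
echo "customized by additional-postinstall" > "$KADEPLOY_ENV_EXTRACTION_DIR/etc/motd"
EOF
chmod +x "$workdir/additional-postinstall.sh"

# -C makes the script sit at the root of the archive, as the description expects
archive="$workdir/debiantesting-x64-additional-postinstall.tar.gz"
tar -czf "$archive" -C "$workdir" additional-postinstall.sh

# List the archive content to check its layout
tar -tzf "$archive"
</syntaxhighlight>

The resulting archive can then be copied to the location referenced by the <code>archive</code> field of the environment description.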
= Tuning the Kadeploy3 deployment workflow = | |||
<code class="command">kadeploy3</code> lets you fully modify the deployment workflow.
First of all you have to understand the different steps of a deployment. There are 3 macro-steps:
# <code class="env">SetDeploymentMiniOS</code>: this step aims at setting up the deployment environment that contains all the required tools to perform a deployment;
# <code class="env">BroadcastEnv</code>: this step aims at broadcasting the new environment to the nodes and writing it to disk; | |||
# <code class="env">BootNewEnv</code>: this step aims at rebooting the nodes on their new environment. | |||
<code class="command">kadeploy3</code> provides several implementations for each of those 3 macro-steps. You can consult that list in the kadeploy3 page. | |||
In Grid'5000, we use the following steps by default in all our clusters: | |||
* <code class="env">SetDeploymentMiniOS</code> -> <code class=file>SetDeploymentMiniOSTrusted</code>: use kexec to boot to embedded deployment environment (this only works from the standard Grid'5000 environment, so only for the first deployment of a job. Otherwise a <code class=file>SetDeploymentMiniOSUntrusted</code>, classical reboot, will be performed) | |||
* <code class="env">BroadcastEnv</code> -> <code class=file>BroadcastEnvKascade</code>: use the Kascade tool to broadcast the environment | |||
* <code class="env">BootNewEnv</code> -> <code class=file>BootNewEnvKexec</code>: the nodes use kexec to reboot (if it fails, a <code class=file>BootNewEnvClassical</code>, classical reboot, will be performed) | |||
Each one of these implementations is divided into micro-steps. You can see the names of those micro-steps if you use the kadeploy3 option <code>--verbose-level 4</code>. To see what is actually executed during those micro-steps, you can add the debug option of kadeploy3, <code>-d</code>:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-base --verbose-level 4 -d > <code class=file>~/kadeploy3_steps</code>}}
This command will store the kadeploy3 standard output in the file <code class=file>~/kadeploy3_steps</code>. Let's analyse its content:
{{Term|location=frontend|cmd=<code class="command">grep</code> "Time in" <code class=file>~/kadeploy3_steps</code>}} | |||
This command will print on the terminal all the micro-steps executed during the deployment process, and the time spent for each execution. Here are the micro-steps that you should see in a typical deployment: | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">switch_pxe</code>: Configures the PXE server so that this node will boot on an environment that contains all the required tools to perform the deployment | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">send_deployment_kernel</code>: Send the files (kernel and initrd) of the Deployment Mini OS | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">set_default_vlan</code>: Sets the node on the default VLAN. If a "--vlan" option was used, the node will be placed on the specified VLAN in a later step.
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">kexec</code>: Reboots into the deployment environment using kexec
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">wait_reboot</code>: Waits for the node to restart. | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">startup_script</code>: Startup script of the Deployment Mini OS. | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">send_key_in_deploy_env</code>: Sends the kadeploy user's ssh public key into the node's authorized_keys to ease the following ssh connections.
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">create_partition_table</code>: Creates the partition table | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">format_deploy_part</code>: Format the partition where your environment will be installed. | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">mount_deploy_part</code>: Mounts the deployment partition in a local directory. | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">format_tmp_part</code>: Format the partition defined as tmp | |||
# <code class=file>SetDeploymentMiniOSTrusted</code>-<code class="replace">format_swap_part</code>: Format the swap partition | |||
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">send_environment</code>: Sends your environment to the node and untars it into the deployment partition.
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">manage_admin_post_install</code>: Execute post installation instructions defined by the site admins, in general to adapt to the specificities of the cluster: console baud rate, Infiniband,... | |||
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">manage_user_post_install</code>: Execute user defined post installation instructions to automatically configure its node depending on its cluster, site, network capabilities, disk capabilities,... | |||
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">send_key</code>: Sends the user public ssh key(s) to the node | |||
# <code class=file>BroadcastEnvKascade</code>-<code class="replace">install_bootloader</code>: Properly configures the bootloader | |||
# <code class=file>BootNewEnvKexec</code>-<code class="replace">switch_pxe</code>: Configure the PXE server so that this node will boot on the partition where your environment has been installed | |||
# <code class=file>BootNewEnvKexec</code>-<code class="replace">umount_deploy_part</code>: Unmounts the deployment partition from the directory where it was mounted during the <code class="replace">mount_deploy_part</code> micro-step of <code class=file>SetDeploymentMiniOSTrusted</code>.
# <code class=file>BootNewEnvKexec</code>-<code class="replace">mount_deploy_part</code>: ReMount the deployment partition | |||
# <code class=file>BootNewEnvKexec</code>-<code class="replace">kexec</code>: Perform a kexec reboot on the node | |||
# <code class=file>BootNewEnvKexec</code>-<code class="replace">set_vlan</code>: Properly configure the node's VLAN if the --vlan option was used | |||
# <code class=file>BootNewEnvKexec</code>-<code class="replace">wait_reboot</code>: Wait for the node to be up. | |||
That is it. You now know all the default micro-steps used to deploy your environments. | |||
== Environment boot: grub vs. kexec == | |||
On most clusters, kadeploy boots the user environment using <code class=command>kexec</code> straight away from the kadeploy MiniOS (operating system while deploying), after the image deployment and the postinstall steps. With <code class=command>kexec</code>, Linux serves as a bootloader for itself. This means that the node is not "cold" rebooted, thus the classical bootloader (grub) is not used in that "warm" reboot process. | |||
When the node is rebooted later, after the deployment, it will however use the classical bootloader (grub) rather than <code class=command>kexec</code> (unless <code class=command>kexec</code> is also configured to do so in the user environment).
You may look at the <code class=command>kadeploy3</code> command output to see if <code class=command>kexec</code> is indeed used at the end of the deployment. | |||
If you think the use of <code class=command>kexec</code> may introduce a bias and prefer a classical cold reboot at the end of the deployment, you can either change the kadeploy workflow (see below) or just use the <code class=command>--no-kexec</code> option of the <code class=command>kadeploy3</code> command. | |||
== Adjusting timeout for some environments == | |||
Since <code class="command">kadeploy3</code> provides multiple macro-steps and micro-steps, it is important to detect when a step fails. This error detection relies on a timeout for each step. When a timeout is reached, the nodes that have not completed the given step are discarded from the deployment process.<br>
The value of those timeouts varies from one cluster to another since they depend on the hardware configuration (network speed, hard disk speed, reboot speed, ...).
All default timeouts are set in the configuration files on the kadeploy3 server. You can consult the default timeouts of each macro-step by using the command <code class="command">kastat3</code>:
{{Term|location=frontend|cmd=<code class="command">kastat3</code> -I}} | |||
Kadeploy server configuration: | |||
Custom PXE boot method: PXElinux | |||
Automata configuration: | |||
hercule: | |||
SetDeploymentMiniOS: SetDeploymentMiniOSUntrusted,1,600 | |||
BroadcastEnv: BroadcastEnvKascade,0,1000 | |||
BootNewEnv: BootNewEnvKexec,0,180; BootNewEnvHardReboot,0,900 | |||
nova: | |||
SetDeploymentMiniOS: SetDeploymentMiniOSUntrusted,1,600 | |||
BroadcastEnv: BroadcastEnvKascade,0,1000 | |||
BootNewEnv: BootNewEnvKexec,0,150; BootNewEnvHardReboot,0,600 | |||
... | |||
<code class="command">kadeploy3</code> allows users to change timeouts on the command line. In some cases, when you try to deploy an environment with a large tarball or with a post-install that lasts too long, you may get discarded nodes. This false positive behavior can be avoided by manually modifying the timeouts for each step at deployment time.
For instance, in our previous example, the timeouts of the steps are:
* <code class=file>SetDeploymentMiniOSUntrusted</code>: 143
* <code class=file>BroadcastEnvKascade</code>: 111
* <code class=file>BootNewEnvKexec</code>: 33
You can increase the timeout of the second step to 1200 seconds with the following command: | |||
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> my_big_env --force-steps "SetDeploymentMiniOS|SetDeploymentMiniOSUntrusted:1:450&BroadcastEnv|BroadcastEnvKascade:1:1200&BootNewEnv|BootNewEnvClassical:1:400"}} | |||
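The value passed to <code>--force-steps</code> is easy to mistype, so it can help to assemble it with a small shell function. This is only a sketch: the function name <code>force_steps</code> is hypothetical, and the reading of each <code>Implementation:N:T</code> triple as "implementation, number of retries, timeout in seconds" is an assumption inferred from the example above. The script only prints the command, it does not run it.

<syntaxhighlight lang="bash">
#!/bin/sh
# force_steps SETUP BROADCAST BOOT
# Each argument has the form Implementation:N:T, as in the example above.
# Prints the string expected by --force-steps.
force_steps() {
    printf 'SetDeploymentMiniOS|%s&BroadcastEnv|%s&BootNewEnv|%s\n' "$1" "$2" "$3"
}

steps="$(force_steps \
    SetDeploymentMiniOSUntrusted:1:450 \
    BroadcastEnvKascade:1:1200 \
    BootNewEnvClassical:1:400)"

# Show the resulting command line instead of executing it
echo "kadeploy3 my_big_env --force-steps \"$steps\""
</syntaxhighlight>

Keeping the three triples as separate arguments makes it obvious which macro-step a timeout belongs to.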
== Set Break-Point during deployment ==
As mentioned in the section above, a deployment is a succession of micro-steps that can be consulted and modified.<br>
Moreover, <code class="command">kadeploy3</code> allows users to set a break-point during deployment.
; Examples:
* Breakpoint before the postinstall is run:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-base --verbose-level 4 -d --breakpoint <code class=file>BroadcastEnvKascade</code>:<code class="replace">manage_user_post_install</code>}} | |||
This command can be used for debugging purposes. It performs a deployment with the maximum verbose level and asks to stop the deployment workflow just '''''before''''' executing the ''manage_user_post_install'' micro-step of the ''BroadcastEnvKascade'' macro-step. You will thus be able to connect to the deployment environment and manually run the user post install script to debug it.
* Breakpoint before rebooting to the deployed environment: | |||
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-base --verbose-level 4 -d --breakpoint <code class=file>BootNewEnv</code>}} | |||
Stops before rebooting to the deployed env. Mind that rebooting the machine will boot again the kadeploy MiniOS, not the deployed system. | |||
{{Warning|text=At the current state of <code class="command">kadeploy3</code>, it is not possible to resume the deployment from the break-point step. Thus you will have to redeploy your environment from the first step. This feature may be implemented in a future version of <code class="command">kadeploy3</code>.}}
== Modify the deployment workflow with custom operations ==
In Kadeploy3, we can easily customize the deployment automaton. It is possible to add custom ''pre, post or substitute operations'' to each step. In a custom operation it is possible to ''send'' a file, ''execute'' a command or ''run'' a script.
This feature is explained in Kadeploy3's documentation (available on [https://kadeploy.gitlabpages.inria.fr/ Kadeploy3's website]) in the sections ''4.2.2, Use Case 10'' and ''4.7''.
This is illustrated in the following sub-sections.
{{Note|text=When running a custom script, Kadeploy will export different variables, you can get a list of them by running <code class="command">kadeploy3 -I</code>.<br>A description of each of these variables is available in Kadeploy3's documentation ([https://kadeploy.gitlabpages.inria.fr/ on Kadeploy3 website]) in the section ''4.4''}}
=== Format additional disks === | |||
In this example, we will add some custom operations to the deployment workflow: our nodes have two additional hard disks and we want them to be formatted during the deployment process.
We want a new partition scheme such as:
* classical grid5000 partitioning on disk0
* data1 ext4 on disk1p1
* data2 ext4 on disk2p1
{{Note|text=We do not use the legacy disk naming scheme here (sda, sdb, sdc) as new kernels do not guarantee the stability of disk naming across reboots}}
The three following sections describe how to perform such an operation.
;1. Make the reservation in destructive mode
First of all, when you do your reservation, you must tell OAR that it should redeploy the node entirely after the reservation with the <code class="command">-t destructive</code> parameter:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=2 -p hercule -I}}
;2. Describe the custom operations | |||
After that you have to create a file that describes the custom operations you want to be performed during the deployment.
In our example we will first re-partition the additional disks (using parted) and then format them (using the script format.sh).
* The operation description file (let's say '''custom-partitioning.yml''') should look something like this:
<syntaxhighlight lang="yaml">
---
# Our custom steps should be performed during the SetDeploymentMiniOS macro-step
SetDeploymentMiniOS:
  # Custom partitioning step, done after the create_partition_table micro-step
  create_partition_table:
    post-ops:
      # We send the partition files to the node
      - action: send
        file: disk1.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map_disk1
      - action: send
        file: disk2.parted
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map_disk2
      # Then we execute the parted command using the previously sent files
      - action: exec
        name: partitioning_disk1
        command: parted -a optimal -s $(jq -r '.storage_devices[]|select(.id=="disk1")|.by_path' /tmp/grid5000-ref-api.json) $(cat $KADEPLOY_TMP_DIR/disk1.parted)
      - action: exec
        name: partitioning_disk2
        command: parted -a optimal -s $(jq -r '.storage_devices[]|select(.id=="disk2")|.by_path' /tmp/grid5000-ref-api.json) $(cat $KADEPLOY_TMP_DIR/disk2.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run
        name: format_partitions
        file: format.sh
</syntaxhighlight>
* The file '''disk1.parted''' will look something like this:
<syntaxhighlight lang="bash">
mklabel msdos
u GB mkpart primary ext4 0% 100%
align-check optimal 1
</syntaxhighlight>
* The file '''disk2.parted''' will look something like this:
<syntaxhighlight lang="bash">
mklabel msdos
u GB mkpart primary ext4 0% 100%
align-check optimal 1
</syntaxhighlight>
* The file '''format.sh''' will look something like this:
<syntaxhighlight lang="bash">
set -e
# formatting disk1
mkfs -F -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q $(jq -r '.storage_devices[]|select(.id=="disk1")|.by_path' /tmp/grid5000-ref-api.json)-part1
# formatting disk2
mkfs -F -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q $(jq -r '.storage_devices[]|select(.id=="disk2")|.by_path' /tmp/grid5000-ref-api.json)-part1
</syntaxhighlight>
{{Note|text=The usage of jq in the above scripts or commands allows us to get the persistent device names of the disks corresponding to disk0, disk1, disk2, ... It is too early in the deployment process to use simpler names such as /dev/disk1p1 because they are created by the postinstall, which is called in a later step}}
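The jq lookup repeated in the commands above can be factored into small helpers. The following is a minimal sketch only: the function names <code>disk_by_path</code>, <code>partition_of</code> and <code>format_ext4</code>, as well as the <code>REF_API_JSON</code> override, are hypothetical and not part of Kadeploy. The file only defines functions, so nothing is formatted when it is loaded.

<syntaxhighlight lang="bash">
#!/bin/sh
# Node description file used in the examples above; the REF_API_JSON
# override is a hypothetical convenience for testing on another file.
REF_API_JSON="${REF_API_JSON:-/tmp/grid5000-ref-api.json}"

# Print the persistent by-path device name of a disk id (disk1, disk2, ...)
disk_by_path() {
    jq -r --arg id "$1" '.storage_devices[] | select(.id == $id) | .by_path' "$REF_API_JSON"
}

# Append the "-partN" suffix used for partitions of a by-path device
partition_of() {
    printf '%s-part%s\n' "$1" "$2"
}

# Format a partition with the same mkfs options as in format.sh
format_ext4() {
    mkfs -F -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q "$1"
}

# Usage (destructive, only meaningful in the deployment environment):
#   format_ext4 "$(partition_of "$(disk_by_path disk1)" 1)"
</syntaxhighlight>

Passing the disk id through <code>--arg</code> avoids nesting quotes inside the jq filter.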
;3. Run the deployment | |||
Now you can deploy your environment with this custom operation:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-min --custom-steps ./custom-partitioning.yml}}
{{Warning|text=In some cases (for instance if you want to use ext2 instead of ext4 in the example above, as formatting takes much longer with ext2 than with ext4) you should increase the step timeout. See [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}}
'''Note:''' Neither partition is mounted at boot. To mount them, you should do:
{{Term|location=NODE|cmd=<code class="command">mkdir</code> -p /media/data1}}
{{Term|location=NODE|cmd=<code class="command">mkdir</code> /media/data2}}
{{Term|location=NODE|cmd=<code class="command">mount</code> /dev/disk1p1 /media/data1}}
{{Term|location=NODE|cmd=<code class="command">mount</code> /dev/disk2p1 /media/data2}}
=== Use a custom partitioning scheme === | |||
==== Example 1: Deploy on the whole disk ==== | |||
In this example, we will modify the deployment workflow to deploy the system on a single disk partition ('/' on sda1).
;1. Make the reservation in destructive mode
As you will change the partitioning of the disk, you must tell OAR that it should redeploy the node entirely after the reservation with the <code class="command">-t destructive</code> parameter:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=2 -I}}
;2. Describe the custom operations
After that you have to create a file that describes the custom operations you want to be performed during the deployment.
In this example we will create our custom partitioning scheme and bypass some steps that are not necessary to deploy the system on a single partition.
* The operation description file (let's say '''custom-partitioning.yml''') should look something like this:
<syntaxhighlight lang="yaml">
---
# Our custom steps should be performed during the SetDeploymentMiniOS macro-step
SetDeploymentMiniOS:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
  # Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true
</syntaxhighlight>
* The file '''map.parted''', which will be passed to '''parted''', will look like this: | |||
<syntaxhighlight lang="bash"> | |||
mklabel gpt | |||
mkpart KDPL_SYSTEM_disk0 ext4 0% 100% | |||
toggle 1 boot | |||
align-check optimal 1 | |||
</syntaxhighlight> | |||
{{Note|text=Kadeploy identifies the partitions using their GPT label. The form of the label should be <code>KDPL_</code><code class="replace">name</code><code>_</code><code class="replace">disk</code>. So the label <code>KDPL_SYSTEM_disk1</code> will be used by a deployment such as {{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env.yml -b disk1 -p SYSTEM -k --custom-steps custom-partitioning.yml}} The default disk on Grid'5000 is <code>disk0</code> and the default partition is <code>DEPLOY</code>, so by default, kadeploy looks for the label <code>KDPL_DEPLOY_disk0</code>}}
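The naming rule described in the note can be captured in a couple of shell helpers, for instance to avoid typos when writing parted maps or fstab entries. This is a sketch under stated assumptions: the function names <code>kdpl_label</code> and <code>kdpl_partlabel_path</code> are hypothetical, and the defaults (<code>DEPLOY</code>, <code>disk0</code>) are the Grid'5000 defaults mentioned above.

<syntaxhighlight lang="bash">
#!/bin/sh
# kdpl_label [NAME] [DISK]
# Print the GPT label for a partition name and a disk,
# defaulting to the DEPLOY partition on disk0.
kdpl_label() {
    name="${1:-DEPLOY}"
    disk="${2:-disk0}"
    printf 'KDPL_%s_%s\n' "$name" "$disk"
}

# Print the udev symlink that corresponds to the label
kdpl_partlabel_path() {
    printf '/dev/disk/by-partlabel/%s\n' "$(kdpl_label "$@")"
}

kdpl_label SYSTEM disk1
kdpl_partlabel_path
</syntaxhighlight>

The first call prints the label matching <code>-b disk1 -p SYSTEM</code>; the second prints the path of the default deployment partition.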
;3. Customize the environment's postinstall | |||
In order for our new partitions to be mounted at boot time, we will modify the Grid'5000 postinstall files.
* Create and go in your public directory: | |||
{{Term|location=frontend|cmd=mkdir public/custom-postinstall && cd public/custom-postinstall}} | |||
* Then decompress the postinstall archive: | |||
{{Term|location=frontend|cmd=<code class="command">tar</code> xzf <code class="replace">/grid5000/postinstalls/g5k-postinstall.tgz</code>}} | |||
* Add your custom /etc/fstab file in this directory, named '''fstab''': | |||
<syntaxhighlight lang="bash"> | |||
PARTLABEL=KDPL_SYSTEM_disk0 / ext4 defaults 1 2 | |||
</syntaxhighlight> | |||
When you pass the "--fstab custom" option to the postinstall, it will copy this file to /etc/fstab.
* Regenerate the postinstall archive: | |||
{{Term|location=frontend|cmd=<code class="command">tar</code> -czvf <code class="replace">~/public/g5k-postinstall-custom.tgz</code> *}} | |||
* Make some cleanup: | |||
* Create the environment's description file (let's say '''custom-env.dsc''') based on the reference one: | |||
** use kaenv3 -p debian10-base to have an example of environment description. | |||
Your '''custom-env.dsc''' should look like this: | |||
<syntaxhighlight lang="yaml">
---
name: custom-env
version: 1
arch: x86_64
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2020012812.tgz
  kind: tar
  compression: gzip
postinstalls:
  - archive: http://public/~<login>/g5k-postinstall-custom.tgz
    compression: gzip
    script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
</syntaxhighlight>
;4. Run the deployment | |||
Finally, we deploy our custom environment with our custom operations:
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env.dsc -p SYSTEM --custom-steps custom-partitioning.yml}}
{{Note|text=In some cases you should increase the step timeout (for a long formatting operation, for example). See [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}}
==== Example 2: Deploy on multiple partitions ==== | |||
In this example, we will modify the deployment workflow: a different partition will be used for each of the ''/'', ''/var'', ''/opt'' and ''/tmp'' directories.
Imagine that you want to make your own partitioning scheme like this:
{| class="wikitable" | |||
|- | |||
! Mount point !! Partition !! Disk space !! File System | |||
|- | |||
| swap || SWAP || 2G || linux-swap | |||
|- | |||
| / || SYSTEM || 18G || ext4 | |||
|- | |||
| /var || VAR || 30G || ext4 | |||
|- | |||
| /opt || OPT || 20G || ext4 | |||
|- | |||
| /tmp || TMP || ''everything else'' || ext4 | |||
|} | |||
The four following sections describe how to perform such an operation. | |||
;1. Make the reservation in destructive mode | |||
First of all, when you do your reservation, you must tell OAR that it should redeploy the node entirely after the reservation with the <code class="command">-t destructive</code> parameter:
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=2 -I}}
;2. Describe the custom operations | |||
After that you have to create a file that describes the custom operations you want to be performed during the deployment.
In our example we will first apply our custom partitioning scheme, then format the partitions and mount them.
* The operation description file (let's say '''custom-partitioning.yml''') should look something like this:
<syntaxhighlight lang="yaml">
---
# Our custom steps should be performed during the SetDeploymentMiniOS macro-step
SetDeploymentMiniOSUntrusted:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run
        name: format_partitions
        file: format.sh
  # Custom mount step, done after the mount_deploy_part micro-step
  mount_deploy_part:
    post-ops:
      # We run the script contained in the file 'mount.sh'
      - action: run
        name: mount_partitions
        file: mount.sh
</syntaxhighlight>
{{Note|text=In order for Kadeploy to be able to perform the installation correctly, every partition has to be mounted before the installation process, which is done in the macro-step BroadcastEnv}}
* The file '''map.parted''' will look something like this:
<syntaxhighlight lang="bash">
mklabel gpt
u GB mkpart KDPL_SWAP_disk0 linux-swap 0% 2
u GB mkpart KDPL_SYSTEM_disk0 ext4 2 20
u GB mkpart KDPL_VAR_disk0 ext4 20 50
u GB mkpart KDPL_OPT_disk0 ext4 50 70
u GB mkpart KDPL_TMP_disk0 ext4 70 100%
toggle 2 boot
align-check optimal 1
align-check optimal 2
align-check optimal 3
align-check optimal 4
align-check optimal 5
</syntaxhighlight>
{{Note|text=Kadeploy identifies the partitions using their GPT labels. The label should have the form <code>KDPL_</code><code class="replace">name</code><code>_</code><code class="replace">disk</code>. So the label <code>KDPL_SYSTEM_disk1</code> will be used by a deployment run with {{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env.yml -b disk1 -p SYSTEM --custom-steps custom-partitioning.yml}} The default disk on Grid'5000 is <code>disk0</code> and the default partition is <code>DEPLOY</code>, so by default Kadeploy looks for the label <code>KDPL_DEPLOY_disk0</code>. A partition labeled <code>KDPL_SWAP_</code><code class="replace">disk</code> is also detected as the swap partition and will be formatted by Kadeploy.}} | |||
* The file '''format.sh''' will look something like this: | * The file '''format.sh''' will look something like this: | ||
< | <syntaxhighlight lang="bash"> | ||
#!/bin/sh | #!/bin/sh | ||
set -e | set -e | ||
mkfs_opts="sparse_super,filetype,resize_inode,dir_index" | mkfs_opts="sparse_super,filetype,resize_inode,dir_index" | ||
ext4_blocksize="4096" | |||
# / will be formatted by Kadeploy since we will specify the -p SYSTEM option | ||
# / will be formatted by Kadeploy since we will specify the -p | # formatting /var | ||
# formatting / | mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q /dev/disk/by-partlabel/KDPL_VAR_disk0 | ||
mkfs -t ext4 -b | # formatting /opt | ||
# formatting / | mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q /dev/disk/by-partlabel/KDPL_OPT_disk0 | ||
mkfs -t ext4 -b | # formatting /tmp | ||
# formatting / | mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q /dev/disk/by-partlabel/KDPL_TMP_disk0 | ||
mkfs -t ext4 -b | </syntaxhighlight> | ||
{{Note|text=When running a custom script, Kadeploy will export different variables, you can get a list of them by running "kadeploy -I".}} | |||
</ | |||
{{Note|text=Kadeploy will export different variables | |||
* The file '''mount.sh''' will look something like this: | * The file '''mount.sh''' will look something like this: | ||
< | <syntaxhighlight lang="bash"> | ||
#!/bin/sh | #!/bin/sh | ||
set -e | set -e | ||
# / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy | # / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy | ||
# mount /var | # mount /var | ||
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var | mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var | ||
mount | mount /dev/disk/by-partlabel/KDPL_VAR_disk0 ${KADEPLOY_ENV_EXTRACTION_DIR}/var/ | ||
# mount / | # mount /opt | ||
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/ | mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/opt | ||
mount | mount /dev/disk/by-partlabel/KDPL_OPT_disk0 ${KADEPLOY_ENV_EXTRACTION_DIR}/opt/ | ||
# mount /tmp | # mount /tmp | ||
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp | mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp | ||
mount | mount /dev/disk/by-partlabel/KDPL_TMP_disk0 ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp/ | ||
</ | </syntaxhighlight> | ||
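The label convention used by these three files can be sketched as two small shell helpers. The function names are hypothetical; only the <code>KDPL_name_disk</code> pattern and the <code>/dev/disk/by-partlabel/</code> paths come from the examples above.

```shell
#!/bin/sh
# Hypothetical helper: build the GPT label Kadeploy looks for,
# from a partition name ($1) and a disk id ($2).
kdpl_label() {
    printf 'KDPL_%s_%s\n' "$1" "$2"
}

# Hypothetical helper: path of the matching block device,
# in the form used by format.sh and mount.sh.
kdpl_device() {
    printf '/dev/disk/by-partlabel/%s\n' "$(kdpl_label "$1" "$2")"
}

kdpl_label SYSTEM disk1
kdpl_device VAR disk0
```

Deriving the paths from one place keeps map.parted, format.sh, mount.sh and the fstab consistent when a partition name or disk changes.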
;3. Customize the environment's postinstall | |||
In order for our new partitions to be mounted at boot time we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation). | In order for our new partitions to be mounted at boot time we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation). | ||
* | * Create and enter a temporary directory: | ||
{{Term|location=frontend|cmd=<code class="command"> | {{Term|location=frontend|cmd=<code class="command">tmpdir=</code>$(mktemp -d) && <code class="command">export</code> tmpdir && <code class="command">pushd</code> $tmpdir}} | ||
* Then decompress the postinstall archive: | |||
{{Term|location=frontend|cmd=<code class="command">tar</code> xzf <code class="replace">/grid5000/postinstalls/g5k-postinstall.tgz</code>}} | |||
{{Note|text=We assume that the current shell is Bash; if not, please replace the "export" instruction}} | |||
{{Term|location=frontend|cmd=<code class="command">tar</code> | * Add your custom /etc/fstab file in this temporary directory, named '''fstab''': | ||
<syntaxhighlight lang="bash"> | |||
PARTLABEL=KDPL_SWAP_disk0 none swap sw 0 0 | |||
{{ | PARTLABEL=KDPL_SYSTEM_disk0 / ext4 defaults 0 0 | ||
PARTLABEL=KDPL_VAR_disk0 /var ext4 defaults 1 2 | |||
PARTLABEL=KDPL_OPT_disk0 /opt ext4 defaults 1 2 | |||
PARTLABEL=KDPL_TMP_disk0 /tmp ext4 defaults 1 2 | |||
</syntaxhighlight> | |||
/ will be added by Kadeploy since we will specify the <code class="command">-p SYSTEM</code> option | |||
/ | |||
</ | |||
* Regenerate the postinstall archive: | * Regenerate the postinstall archive: | ||
{{Term|location=frontend|cmd=<code class="command">tar</code> - | {{Term|location=frontend|cmd=<code class="command">tar</code> -czvf <code class="replace">~/g5k-postinstall-custom.tgz</code> *}} | ||
* Clean up: | * Clean up: | ||
{{Term|location=frontend|cmd= | {{Term|location=frontend|cmd=<code class="command">popd</code> && <code class="command">rm</code> -R $tmpdir}} | ||
* Create the environment's description file (let's say '''custom-env.yml''') based on the reference one: | * Create the environment's description file (let's say '''custom-env.yml''') based on the reference one: | ||
{{Term|location=frontend|cmd=<code class="command">kaenv3</code> -p | {{Term|location=frontend|cmd=<code class="command">kaenv3</code> -p debian11-base -u deploy <nowiki>|</nowiki> sed -e "s/archive:.*$/archive: <code class="replace">\/home\/${USER}\/g5k-postinstall-custom.tgz</code>/" -e 's/public/shared/' > custom-env.yml}} | ||
and customize the '''custom-env.yml''' file to suit your needs (especially your archive path): | |||
< | |||
<syntaxhighlight lang="yaml"> | |||
--- | --- | ||
name: custom-env | name: custom-env | ||
version: 1 | version: 1 | ||
description: Custom env based on | description: Custom env based on Debian 11 | ||
author: me@domain.tld | author: me@domain.tld | ||
visibility: shared | visibility: shared | ||
destructive: | destructive: true | ||
os: linux | os: linux | ||
image: | image: | ||
file: server:///grid5000/images/debian11-base-2021092316.tar.zst | |||
kind: tar | kind: tar | ||
compression: zstd | |||
postinstalls: | |||
- archive: /home/me/g5k-postinstall-custom.tgz | |||
compression: gzip | compression: gzip | ||
script: g5k-postinstall --net debian --fstab custom | |||
boot: | |||
- | kernel: "/vmlinuz" | ||
initrd: "/initrd.img" | |||
boot: | |||
kernel: /vmlinuz | |||
initrd: /initrd.img | |||
filesystem: ext4 | filesystem: ext4 | ||
partition_type: 131 | partition_type: 131 | ||
</ | multipart: false | ||
</syntaxhighlight> | |||
{{Warning|text=Do not forget the <code class="command">--fstab custom</code> option to g5k-postinstall.}} | |||
;4. Run the deployment | |||
Finally, we deploy our custom environment with our custom operations: | Finally, we deploy our custom environment with our custom operations: | ||
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env. | {{Term|location=frontend|cmd=<code class="command">kadeploy3</code> -a custom-env.yml -p SYSTEM --custom-steps custom-partitioning.yml}} | ||
{{ | {{Note|text=In some cases you should increase the step timeout (for example, for a long formatting operation); see [[Advanced_Kadeploy#Adjusting timeout for some environments]] for details.}} | ||
=== Boot a custom kernel with kexec === | |||
= | In this example, we change some custom operations of the '''reboot''' workflow: our nodes will use kexec to start a custom kernel that we upload beforehand, instead of rebooting. This allows us to boot a custom kernel without deploying a whole system (just use the <code class=command>kareboot3</code> command, not the <code class=command>kadeploy3</code> one). | ||
;1. Make the reservation in destructive mode | |||
First of all | First of all, when we do our reservation, we have to tell OAR that it must redeploy the node entirely after our reservation. For this, we use the <code class="command">-t destructive</code> parameter: | ||
{{Term|location=frontend|cmd=<code class="command">oarsub</code> -t deploy -t destructive -l nodes=1,walltime=2 -p hercule -I}} | |||
;2. Describe the custom operations | |||
Next, we must create a file that describes custom operations to perform during the reboot step. | |||
In our example, we first send kernel and initrd files to the nodes, then run kexec using them. | |||
An operation description file (here '''custom-kexec-reboot.yml''') to perform those actions looks like this: | |||
<syntaxhighlight lang="yaml"> | |||
--- | |||
Simple: | |||
reboot: | |||
pre-ops: | |||
- action: send | |||
file: /home/me/custom_kernel.vmlinuz | |||
destination: $KADEPLOY_TMP_DIR | |||
name: send_custom_kernel | |||
- action: send | |||
file: /home/me/custom_kernel.initrd.img | |||
destination: $KADEPLOY_TMP_DIR | |||
name: send_custom_initrd | |||
substitute: | |||
- action: exec | |||
name: kexec | |||
# nohup to be sure the script exits correctly before the kexec starts | |||
command: kexec -l $KADEPLOY_TMP_DIR/custom_kernel.vmlinuz --initrd=$KADEPLOY_TMP_DIR/custom_kernel.initrd.img && nohup /bin/sh -c 'sleep 1; systemctl kexec' 1>/dev/null 2>/dev/null </dev/null & | |||
</syntaxhighlight> | |||
;3. Run the reboot | |||
Finally, we call <code class=command>kareboot3</code> with our custom operations: | |||
{{Term|location=frontend|cmd=<code class="command">kareboot3</code> simple --custom-steps custom-kexec-reboot.yml}} | |||
=== Skip the first reboot/kexec === | |||
If you are already on the deployment kernel (by using a breakpoint or rebooting with <code class="command">kareboot3 -r deploy_env</code>), you may want to skip the first reboot/kexec. This can be done by masking the reboot/kexec via custom operations. You need to be in a destructive job to be able to use custom operations (<code class="command">oarsub -t deploy -t destructive</code>). Write a file with your custom operations, e.g. '''custom-kexec-reboot.yml''': | |||
<syntaxhighlight lang="yaml"> | |||
--- | |||
SetDeploymentMiniOS: | |||
reboot: | |||
substitute: | |||
- action: exec | |||
name: nothing | |||
command: 'true' | |||
kexec: | |||
substitute: | |||
- action: exec | |||
name: nothing | |||
command: 'true' | |||
</syntaxhighlight> | |||
== | Start the deployment with our custom operations: | ||
{{Term|location=frontend|cmd=<code class="command">kadeploy3</code> debian11-min --custom-steps custom-kexec-reboot.yml }} | |||
= FAQ = | |||
== My environment does not work on all clusters == | |||
On some rare occasions, an environment may not work on a given cluster: | |||
# The kernel used does not support all hardware. You are advised to base your environment on one of the reference environments to avoid dealing with this or to carefully read the hardware section of each site to see the list of kernel drivers that need to be compiled in your environment for it to be able to boot on all clusters. Of course, when a new cluster is integrated, you might need to update your kernel for portability. | |||
# The post-installation scripts do not recognize your environment, and therefore network access, console access, or site-specific configurations are not taken into account. You can check the contents of the default post-installation scripts to see the variables set by kadeploy by looking at the environment's description using kaenv. | |||
== Kadeploy fails with ''Image file not found!'' == | |||
This means that <code>kadeploy</code> is not able to read your environment's main archive. This can have several causes, e.g.: | |||
* registered filename is wrong | |||
* extension is not right (for example <code>.tar.gz</code> does not work, whereas <code>.tgz</code> is OK) | |||
== Kadeploy is complaining about a node already involved in another deployment == | |||
The warning you see is: | |||
node <code class="replace">node</code> is already involved in another deployment | |||
This error occurs: | |||
* When 2 concurrent deployments are attempted on the same node. If you have 2 simultaneous deployments, make sure you have 2 distinct sets of nodes. | |||
* When there is a problem in the kadeploy database: typically when a deployment ended in a strange way, this can happen. The best is to wait for about 15 minutes and retry the deployment: kadeploy can correct its database automatically. | |||
== How do I exit from kaconsole on cluster X from site Y == | |||
You can try the '''&''' then '''.''' sequence (just like typing '''&.'''), but this may not work on all clusters. The [[Kaconsole#Escape_sequence_for_every_site|Kaconsole]] page may give you more information. | |||
== How to deploy the ''std'' environment in a deploy job == | |||
In some use cases, a user may want to deploy the ''std'' environment (e.g. <code class=file>debian11-std</code>), which normally runs on the nodes of jobs that are not of type deploy. | |||
However, the ''std'' environment is not listed in the output of <code class=command>kaenv3 -l</code> or the G5K API equivalent, nor known straight away by the <code class=command>kadeploy3</code> command. | |||
First of all, check whether deploying the ''std'' environment is really needed: users are encouraged to deploy the ''big'' environment instead (e.g. <code class=file>debian11-big</code>), which provides all features of the ''std'' environment except those needed only by non-deploy jobs (e.g. the OAR services). | |||
Then, if deploying the ''std'' environment is really necessary, one just has to use the <code class=command>-u deploy</code> option in the kadeploy commands. | |||
Note that it is also possible to specify an older version of the environment using the <code class=command>--env-version</code> option, for instance whenever the need is to reproduce some tests done in a non-deploy job that was running an old version of the ''std'' environment. |
Latest revision as of 16:04, 21 October 2024
Note: This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.
Warning: Please see the Environment creation guide, which gives automated mechanisms to build kadeploy environments.
What you need to know before starting
The first thing to understand is that by using kadeploy3, you will be running a command that attempts to remotely reboot possibly many nodes at the same time, and boot them over the network using configuration files hosted on a server.
What is an Environment?
Where we describe what exactly image, kernel, initrd and postinstall are.
An environment in kadeploy3 is a set of files describing a fully functional Operating System.
To be able to set up an Operating System, kadeploy3 needs at least 4 files in the most common cases:
- An image
  - An image is a file containing all the Operating System files. It is a compressed archive (compressed tarball).
- A kernel file
  - Specifies the kernel file to boot. The full file path in the target system is expected.
- An initrd file (optional)
  - For Linux systems, the initrd or initramfs file is the initial ramdisk used by the kernel as a preliminary system before the root filesystem is mounted. More information: Initrd on Wikipedia. The full file path in the target system is expected.
- A postinstall file (optional)
  - The postinstall file allows for adapting the deployed environment to specificities of each site and cluster, as well as setting up some platform-specific services.
Once you have this set of files, you can describe your environment to kadeploy3. This description represents an environment in the kadeploy3 sense.
Note: the kadeploy software also supports deploying environments made of binary disk images (dd format) but this is not supported on Grid'5000.
How can I make my own environment?
To create our own environment, there are two main ways:
- One way is to deploy an existing environment, customize it and save it with tgz-g5k, but that is not really a good reproducible way of working.
- The other way, which favors reproducibility, is to build the environment from a recipe, using kameleon, just like the Grid'5000 supported reference environments. See the Environment creation tutorial for details.
Disk partitioning
Environments are deployed on the (first) local disk of each node (each Grid'5000 node is equipped with at least one hard drive, HDD or SSD). The following partitioning scheme is used:
Label | Role | Size |
---|---|---|
SWAP | Linux swap | 4 GB |
PROD | Standard environment (default environment installed on nodes) | 31 GB |
DEPLOY | User's environment (when Kadeploy is used) | 34 GB |
EFI | For UEFI systems: UEFI system partition (store boot loaders) | 1 GB |
TMP | Made available in /tmp | Remaining disk space |
- Notes
  - Some clusters have more than one local disk. On some clusters, those disks can be reserved apart from the node itself. See Disk reservation to find out how to use them.
  - Kadeploy allows deployments on other partitions than DEPLOY or even creating a custom partitioning scheme. See below for details.
Warning: If your image is too big to be deployed on the
Search and deploy an existing environment
Search an environment
Grid'5000 maintains several reference environments directly available for deployment in all sites. These environments are based on the Debian, Ubuntu, or CentOS Linux distribution.
For Debian, different variants of reference environments are offered. For Ubuntu and CentOS, only environments with a minimal system are offered.
They are called reference environments because they can be used to generate customized environments and because they are provided and supported by the Grid'5000 technical team. The full list of reference environments is:
Last generated from the Grid'5000 API on 2024-11-20
Kadeploy provides a registry of environments in each site, where reference environments are registered along with environments of users.
For reference environments, associated filesystem images are stored in the /grid5000 directory of the frontend.
To deploy a registered environment, you must know its name as registered in the Kadeploy registry. This tutorial uses the debian11-base environment.
You can also list all available environments in a site by using the kaenv3 command:
This command lists all public as well as your private environments.
We distinguish three levels of visibility for an environment:
- public: Only administrators can register public environments. They are shown and used by default by the kaenv3 and kadeploy3 commands unless a user is specified (with -u user).
- shared: Shared user environments. They are shown and used by the kaenv3 and kadeploy3 commands when a user is specified with -u user.
- private: The environment is only shown and usable by the user the environment belongs to.
For example, a shared environment added by user user is listed this way:
You can also look for a specific version with the --env-version version option. Most versions of the reference environment image files are available in /grid5000/images. The version number is the last part of the image file name.
For instance, debian11-min-2021092316.tar.zst is the image file of the debian11-min reference environment version 2021092316.
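As a small illustration of this naming convention, the version can be recovered from an image file name with plain shell parameter expansion. This is only a sketch: the helper name image_version is made up for the example, and it assumes the file name ends in -<version>.tar.zst as in the case above.

```shell
#!/bin/sh
# Hypothetical helper: print the version encoded in an image file name.
# Assumes the name ends in -<version>.tar.zst, as in the example above.
image_version() {
    base=${1##*/}            # strip the directory part
    base=${base%.tar.zst}    # strip the archive extension
    printf '%s\n' "${base##*-}"   # keep what follows the last dash
}

image=/grid5000/images/debian11-min-2021092316.tar.zst
version=$(image_version "$image")
# Only print the command that would deploy this exact version.
echo "kadeploy3 debian11-min --env-version $version"
```

The command is echoed rather than run; it shows where the extracted version would go in a kadeploy3 call using the --env-version option.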
Whenever you want to deploy this specific image on some nodes, use:
Being able to reproduce a past experiment is a desirable feature. Therefore, you should always try to control as much as possible the environment the experiment is done in. To that end, we will check that the environment that was chosen in the environment directory is the one available on a given cluster. On the cluster you would like to deploy on, type the following command to print information about an environment:
Adding -u deploy makes sure we get the description of the environment supported by the Grid'5000 staff, as deploy is the user owning them (one might have registered their own environment with the same name, which would show up if no user is specified).
In theory, you should also check the post-install script. A post-install script adapts an environment to the site it is deployed on.
If everything seems ok, please proceed to the next step.
Make a job on a deployable node
By default, Grid'5000 nodes are running on the production environment, which already contains most of the important features and can be used to run experiments. This environment however imposes choices and has some limitations due to its general-purpose target. While you can gain root privileges on it thanks to the sudo-g5k command, you cannot reboot, for instance, as this ends your reservation. Deploying, on the contrary, gives you a dedicated environment and full control over the machine (reboot, serial console, ...).
For this part of the tutorial, jobs made will be interactive (-I), of the deploy type (-t deploy), on only one machine (-l nodes=1) to do environment customization (we will give ourselves 3 hours with -l walltime=3), which gives us the following command, that will open a new shell session on the frontend node:
frontend: oarsub -I -t deploy -l nodes=1,walltime=3
Indeed, when you submit a job of the deploy type, a new shell is opened on the frontend node and not on the first machine of the job as for standard jobs. When you exit from this shell, the job ends. The shell is populated with OAR_* environment variables. You should look at the list of available variables to get an idea of the information you can use to script deployment later. As usual, if the job is successful, you will get the name of the machine allocated to your job with:
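A minimal sketch of one way to do this, assuming the node file may list the same machine several times (typically once per reserved core); the helper name list_nodes is illustrative:

```shell
#!/bin/sh
# Illustrative helper: print each node of a node file once.
# Assumption: a node may appear on several lines of the file.
list_nodes() {
    sort -u "$1"
}

# Inside a job, OAR sets $OAR_NODE_FILE; only list nodes when it is readable.
if [ -n "${OAR_NODE_FILE:-}" ] && [ -r "$OAR_NODE_FILE" ]; then
    list_nodes "$OAR_NODE_FILE"
fi
```

Outside of a job the script prints nothing, which makes it safe to keep in a deployment script that is also tested elsewhere.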
Deploy a reference environment
By default, Kadeploy uses all the nodes of the reservation. It does so by reading the file whose name is given by the $OAR_FILE_NODES environment variable (or $OAR_NODE_FILE, or $OAR_NODEFILE).
So, to start the deployment on all the nodes, run the following command:
You can also just provide the nodes to deploy on, with the -m option:
By default, Kadeploy will copy your ~/.ssh/authorized_keys and replace the /root/.ssh/authorized_keys file on the deployed nodes.
Alternatively, you can use the -k option in two ways:
- You can either specify the public key that will be copied in /root/.ssh/authorized_keys on the deployed nodes:
- Or choose not to copy any key on the deployed node with -k none. You will need to provide a password to connect. However, SSH is often configured to disallow root login using password. The root password for all reference environments (i.e. provided by the Grid'5000 technical team) is grid5000.
In our case, the node file contains only 1 node.
Once the kadeploy command has executed successfully, the deployed node runs the debian11-base environment as its operating system. It will then be possible to tune this environment according to your needs.
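The variants described in this section can be summarized by a small helper that assembles the command line. It is a sketch only: deploy_cmd is a hypothetical name, the node name is a placeholder, and the helper prints the command instead of running it, using just the -m and -k options described above.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a kadeploy3 command line.
#   $1: registered environment name, e.g. debian11-base
#   $2: optional node to deploy on (passed with -m)
#   $3: optional public key file, or "none" (passed with -k)
deploy_cmd() {
    cmd="kadeploy3 $1"
    if [ -n "${2:-}" ]; then cmd="$cmd -m $2"; fi
    if [ -n "${3:-}" ]; then cmd="$cmd -k $3"; fi
    printf '%s\n' "$cmd"
}

# All nodes of the reservation, default key handling:
deploy_cmd debian11-base
# A single node (placeholder name), without copying any key:
deploy_cmd debian11-base node-1.site.grid5000.fr none
```

Printing the command first is a cheap way to review it before launching a deployment that reboots the nodes.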
Connect to the deployed environment and customize it
- 1. Connection
On reference environments managed by the Grid'5000 technical team, you can use the root account to log in with ssh (kadeploy checks that sshd is running before declaring a deployment successful). To connect to the node type:
In case this doesn't work, please take a look at the kadeploy section of the Sidebar > FAQ
- 2. Adding software to an environment
You can alter your environment (to add missing libraries that you need, or to remove packages that you don't need in order to reduce the size of the image and speed up the deployment process, etc.) using commands such as:
node:
apt-get update
apt-get upgrade
apt-get install list of desired packages and libraries
apt-get --purge remove list of unwanted packages
apt-get clean
Create a new environment from a node's customized operating system
We now need to save this customized environment, where you have a user account, to be able to use this account again each time you deploy it.
The first step to create an environment is to create an archive of the node's operating system you just customized. You can use tgz-g5k to extract a Grid'5000 environment tarball from a running node. Its usage is described in the Environment creation tutorial page.
Advanced deployment options
Multisite deployment
In order to achieve a deployment on nodes from different sites, you can use the multiserver option of kadeploy, using the -M option.
Deploy on other partition of disk(s)
Kadeploy default handling of partitions may be too limited for some usages. One may need to use disks differently. Kadeploy offers several options to deploy on another existing partition of the primary disk, or if required to repartition disks entirely and/or use several disks (on nodes with many disks).
Deploy on partition labeled PROD or TMP
As seen earlier, Kadeploy labels partitions to identify them more easily, e.g. PROD, DEPLOY, or TMP. Kadeploy can be instructed to deploy on the PROD or TMP partition instead of the DEPLOY partition.
Because this kind of deployment will break some node standard operations after the end of your job, you must add the -t destructive option to the oarsub command used to create the job. This will cause the node to be completely reinstalled after your job.
Then you can deploy on PROD or TMP with the -p PROD or -p TMP option:
Deploy on secondary disks
Kadeploy identifies disks by their id (disk0, disk1...). You can find them in the Reference API or in the Hardware pages:
Because this kind of deployment will break some node standard operations after the end of your job, you must add the -t destructive option to the oarsub command used to create the job. This will cause the node to be completely reinstalled after your job.
Then you can deploy on a secondary disk such as disk1 with the -b disk1 option:
Disks can also be handled differently by modifying the deployment automata, see below.
About the kernel and bootloader
As a reminder, a deployed environment may be booted either with kexec (e.g. at the end of the deployment) or with grub (e.g. after the deployment, calling kareboot, or the reboot command from the node itself, or whenever kexec is deactivated/not supported).
By default, the bootloader installation step of kadeploy will install and configure grub using the grub commands provided by your environment. The grub configuration file will be generated with grub-mkconfig, which will boot by default the most recent kernel version available in the /boot directory of your environment. If that version does not match the kernel version provided in the environment description, an error will be reported (Kernel file mismatch between grub and the environment description). This is a sanity check.
If the grub commands are not available in the deployed environment image, as a fallback the grub commands provided in the kadeploy MiniOS (system running while deploying) will be used for the installation. In that case, a very simple grub configuration will be generated (unlike when grub-mkconfig is used) that just uses the kernel, initrd and kernel params from your environment description.
Note that having grub installed and configured with the commands that are provided in the deployed environment is the preferred way because it will be isofunctional whenever the deployed system later has upgrades that impact the kernel or grub (or if the bootloader configuration command grub-mkconfig or update-grub is called).
Whenever relevant, by using the --custom-variable option of kadeploy or by adding custom variables to the environment description, you can trigger a change of the behavior of the bootloader installation step, as such:
- BOOTLOADER_SKIP=1, do nothing (grub not installed nor configured)
- BOOTLOADER_NO_CONFIG=1, do not configure grub
- BOOTLOADER_NO_INSTALL=1, do not install grub
- BOOTLOADER_NO_GRUB_FROM_DEST=1, do not configure nor install using grub from your deployed env but from the kadeploy MiniOS (setting the 2 next variables is equivalent)
- BOOTLOADER_NO_GRUB_MKCONFIG_FROM_DEST=1, do not configure using grub mkconfig from your deployed environment, but just create a basic grub configuration file using the environment description information (kernel, initrd and commandline).
- BOOTLOADER_NO_GRUB_INSTALL_FROM_DEST=1, do not install using grub from your deployed env but from the kadeploy MiniOS
- BOOTLOADER_NO_UUID=1, do not use filesystem UUID to identify the root partition
- BOOTLOADER_SHOW_MENU=1, enable the grub menu on the console
For example, use
to not install grub.
Or likewise, it can be added to the environment description (yaml file):
custom_variables:
  BOOTLOADER_NO_INSTALL: '1'
Customizing the postinstalls
In Kadeploy3, postinstalls are scripts that are executed after the copy of the image file in order to customize site-specific or cluster-specific aspects. Since the beginning of 2018, on Grid'5000 the same postinstall script (called g5k-postinstall) is used for all reference environments (and is thus compatible with all supported Debian versions and distributions). That script takes parameters in order to define its behaviour (for example, to choose the style of network configuration to use).
Using g5k-postinstall
The source code for g5k-postinstall is available on gitlab.inria.fr. Its parameters at the time of writing are:
Usage: g5k-postinstall [options]
Options:
-d, --debug Run in debug mode, with output to terminal
-v, --version Print g5k-postinstall version
-n, --net n1,n2,n3 Network configuration specification
-f, --fstab f1,f2,f3 Filesystems configuration specification
-r, --restrict-user MODE User restriction mode
--inittab PATTERN Configure console in inittab
--bootloader b1,b2,b3 Set the kadeploy bootloader step setup options (deprecated)
--no-ref-api Do not use the Reference API
--disable-hacks h1,h2,h3 Hacks to disable
--no-guix Disable guix
--disk-aliases Enable disk aliases (e.g. /dev/disk0p1)
Valid tags for network specification:
debian write config in /etc/network/interfaces
debian-bridged write config in /etc/network/interfaces, with a bridge
setup (for the std env)
netplan write config in /etc/netplan/01-netcfg.yaml
(https://wiki.ubuntu.com/Netplan)
redhat write config in /etc/NetworkManager/system-connections/*
traditional-names use traditional NIC naming (e.g. eth0) instead of
predictable
predictable_kernel_name use predictable name determined by the kernel
force-ref-api-names force the use of the name provided in the reference API
(used by default)
hpc add support for HPC (eg InfiniBand) interfaces
Example: --net debian-bridged,traditional-names,hpc
Valid tags for filesystems configuration in fstab:
nfs include generic NFS mounts (/home with autofs, /grid5000)
no-autofs do not use autofs, just mount the user's NFS directory
no-uuid do not use filesystem UUID to identify the partitions, use the
block device filenames
custom include custom mounts for custom partitionning. Need fstab
file included in postinstall archive.
Example: --fstab nfs
Valid modes for user restriction:
std if deployed on production partition, restrict to root,oar. else,
restrict to the current user (see below)
current restrict to root and the user having currently reserved the node
login:jdoe restrict to a specific login (e.g. jdoe)
none no restriction (DEFAULT)
Inittab option:
Needed for non-systemd systems.
Example of pattern: s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100
Where SPEED and TTYSX are replaced by g5k-postinstall using information
retieved from the parameters which are passed in /proc/cmdline.
No reference API option:
Do not use the Reference API. This is useful during initial configuration of
new clusters.
Disable hacks option:
g5k-postinstall includes hacks that can optionally be disabled.
Current hacks are:
- oot-i40e install the i40e driver on chifflot and chiclet (lille) and
grappe (nancy) on debian9 environment.
- force-net-name on several clusters, the predictable network interface name
can change depending on udev version. This hack forces the
predictable network interface name in such cases.
- beegfs-gr520 configure beegfs shares on grcinq and grvingt (nancy).
Example: --disable-hacks oot-i40e,force-net-name
Disk aliases option:
g5k-postinstall can optionally install udev rules to provide predictable disk aliases.
Example of disk aliases automatically created by these udev rules:
/dev/disk0 -> /dev/sdb
/dev/disk0p2 -> /dev/sdb2
/dev/disk1 -> /dev/nvme0n1
/dev/disk1p4 -> /dev/nvme0n1p4
This is useful because, starting from Linux 5.3, traditional block device names
are non-deterministic: /dev/sda might not refer to the same disk on every boot.
An example environment description using g5k-postinstall is:
---
name: debian11-min
version: 2021092316
arch: x86_64
description: debian 11 (bullseye) for x64 - min
author: pierre.neyron@imag.fr
visibility: private
destructive: false
os: linux
image:
  file: server:///grid5000/images/debian11-x64-min-2021092316.tar.zst
  kind: tar
  compression: zstd
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian --disk-aliases
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
  kernel_params: ''
filesystem: ext4
partition_type: 131
multipart: false
Things that you can do from there:
- Use different parameters to change the behaviour of the postinstall. Example parameters for various situations are:
  - Debian min environment with traditional NIC naming: g5k-postinstall --net debian --net traditional-names
  - Debian min environment with predictable NIC naming: g5k-postinstall --net debian
  - Debian min environment with predictable disk aliases: g5k-postinstall --net debian --disk-aliases
  - Debian NFS environment (mount /home, setup LDAP, restrict login to user who reserved the node): g5k-postinstall --net debian --fstab nfs --restrict-user current
  - Debian big environment (NFS + setup HPC networks and mount site-specific directories): g5k-postinstall --net debian --net traditional-names --net hpc --fstab nfs --fstab site-specific
  - Use GPT label (e.g. /dev/disk/by-partlabel/KDPL_DEPLOY_disk0) instead of UUID to identify the root partition in the kernel command line: g5k-postinstall --bootloader no-uuid
  - Use GPT label (e.g. /dev/disk/by-partlabel/KDPL_DEPLOY_disk0) instead of UUID to identify the partitions in the fstab: g5k-postinstall --fstab no-uuid
  - RHEL/Centos style for network configuration: g5k-postinstall --net redhat --net traditional-names
  - Ubuntu 1710 or later, NetPlan for network configuration: g5k-postinstall --net netplan
  - Do not do any network configuration (useful for Gentoo), but force serial console settings: g5k-postinstall --inittab='s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100'
- Use a customized version of g5k-postinstall: after building a modified g5k-postinstall, just point the postinstalls/archive/ field to the new tar archive. See README.md and TechTeam:Postinstalls for details on g5k-postinstall internals.
- Add an additional postinstall to execute after g5k-postinstall. That way, the additional postinstall can be written in any language (e.g. just a shell script) and complement what g5k-postinstall already does. See below.
Adding an extra postinstall
Modifying g5k-postinstall may be irrelevant or overkill for some purposes, for instance when the extra actions are independent and can be executed after g5k-postinstall has run. In such a case, it is easy to provide an additional postinstall, which may be as simple as a shell script.
Assuming additional-postinstall.sh is a script located at the root of the /home/jdoe/public/debiantesting-x64-additional-postinstall.tar.gz archive, we just have to declare the additional postinstall in the environment description as follows:
---
author: John Doe
boot:
  initrd: /initrd.img
  kernel: /vmlinuz
description: debian testing with some customizations
destructive: false
filesystem: ext4
image:
  compression: gzip
  file: local:///home/jdoe/public/debiantesting-x64-custom.tar.gz
  kind: tar
multipart: false
name: debiantesting-custom
arch: x86_64
os: linux
partition_type: 131
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab nfs --restrict-user current
- archive: local:///home/jdoe/public/debiantesting-x64-additional-postinstall.tar.gz
  compression: gzip
  script: additional-postinstall.sh
version: 2020071009
visibility: shared
Both postinstalls will execute during the deployment, one after the other.
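Building the additional postinstall archive can be sketched as follows. This is only a minimal sketch: the scratch directory created with mktemp and the one-line script body are placeholders chosen for illustration, and only the file names come from the example above. What matters is that the script ends up at the root of the archive.

```shell
#!/bin/sh
set -e

# Scratch directory for the illustration (assumption: any writable directory works).
workdir=$(mktemp -d)

# Placeholder script body: replace it with your own post-installation actions.
cat > "$workdir/additional-postinstall.sh" <<'EOF'
#!/bin/sh
echo "additional postinstall running"
EOF
chmod +x "$workdir/additional-postinstall.sh"

# -C makes tar store the script at the root of the archive, without a leading directory.
archive="$workdir/debiantesting-x64-additional-postinstall.tar.gz"
tar -czf "$archive" -C "$workdir" additional-postinstall.sh

# List the archive content to check the layout before referencing it in the description.
tar -tzf "$archive"
```

The resulting file can then be copied to the location referenced by the archive field of the environment description.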
Tuning the Kadeploy3 deployment workflow
kadeploy3 allows you to fully modify the deployment workflow.
First of all, you have to understand the different steps of a deployment. There are 3 macro-steps:
- SetDeploymentMiniOS: this step sets up the deployment environment, which contains all the tools required to perform a deployment;
- BroadcastEnv: this step broadcasts the new environment to the nodes and writes it to disk;
- BootNewEnv: this step reboots the nodes on their new environment.
kadeploy3 provides several implementations for each of those 3 macro-steps. You can consult that list in the kadeploy3 page.
In Grid'5000, we use the following steps by default in all our clusters:
- SetDeploymentMiniOS -> SetDeploymentMiniOSTrusted: use kexec to boot the embedded deployment environment (this only works from the standard Grid'5000 environment, so only for the first deployment of a job; otherwise SetDeploymentMiniOSUntrusted, a classical reboot, is performed)
- BroadcastEnv -> BroadcastEnvKascade: use the Kascade tool to broadcast the environment
- BootNewEnv -> BootNewEnvKexec: the nodes use kexec to reboot (if it fails, BootNewEnvClassical, a classical reboot, is performed)
Each of these implementations is divided into micro-steps. You can see the names of those micro-steps by using the kadeploy3 option --verbose-level 4. To see what is actually executed during those micro-steps, add the kadeploy3 debug option -d.
This command will store the kadeploy3 standard output in the file ~/kadeploy3_steps. Let's analyse its content:
This command will print on the terminal all the micro-steps executed during the deployment process, and the time spent for each execution. Here are the micro-steps that you should see in a typical deployment:
- SetDeploymentMiniOSTrusted-switch_pxe: Configures the PXE server so that this node will boot on an environment that contains all the tools required to perform the deployment.
- SetDeploymentMiniOSTrusted-send_deployment_kernel: Sends the files (kernel and initrd) of the Deployment Mini OS.
- SetDeploymentMiniOSTrusted-set_default_vlan: Sets the node on the default VLAN. If a "--vlan" option was used, the node will be placed on the specified VLAN in a later step.
- SetDeploymentMiniOSTrusted-kexec: Reboots the environment using kexec.
- SetDeploymentMiniOSTrusted-wait_reboot: Waits for the node to restart.
- SetDeploymentMiniOSTrusted-startup_script: Startup script of the Deployment Mini OS.
- SetDeploymentMiniOSTrusted-send_key_in_deploy_env: Sends kadeploy's user's ssh public key into the node's authorized_keys to ease the following ssh connections.
- SetDeploymentMiniOSTrusted-create_partition_table: Creates the partition table.
- SetDeploymentMiniOSTrusted-format_deploy_part: Formats the partition where your environment will be installed.
- SetDeploymentMiniOSTrusted-mount_deploy_part: Mounts the deployment partition in a local directory.
- SetDeploymentMiniOSTrusted-format_tmp_part: Formats the partition defined as tmp.
- SetDeploymentMiniOSTrusted-format_swap_part: Formats the swap partition.
- BroadcastEnvKascade-send_environment: Sends your environment to the node and untars it into the deployment partition.
- BroadcastEnvKascade-manage_admin_post_install: Executes post installation instructions defined by the site admins, in general to adapt to the specificities of the cluster: console baud rate, Infiniband,...
- BroadcastEnvKascade-manage_user_post_install: Executes user defined post installation instructions to automatically configure the node depending on its cluster, site, network capabilities, disk capabilities,...
- BroadcastEnvKascade-send_key: Sends the user public ssh key(s) to the node.
- BroadcastEnvKascade-install_bootloader: Properly configures the bootloader.
- BootNewEnvKexec-switch_pxe: Configures the PXE server so that this node will boot on the partition where your environment has been installed.
- BootNewEnvKexec-umount_deploy_part: Unmounts the deployment partition from the directory where it was mounted during the mount_deploy_part micro-step.
- BootNewEnvKexec-mount_deploy_part: Remounts the deployment partition.
- BootNewEnvKexec-kexec: Performs a kexec reboot on the node.
- BootNewEnvKexec-set_vlan: Properly configures the node's VLAN if the --vlan option was used.
- BootNewEnvKexec-wait_reboot: Waits for the node to be up.
That is it. You now know all the default micro-steps used to deploy your environments.
Environment boot: grub vs. kexec
On most clusters, kadeploy boots the user environment using kexec straight from the kadeploy MiniOS (the operating system that runs while deploying), after the image deployment and the postinstall steps. With kexec, Linux serves as a bootloader for itself. This means that the node is not "cold" rebooted, so the classical bootloader (grub) is not used in that "warm" reboot process.
Whenever the node is rebooted after the deployment, however, it will use the classical bootloader (grub) rather than kexec (unless the user environment is also configured to reboot with kexec).
You may look at the kadeploy3 command output to see whether kexec is indeed used at the end of the deployment.
If you think the use of kexec may introduce a bias and prefer a classical cold reboot at the end of the deployment, you can either change the kadeploy workflow (see below) or just use the --no-kexec option of the kadeploy3 command.
Adjusting timeout for some environments
Since kadeploy3 provides multiple macro-steps and micro-steps, it is important to detect when a step fails. This error detection relies on a timeout for each step. When a timeout is reached, the nodes that have not completed the given step are discarded from the deployment process.
The value of those timeouts varies from one cluster to another since they depend on the hardware configuration (network speed, hard disk speed, reboot speed, ...).
All default timeouts are set in the configuration files on the kadeploy3 server. You can consult the default timeouts of each macro-step by using the command kastat3:
Kadeploy server configuration:
  Custom PXE boot method: PXElinux
Automata configuration:
  hercule:
    SetDeploymentMiniOS: SetDeploymentMiniOSUntrusted,1,600
    BroadcastEnv: BroadcastEnvKascade,0,1000
    BootNewEnv: BootNewEnvKexec,0,180; BootNewEnvHardReboot,0,900
  nova:
    SetDeploymentMiniOS: SetDeploymentMiniOSUntrusted,1,600
    BroadcastEnv: BroadcastEnvKascade,0,1000
    BootNewEnv: BootNewEnvKexec,0,150; BootNewEnvHardReboot,0,600
  ...
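The timeout of a given implementation can be picked out of such an output with a few lines of shell. This is a minimal sketch, assuming each automata entry has the form "MacroStep: Implementation,retries,timeout; ..." as in the output above, and that the third field is the timeout in seconds. The function name step_timeout is hypothetical and is not part of Kadeploy.

```shell
#!/bin/sh
# Print the third comma-separated field (assumed to be the timeout) of one
# implementation found in a "MacroStep: Impl,retries,timeout; ..." entry.
step_timeout() {
    line=$1
    impl=$2
    printf '%s\n' "$line" |
        sed -e 's/^[^:]*: *//' |
        tr ';' '\n' |
        awk -F, -v impl="$impl" '{ gsub(/^ +| +$/, "", $1) } $1 == impl { print $3 }'
}

# Example with the BootNewEnv entry of the hercule cluster shown above.
step_timeout "BootNewEnv: BootNewEnvKexec,0,180; BootNewEnvHardReboot,0,900" BootNewEnvHardReboot
```

Knowing the default value makes it easier to choose a larger one when a step needs more time.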
kadeploy3 allows users to change timeouts on the command line. In some cases, when you try to deploy an environment with a large tarball or with a post-install that takes too long, you may get discarded nodes. This false-positive behavior can be avoided by manually modifying the timeouts for each step at deployment time.
For instance, in our previous example, the timeouts of each step are:
- SetDeploymentMiniOSUntrusted: 143
- BroadcastEnvKascade: 111
- BootNewEnvKexec: 33
You can increase the timeout of the second step to 1200 seconds with the following command:
frontend: kadeploy3 my_big_env --force-steps "SetDeploymentMiniOS|SetDeploymentMiniOSUntrusted:1:450&BroadcastEnv|BroadcastEnvKascade:1:1200&BootNewEnv|BootNewEnvClassical:1:400"
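Since the --force-steps value is long and easy to mistype, it can help to assemble it from one entry per macro-step. The sketch below only builds and prints the string; the helper name force_steps is hypothetical, and the reading of each entry as "MacroStep|Implementation:retries:timeout" is inferred from the command above rather than taken from the Kadeploy reference.

```shell
#!/bin/sh
# Join "MacroStep|Implementation:retries:timeout" entries with '&'.
force_steps() {
    result=""
    for spec in "$@"; do
        if [ -z "$result" ]; then
            result=$spec
        else
            result="$result&$spec"
        fi
    done
    printf '%s\n' "$result"
}

steps=$(force_steps \
    "SetDeploymentMiniOS|SetDeploymentMiniOSUntrusted:1:450" \
    "BroadcastEnv|BroadcastEnvKascade:1:1200" \
    "BootNewEnv|BootNewEnvClassical:1:400")

# Print the command instead of running it, so the sketch works without kadeploy3.
echo "kadeploy3 my_big_env --force-steps \"$steps\""
```

The quotes around the value matter: both | and & are shell operators and would otherwise be interpreted by the shell.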
Set Break-Point during deployment
As mentioned in the section above, a deployment is a succession of micro steps that can be consulted and modified.
Moreover, kadeploy3 allows users to set a breakpoint during deployment.
- Examples
- Breakpoint before the postinstall is run:
frontend: kadeploy3 debian11-base --verbose-level 4 -d --breakpoint BroadcastEnvKascade:manage_user_post_install
This command can be used for debugging purposes. It performs a deployment with the maximum verbose level and stops the deployment workflow just before executing the manage_user_post_install micro-step of the BroadcastEnvKascade macro-step. You will thus be able to connect to the deployment environment and manually run the user post-install script to debug it.
- Breakpoint before rebooting to the deployed environment:
This stops the workflow before rebooting to the deployed environment. Mind that rebooting the machine will boot the kadeploy MiniOS again, not the deployed system.
Modify the deployment workflow with custom operations
In Kadeploy3, we can easily customize the deployment automaton. It is possible to add custom pre, post or substitute operations to each step. A custom operation can send a file, execute a command or run a script.
This feature is explained in Kadeploy3's documentation (available on Kadeploy3's website) in section 4.2.2, Use Case 10 and 4.7.
This is illustrated in the following sub-sections.
Note: When running a custom script, Kadeploy exports several variables; you can get a list of them by running "kadeploy -I".
Format additional disks
In this example, we will add some custom operations to the deployment workflow: our nodes have two additional hard disks and we want them to be formatted during the deployment process.
We want a new partition scheme such as:
- classical grid5000 partitioning on disk0
- data1 ext4 on disk1p1
- data2 ext4 on disk2p1
Note: We do not use the legacy disk naming scheme here (sda, sdb, sdc) as new kernels do not guarantee the stability of disk naming across reboots.
The three following sections describe how to perform such an operation.
- 1. Make the reservation in destructive mode
First of all, when you make your reservation, you must tell OAR that it should redeploy the node entirely after the reservation, with the -t destructive parameter:
- 2. Describe the custom operations
After that, you have to create a file that describes the custom operations you want performed during the deployment. In our example we will first re-partition the additional disks (using parted) and then format them (using the script format.sh).
- The operation description file (let's say custom-partitioning.yml) should look something like this:
---
# Our custom steps should be performed during the SetDeploymentMiniOS macro-step
SetDeploymentMiniOS:
  # Custom partitioning operations, run after the create_partition_table micro-step
  create_partition_table:
    post-ops:
      # We send partition files on the node
      - action: send
        file: disk1.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map_disk1
      - action: send
        file: disk2.parted
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map_disk2
      # Then we execute the parted command using the previously sent files
      - action: exec
        name: partitioning_disk1
        command: parted -a optimal -s $(jq -r '.storage_devices[]|select(.id=="disk1")|.by_path' /tmp/grid5000-ref-api.json) $(cat $KADEPLOY_TMP_DIR/disk1.parted)
      - action: exec
        name: partitioning_disk2
        command: parted -a optimal -s $(jq -r '.storage_devices[]|select(.id=="disk2")|.by_path' /tmp/grid5000-ref-api.json) $(cat $KADEPLOY_TMP_DIR/disk2.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run
        name: format_partitions
        file: format.sh
- The file disk1.parted will look something like this:
mklabel msdos
u GB mkpart primary ext4 0% 100%
align-check optimal 1
- The file disk2.parted will look something like this:
mklabel msdos
u GB mkpart primary ext4 0% 100%
align-check optimal 1
- The file format.sh will look something like this:
set -e
# formatting disk1
mkfs -F -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q $(jq -r '.storage_devices[]|select(.id=="disk1")|.by_path' /tmp/grid5000-ref-api.json)-part1
# formatting disk2
mkfs -F -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q $(jq -r '.storage_devices[]|select(.id=="disk2")|.by_path' /tmp/grid5000-ref-api.json)-part1
- 3. Run the deployment
Now you can deploy your environment with this custom operation:
Warning: In some cases (for instance if you want to use ext2 instead of ext4 in the example above, as formatting takes much longer with ext2 than with ext4) you should increase the step timeout. See Advanced_Kadeploy#Adjusting timeout for some environments for details.
Note: Neither partition is mounted at boot. To mount them, you should do:
Use a custom partitioning scheme
Example 1: Deploy on the whole disk
In this example, we will modify the deployment workflow to deploy the system on a single disk partition ('/' on sda1).
- 1. Make the reservation in destructive mode
As you will change the partitioning of the disk, you must tell OAR that it should redeploy the node entirely after the reservation, with the -t destructive parameter:
- 2. Describe the custom operations
After that, you have to create a file that describes the custom operations you want performed during the deployment. In this example we will create our custom partitioning scheme and bypass some steps that are not necessary to deploy the system on a single partition.
- The operation description file (let's say custom-partitioning.yml) should look something like this:
---
# Our custom steps should be performed during the SetDeploymentMiniOS macro-step
SetDeploymentMiniOS:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
  # Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true
- The file map.parted, which will be passed to parted, will look like this:
mklabel gpt
mkpart KDPL_SYSTEM_disk0 ext4 0% 100%
toggle 1 boot
align-check optimal 1
- 3. Customize the environment's postinstall
In order for our new partition to be mounted at boot time, we will modify the Grid'5000 postinstall files.
- Create and go to your public directory:
- Then decompress the postinstall archive:
- Add your custom /etc/fstab file in this directory, named fstab:
PARTLABEL=KDPL_SYSTEM_disk0 / ext4 defaults 1 2
When you pass the "--fstab custom" option to the postinstall, it will copy this file to /etc/fstab.
- Regenerate the postinstall archive:
- Clean up:
- Create the environment's description file (let's say custom-env.dsc) based on the reference one:
- use kaenv3 -p debian10-base to have an example of environment description.
Your custom-env.dsc should look like this:
---
name: custom-env
version: 1
arch: x86_64
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2020012812.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: http://public/~<login>/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
- 4. Run the deployment
Finally, we deploy our custom environment with our custom operations:
Note: In some cases you should increase the step timeout (for a long formatting step, for example); see Advanced_Kadeploy#Adjusting timeout for some environments for details.
Example 2: Deploy on multiple partitions
In this example, we will modify the deployment workflow: a different partition will be used for each of the /, /var, /opt and /tmp directories, plus a swap partition. Imagine that you want your own partitioning scheme to look like this:
Mount point | Partition | Disk space | File System |
---|---|---|---|
swap | SWAP | 2G | linux-swap |
/ | SYSTEM | 18G | ext4 |
/var | VAR | 30G | ext4 |
/opt | OPT | 20G | ext4 |
/tmp | TMP | everything else | ext4 |
The four following sections describe how to perform such an operation.
- 1. Make the reservation in destructive mode
First of all, when you make your reservation, you must tell OAR that it should redeploy the node entirely after the reservation, with the -t destructive parameter:
- 2. Describe the custom operations
After that, you have to create a file that describes the custom operations you want performed during the deployment. In our example we will first apply our custom partitioning scheme, then format the partitions and mount them.
- The operation description file (let's say custom-partitioning.yml) should look something like this:
---
# Our custom steps should be performed during the SetDeploymentMiniOS macro-step
SetDeploymentMiniOSUntrusted:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run
        name: format_partitions
        file: format.sh
  # Custom mount step, done after the mount_deploy_part micro-step
  mount_deploy_part:
    post-ops:
      # We run the script contained in the file 'mount.sh'
      - action: run
        name: mount_partitions
        file: mount.sh
Note: In order for Kadeploy to perform the installation correctly, every partition has to be mounted before the installation process, which is done in the BroadcastEnv macro-step.
- The file map.parted will look something like this:
mklabel gpt
u GB mkpart KDPL_SWAP_disk0 linux-swap 0% 2
u GB mkpart KDPL_SYSTEM_disk0 ext4 2 20
u GB mkpart KDPL_VAR_disk0 ext4 20 50
u GB mkpart KDPL_OPT_disk0 ext4 50 70
u GB mkpart KDPL_TMP_disk0 ext4 70 100%
toggle 2 boot
align-check optimal 1
align-check optimal 2
align-check optimal 3
align-check optimal 4
align-check optimal 5
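The mkpart boundaries above are the running sums of the sizes in the table (2, 2+18, 20+30, 50+20). As a hedged illustration, the sketch below derives the mkpart lines from a list of sizes so the arithmetic does not have to be done by hand. The function name make_partition_map and its NAME:FSTYPE:SIZE_GB argument format are hypothetical; the toggle and align-check lines are not generated and would still have to be appended.

```shell
#!/bin/sh
# Emit parted commands for consecutive GPT partitions on disk0.
# Each argument is NAME:FSTYPE:SIZE_GB; a SIZE_GB of "rest" takes the remaining space.
make_partition_map() {
    echo "mklabel gpt"
    start=0
    for spec in "$@"; do
        name=${spec%%:*}
        rest=${spec#*:}
        fstype=${rest%%:*}
        size=${rest#*:}
        # The first partition starts at 0%, the others where the previous one ended.
        if [ "$start" -eq 0 ]; then from="0%"; else from=$start; fi
        if [ "$size" = "rest" ]; then
            to="100%"
        else
            to=$((start + size))
            start=$to
        fi
        echo "u GB mkpart KDPL_${name}_disk0 $fstype $from $to"
    done
}

# Sizes taken from the partitioning table of this example.
make_partition_map SWAP:linux-swap:2 SYSTEM:ext4:18 VAR:ext4:30 OPT:ext4:20 TMP:ext4:rest
```

With the sizes of this example, the generated mkpart lines are the same as the ones in map.parted.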
- The file format.sh will look something like this:
#!/bin/sh
set -e
mkfs_opts="sparse_super,filetype,resize_inode,dir_index"
ext4_blocksize="4096"
# / will be formatted by Kadeploy since we will specify the -p SYSTEM option
# formatting /var
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q /dev/disk/by-partlabel/KDPL_VAR_disk0
# formatting /opt
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q /dev/disk/by-partlabel/KDPL_OPT_disk0
# formatting /tmp
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q /dev/disk/by-partlabel/KDPL_TMP_disk0
Note: When running a custom script, Kadeploy exports several variables; you can get a list of them by running "kadeploy -I".
- The file mount.sh will look something like this:
#!/bin/sh
set -e
# / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy
# mount /var
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var
mount /dev/disk/by-partlabel/KDPL_VAR_disk0 ${KADEPLOY_ENV_EXTRACTION_DIR}/var/
# mount /opt
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/opt
mount /dev/disk/by-partlabel/KDPL_OPT_disk0 ${KADEPLOY_ENV_EXTRACTION_DIR}/opt/
# mount /tmp
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp
mount /dev/disk/by-partlabel/KDPL_TMP_disk0 ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp/
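The three mkdir/mount pairs follow the same pattern, so the script can also be written as a loop over the partition labels. The variant below is a sketch under assumptions: the function name mount_extra_partitions and the RUN variable are hypothetical, and RUN is set to "echo" here so that the commands are only printed, which allows the script to be tried outside a deployment. It uses mkdir -p so that an already existing directory is not an error.

```shell
#!/bin/sh
set -e

# Mount each extra partition under the given root directory.
# RUN is a hypothetical switch: leave it empty to really execute the commands,
# or set it to "echo" to only print them.
mount_extra_partitions() {
    root=$1
    shift
    for label in "$@"; do
        # VAR -> var, OPT -> opt, TMP -> tmp
        dir=$(printf '%s' "$label" | tr '[:upper:]' '[:lower:]')
        $RUN mkdir -p "$root/$dir"
        $RUN mount "/dev/disk/by-partlabel/KDPL_${label}_disk0" "$root/$dir/"
    done
}

# Dry run; /mnt/dest is a placeholder standing in for ${KADEPLOY_ENV_EXTRACTION_DIR}.
RUN=echo
mount_extra_partitions /mnt/dest VAR OPT TMP
```

In a real mount.sh, RUN would be left empty and the first argument would be ${KADEPLOY_ENV_EXTRACTION_DIR}.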
- 3. Customize the environment's postinstall
In order for our new partitions to be mounted at boot time, we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation).
- Create and go to a temporary directory:
- Then decompress the postinstall archive:
- Add your custom /etc/fstab file in this temporary directory, named fstab:
PARTLABEL=KDPL_SWAP_disk0 none swap sw 0 0
PARTLABEL=KDPL_SYSTEM_disk0 / ext4 defaults 0 0
PARTLABEL=KDPL_VAR_disk0 /var ext4 defaults 1 2
PARTLABEL=KDPL_OPT_disk0 /opt ext4 defaults 1 2
PARTLABEL=KDPL_TMP_disk0 /tmp ext4 defaults 1 2
/ will be added by Kadeploy since we will specify the -p SYSTEM option
- Regenerate the postinstall archive:
- Clean up:
- Create the environment's description file (let's say custom-env.yml) based on the reference one:
frontend: kaenv3 -p debian11-base -u deploy | sed -e "s/archive:.*$/archive: \/home\/${USER}\/g5k-postinstall-custom.tgz /" -e 's/public/shared/' > custom-env.yml
and customize the custom-env.yml file to suit your needs (especially your archive path):
---
name: custom-env
version: 1
description: Custom env based on Debian 11
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian11-base-2021092316.tar.zst
  kind: tar
  compression: zstd
postinstalls:
- archive: /home/me/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false
- 4. Run the deployment
Finally, we deploy our custom environment with our custom operations:
Note: In some cases you should increase the step timeout (for a long formatting step, for example); see Advanced_Kadeploy#Adjusting timeout for some environments for details.
Boot a custom kernel with kexec
In this example, we customize some operations of the reboot workflow: instead of rebooting, our nodes will use kexec to start a custom kernel that we upload beforehand. This allows us to boot a custom kernel without deploying a whole system (we just use the kareboot3 command, not the kadeploy3 one).
- 1. Make the reservation in destructive mode
First of all, when we make our reservation, we have to tell OAR that it must redeploy the node entirely after our reservation. For this, we use the -t destructive parameter:
- 2. Describe the custom operations
Next, we must create a file that describes the custom operations to perform during the reboot step.
In our example, we first send the kernel and initrd files to the nodes, then run kexec using them. An operation description file (here custom-kexec-reboot.yml) to perform those actions looks like this:
---
Simple:
  reboot:
    pre-ops:
      - action: send
        file: /home/me/custom_kernel.vmlinuz
        destination: $KADEPLOY_TMP_DIR
        name: send_custom_kernel
      - action: send
        file: /home/me/custom_kernel.initrd.img
        destination: $KADEPLOY_TMP_DIR
        name: send_custom_initrd
    substitute:
      - action: exec
        name: kexec
        # nohup makes sure the script exits correctly before kexec starts
        command: kexec -l $KADEPLOY_TMP_DIR/custom_kernel.vmlinuz --initrd=$KADEPLOY_TMP_DIR/custom_kernel.initrd.img && nohup /bin/sh -c 'sleep 1; systemctl kexec' 1>/dev/null 2>/dev/null </dev/null &
- 3. Run the reboot
Finally, we call kareboot3 with our custom operations:
Skip the first reboot/kexec
If you are already on the deployment kernel (by using a breakpoint or rebooting with kareboot3 -r deploy_env), you may want to skip the first reboot/kexec. This can be done by masking the reboot/kexec via custom operations. You need to be in a destructive job to be able to use custom operations (oarsub -t deploy -t destructive).
Write a file with your custom operation, e.g. custom-kexec-reboot.yml:
---
SetDeploymentMiniOS:
  reboot:
    substitute:
      - action: exec
        name: nothing
        command: 'true'
  kexec:
    substitute:
      - action: exec
        name: nothing
        command: 'true'
Start the deployment with our custom operations:
FAQ
My environment does not work on all clusters
On some rare occasions, an environment may not work on a given cluster:
- The kernel used does not support all hardware. You are advised to base your environment on one of the reference environments to avoid dealing with this or to carefully read the hardware section of each site to see the list of kernel drivers that need to be compiled in your environment for it to be able to boot on all clusters. Of course, when a new cluster is integrated, you might need to update your kernel for portability.
- The post-installation scripts do not recognize your environment, and therefore network access, console access, or site-specific configurations are not taken into account. You can check the contents of the default post-installation scripts to see the variables set by kadeploy by looking at the environment's description using kaenv.
Kadeploy fails with Image file not found!
This means that kadeploy is not able to read your environment's main archive. This can have several causes, e.g.:
- the registered filename is wrong
- the extension is not right (for example .tar.gz does not work, whereas .tgz is OK)
Kadeploy is complaining about a node already involved in another deployment
The warning you see is:
node node is already involved in another deployment
This error occurs:
- When 2 concurrent deployments are attempted on the same node. If you have 2 simultaneous deployments, make sure you have 2 distinct sets of nodes.
- When there is a problem in the kadeploy database, typically after a deployment ended abnormally. The best option is to wait for about 15 minutes and retry the deployment: kadeploy can correct its database automatically.
How do I exit from kaconsole on cluster X from site Y
You can try the & then . sequence (i.e. typing &.), but this may not work on all clusters. The Kaconsole page may give you more information.
How to deploy the std environment in a deploy job
There may be use cases where the user wants to deploy the std environment (e.g. debian11-std), which is the environment running on the nodes of jobs that are not of type deploy.
However, the std environment is not listed in the output of kaenv3 -l or the G5K API equivalent, nor is it known straight away by the kadeploy3 command.
First of all, check whether you really need to deploy the std environment: it is encouraged to deploy the big environment instead (e.g. debian11-big), which provides all the features of the std environment except those only needed by non-deploy jobs (e.g. the OAR services).
Then, if deploying the std environment is really necessary, just use the -u deploy option in the kadeploy commands.
Note that it is also possible to specify an older version of the environment using the --env-version option, for instance to reproduce tests done in a non-deploy job that was running an old version of the std environment.