Revision as of 10:39, 24 September 2020

	Note
	This page is actively maintained by the Grid'5000 team. If you encounter problems, please report them (see the Support page). Additionally, as it is a wiki page, you are free to make minor corrections yourself if needed. If you would like to suggest a more fundamental change, please contact the Grid'5000 team.

	Warning
	Please mind also reading the Environment creation guide, which gives automated mechanisms to build kadeploy environnements

What you need to know before starting

The first thing to understand is that by using kadeploy3, you will be running a command that attempts to remotely reboot many nodes at the same time, and boot them using configuration files hosted on a server. On some clusters, there is a failure rate associated with this operation that is not null. You might therefore experience failures on some operations during this tutorial. In this case, retry. The system doesn't retry for you as this implies waiting for long timeouts in all cases, even those where a 90% success rate is sufficient.

What is an Environment?

Where we describe what exactly is image, kernel, initrd and postinstall

An environment in kadeploy3 is a set of file describing a fully functional Operating System. To be able to setup a Operating System, kadeploy3 needs at least 4 files in the most common cases

An image
- An image is a file containing all the Operating System files. It can be a compressed archive (ie tgz file) or a dump of a device (ie dd file). In this tutorial, you will learn to build new images for Kadeploy3
A kernel file
- For the Unix based environment, the kernel file specifies which kernel to boot. It is the full path of the kernel file.
initrd file (optional)
- For the Linux based environment, the optional initrd file allows to use an initial ramdisk which will be used as the root filesytem at the boot sequence. More information: Initrd on Wikipedia
A postinstall file (optional)
- The postinstall file allows you to correctly configure all specificity on each cluster. It is not mandatory to specify it for Kadeploy3 environment but if you know what you are doing, feel free to define it.

Once you have this set of files, you can describe your environment to kadeploy3. This description represents an environment in the kadeploy3 sense.

How can I make my own environment?

To create our own environment, there are two main ways:

One way is to deploy an existing environment, customize it and save it with tgz-g5k
The other one is to built the environment from a recipe, just like the Grid'5000 supported reference environments. See Environment creation.

Disk partitioning

Environments are deployed on the (first) local disk of each node (each Grid'5000 node is equipped with at least one hard drive). The following partitioning scheme is used:

Legacy / DOS

Device	Role
`sda1`	Linux swap
`sda2`	Standard environment (default environment installed on nodes)
`sda3`	User's environment (when Kadeploy is used)
`sda4`	MS-DOS extended partition (container for the logical partitions, cannot be used)
`sda5`	Remaining disk space made available in `/tmp`

GPT / UEFI

This new partition scheme is applied to recent clusters that boot using UEFI (as of 2020: "gemini" and "pyxis" at Lyon)

Device	Role
`sda1`	Linux swap
`sda2`	Standard environment (default environment installed on nodes)
`sda3`	User's environment (when Kadeploy is used)
`sda4`	EFI system partition (used to store boot images)
`sda5`	Remaining disk space made available in `/tmp`

Notes

Some clusters have more than one local disk. On some clusters, those disks can be reserved apart from the node itself. See Disk_reservation to find out how to use them.
Kadeploy allows deployments on other partitions than sda3 or even creating a custom partitioning scheme. See below for details.

Search and deploy an existing environment

Search an environment

Grid'5000 maintains several reference environments directly available in all sites. These environments are based on Debian, Ubuntu and Centos.

For Debian, different variants of reference environments are offered. For Ubuntu and Centos, only environment with a minimal system are offered.

They are called reference environments because they can be used to generate customized environments.

The description of the reference environments can be found here

Getting_Started#Deploying_nodes_with_Kadeploy

An environment registry is maintained in each site (see kaenv3), with the associated filesystem images stored in the /grid5000 directory of the frontend.

To deploy a registered environment, you must know its name as registered in the Kadeploy database. It is the first information on the environment description page. This tutorial uses the debian10-x64-base environment.

You can also list all available environment in a site by using the kaenv3 command:

frontend:

kaenv3 -l

This command lists all public as well as your private environments.

We distinguish three levels of visibility for an environment:

public: All users can see those environments. Only administrators can tag them this way.

shared: Every users can see the environment provided they use the -u option to specify the user the environment belongs to.

private: The environment is only visible by the user the environment belongs to.

For example, a shared environment added by user user is listed this way:

frontend:

kaenv3 -l -u user

You can also look for a specific version with the --env-version option. All the versions of the environments can be found in /grid5000/images. The version number is the last part of the tgz file.

For example: debian10-x64-min-2019100414.tgz => it's the min debian10-x64 reference environment version 2019100414.

Being able to reproduce the experiments that are done is a desirable feature. Therefore, you should always try to control as much as possible the environment the experiment is done in. Therefore, we will attempt to check that the environment that was chosen in the environment directory is the one available on a given cluster. On the cluster you would like to deploy, type the following command to print information about an environment:

frontend:

kaenv3 -p debian10-x64-base -u deploy

You must specify the user option. In our case, all public environments belong to user deploy.

In theory, you should also check the post-install script. A post-install script adapts an environment to the site it is deployed on.

If everything seems ok, please proceed to the next step.

Make a job on a deployable node

By default, Grid'5000 nodes are running on the production environment, which already contains most of the important features and can be used to run experiments. But you will not have administrative privileges (root privileges) on these nodes. So you will not be able to customize these environments at will. In fact, only reference environments can be customized at will. But to have the right to deploy a reference environment on a node, you must supply the option -t deploy when submitting your job.

For this part of the tutorial, job made will be interactive (-I), of the deploy type (-t deploy), on only one machine (-l nodes=1) to do environment customization (we will give ourselves 3 hours with -l walltime=3), which gives us the following command, that will open a new shell session on the frontend node:

frontend:

oarsub -I -t deploy -l nodes=1,walltime=3

Since all Grid'5000 nodes do not necessary have console access, it is recommended in the context of this tutorial to add the option rconsole="YES" to your reservation command.

frontend:

oarsub -I -t deploy -l {"rconsole='YES'"}/nodes=1,walltime=3

Indeed, when you submit a job of the deploy type, a new shell is opened on the frontend node and not on the first machine of the job as for standard jobs. When you exit from this shell, the job ends. The shell is populated with OAR_* environment variables. You should look at the list of available variables to get an idea of the information you can use to script deployment later. As usual, if the job is successfull, you will get the name of the machine allocated to your job with:

frontend:

cat $OAR_FILE_NODES

	Warning
	At the end of a reservation with the options `-t deploy`, the reserved nodes will be restarted to boot on the standard environment and thus be available to any other user. So you should only use this option `-t deploy` when you actually intend to deploy a reference environment on the reserved nodes.

Deploy a reference environment

To deploy your environment, you must discover the nodes you were allocated by OAR. The simplest way of doing this is to look at the content of the file whose name is stored in $OAR_FILE_NODES (this variable is labelled $OAR_NODE_FILE too) or the messages displayed when the job was made. This variable $OAR_NODE_FILE simply stores the url of the file containing the FQDN of all your reserved nodes. Deployment happens when you run the following command:

frontend:

kadeploy3 -e debian10-x64-base -m node.site.grid5000.fr

You can automate this to deploy on all nodes of your job with the -f option:

frontend:

kadeploy3 -e debian10-x64-base -f $OAR_FILE_NODES

In order to be able to connect to the node (as root), you must use the -k option and proceed by two ways:

You can either specify the public key that will be copied in /root/.ssh/authorized_keys on the deployed nodes:

frontend:

kadeploy3 -e debian10-x64-base -f $OAR_FILE_NODES -k ~/.ssh/my_special_key.pub

Or you can supply the -k option without argument. This will automatically copy your ~/.ssh/authorized_keys and replace the /root/.ssh/authorized_keys file on the deployed nodes.

frontend:

kadeploy3 -e debian10-x64-base -f $OAR_FILE_NODES -k

The second case is actually the simplest way. One of its advantages is that after deployments, you will be able to connect directly from your local computer to the deployed nodes, the same way you connect to the frontend of the site were those nodes are.
Once kadeploy has run successfully, the allocated node is deployed under debian10-x64-base environment. It will then be possible to tune this environment according to your needs.

	Note
	It is not necessary here, but you can specify destination partition with the -p option

Connect to the deployed environment and customize it

1. Connection

On reference environments managed by the staff, you can use root account for login through ssh (kadeploy checks that sshd is running before declaring a deployment successful). To connect to the node type:

frontend:

ssh root@node.site.grid5000.fr

In case this doesn't work, please take a look at the kadeploy section of the Sidebar > FAQ

2. Adding software to an environment

Where you learn to install software using the package repository of your distribution on Grid'5000

You can therefore update your environment (to add missing libraries that you need, or remove packages that you don't so that sizes down the image and speeds up the deployment process, etc.) using:

node:

apt-get update
apt-get upgrade
apt-get install list of desired packages and libraries
apt-get --purge remove list of unwanted packages
apt-get clean

Create a new environment from a customized environment

We now need to save this customized environment, where you have a user account, to be able to use this account again each time you deploy it.
The first step to create an environment is to create an archive of the node you just customized. Because of the various implementations of the /dev filesystem tree, this can be a more or less complex operation.

1. Use the provided tools

You can use tgz-g5k to extract a Grid'5000 environment tarball from a running node.

frontend:

tgz-g5k -m node -f ~/path_to_myimage.tgz

This will create a file path_to_image.tgz into your home directory on frontend.

Note

Please consider the following:

If you want to extract a tarball from the Grid'5000 standard environment (i.e., a non-deployed job), you will need to add the option -o to use oarsh/oarcp instead of ssh/scp
If you want tgz-g5k to access the node with a custom user id, you can use the option -u myCustomeId (default is root)
You can find more information on tgz-g5k (e.g., available options, command line examples) by executing tgz-g5k -h. Some implementation details are also available on the man page (man tgz-g5k).

2. Describe the newly created environment for deployments

Kadeploy3 works using an environment description. The easiest way to create a description for your new environment is to change the description of the environment it is based on. We have based this tutorial on the debian10-x64-base environment of user deploy. We therefore print its description to a file that will be used as a good basis:

frontend:

kaenv3 -p debian10-x64-base -u deploy > mydebian10-x64-base.env

It should be edited to change the name, description, author lines, as well as the tarball line. Since the tarball is local, the path should be a simple absolute path (without a leading server://). The visibility line should be removed, or changed to shared or private. Once this is done, the newly created environment can be deployed using:

frontend:

kadeploy3 -f $OAR_NODEFILE -a mydebian10-x64-base.env

This kind of deployment is called anonymous deployment because the description is not recorded into the Kadeploy3 database. It is particularly useful when you perform the tuning of your environment if you have to update the environment tarball several times.

Once your customized environment is successfully tuned, you can save it to Kadeploy3 database so that you can directly deploy it with kadeploy3, by specifying its name:

frontend:

kaenv3 -a mydebian10-x64-base.env

and then (if your environment is named "mydebian10-base"):

frontend:

kadeploy3 -f $OAR_NODEFILE -e mydebian10-base

With kaenv3 command, you can manage your environments at your ease. Please refer to its documentation for an overview of its features.

Deployment options

Multisite deployment

In order to achieve a deployment on nodes from differents sites, you can use the multiserver option of kadeploy, using the -M

frontend:

kadeploy3 -M -f file_with_all_nodes -e debian10-x64-std

Deploy on other partition of disk(s)

Kadeploy default handling of partitions may be too limited for some usage. One may need to use disks differently (e.g. to deploy our environment in an optimal way). Kadeploy offers several options to simply deploy on another existing partition (sda2 or sda5), or if required to repartition disks entirely and/or use several disks (on nodes with many disks).

Deploy on sda2 or sda5

First, as this kind of deployment will break node standard operation, you must tell to OAR that it should be redeployed entirely after the reservation with the -t destructive option:

frontend:

oarsub -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I

Then you can deploy on sda2 or sda5 with the -p 2 or -p 5 option:

frontend:

kadeploy3 -e debian10-x64-nfs -f $OAR_NODEFILE -p 2 -k

Deploy on additional disks

	Warning
	Currently broken

First, as this kind of deployment will break node standard operation, you must tell to OAR that it should be redeployed entirely after the reservation with the -t destructive option:

frontend:

oarsub -t deploy -t destructive -l nodes=1,walltime=1 -p "cluster='hercule'" -I

Then you can deploy on an additional disk such as sdb with the -b sdb option:

frontend:

kadeploy3 -e debian10-x64-base -f $OAR_NODEFILE -b sdb -k

Disks can also be handled differently by modifying the deployment automata, see below.

Tuning the Kadeploy3 deployment workflow

kadeploy3 allows to fully modify the deployment workflow.

First of all you have to understand the different steps of a deployment. There are 3 macro-steps:

SetDeploymentEnv: this step aims at setting up the deployment environment that contains all the required tools to perform a deployment ;
BroadcastEnv: this step aims at broadcasting the new environment to the nodes and writing it to disk;
BootNewEnv: this step aims at rebooting the nodes on their new environment.

kadeploy3 provides several implementations for each of those 3 macro-steps. You can consult that list in the kadeploy3 page. In Grid'5000, we use the following steps by default in all our clusters:

SetDeploymentEnv -> SetDeploymentEnvUntrusted: use an embedded deployment environment
BroadcastEnv -> BroadcastEnvKascade: use the Kascade tool to broadcast the environment
BootNewEnv -> BootNewEnvKexec: the nodes use kexec to reboot (if it fails, a BootNewEnvClassical, classical reboot, will be performed)

Each one of these implementations is divided in micro steps. You can can see the name of those micro-steps if you use the kadeploy3 option --verbose-level 4. And to see what is actually executed during those micro-steps you can add the debug option of kadeploy3 -d

frontend:

kadeploy3 -f $OAR_FILE_NODES -k -e debian10-x64-base --verbose-level 4 -d > ~/kadeploy3_steps

This command will store the kadeploy3 standard output in the file ~/kadeploy3_steps. Lets analyse its content:

frontend:

grep "Time in" ~/kadeploy3_steps

This command will print on the terminal all the micro-steps executed during the deployment process, and the time spent for each execution. Here are the micro-steps that you should see:

SetDeploymentEnvUntrusted-switch_pxe: Configures the PXE server so that this node will boot on an environment that contains all the required tools to perform the deployment,
SetDeploymentEnvUntrusted-reboot: Sends a reboot signal to the node
SetDeploymentEnvUntrusted-wait_reboot: Waits for the node to restart.
SetDeploymentEnvUntrusted-send_key_in_deploy_env: Sends kadeploy's user's ssh public key into the node's authorized_keys to ease the following ssh connections,
SetDeploymentEnvUntrusted-create_partition_table: Creates the partition table
SetDeploymentEnvUntrusted-format_deploy_part: Format the partition where your environment will be installed. This partition is by default /dev/sda3
SetDeploymentEnvUntrusted-mount_deploy_part: Mounts the deployment partition in a local directory.
SetDeploymentEnvUntrusted-format_tmp_part: Format the partition defined as tmp (by default, /dev/sda5)
SetDeploymentEnvUntrusted-format_swap_part: Format the swap partition
BroadcastEnvKascade-send_environment: Sends your environments into the node and untar it into the deployment partition.
BroadcastEnvKascade-manage_admin_post_install: Execute post installation instructions defined by the site admins, in general to adapt to the specificities of the cluster: console baud rate, Infiniband,...
BroadcastEnvKascade-manage_user_post_install: Execute user defined post installation instructions to automatically configure its node depending on its cluster, site, network capabilities, disk capabilities,...
BroadcastEnvKascade-send_key: Sends the user public ssh key(s) to the node (if the user specified it with the option -k).
BroadcastEnvKascade-install_bootloader: Properly configures the bootloader
BootNewEnvKexec-switch_pxe: Configure the PXE server so that this node will boot on the partition where your environment has been installed
BootNewEnvKexec-umount_deploy_part: Umount the deployment partition from the directory where it has been mounted during the step 7.
BootNewEnvKexec-mount_deploy_part: ReMount the deployment partition
BootNewEnvKexec-kexec: Perform a kexec reboot on the node
BootNewEnvKexec-set_vlan: Properly configure the node's VLAN
BootNewEnvKexec-wait_reboot: Wait for the node to be up.

That is it. You now know all the default micro-steps used to deploy your environments.

Adjusting timeout for some environments

Since kadeploy3 provides multiple macro-steps and micro-steps, its is important to detect when a step in failing its execution. This error detection is done by using timeout on each step. When a timeout is reached, the nodes that have not completed the given step are discarded from the deployment process.
The value of those timeouts varies from one cluster to another since they depend on the hardware configuration (network speed, hard disk speed, reboot speed, ...). All defaults timeouts are entered in the configurations files on the kadeploy3 server. But you can consult the default timeouts of each macro-steps by using the command kastat3

frontend:

kastat3 -I

 Kadeploy server configuration:
 Custom PXE boot method: PXElinux
 Automata configuration:
   hercule:
     SetDeploymentEnv: SetDeploymentEnvUntrusted,1,600
     BroadcastEnv: BroadcastEnvKascade,0,1000
     BootNewEnv: BootNewEnvKexec,0,180; BootNewEnvHardReboot,0,900
   nova:
     SetDeploymentEnv: SetDeploymentEnvUntrusted,1,600
     BroadcastEnv: BroadcastEnvKascade,0,1000
     BootNewEnv: BootNewEnvKexec,0,150; BootNewEnvHardReboot,0,600
 ...

kadeploy3 allow users to change timeouts in the command line. In some cases, when you try to deploy an environment with a large tarball or with a post-install that lasts too long, you may get discarded nodes. This false positive behavior can be avoided by manually modifying the timeouts for each step at the deployment time.

For instance, in our previous example, the timeout of each steps are:

SetDeploymentEnvUntrusted: 143
BroadcastEnvKascade: 111
BootNewEnvKexec: 33

You can increase the timeout of the second step to 1200 seconds with the following command:

frontend:

kadeploy3 -e my_big_env -f $OAR_FILE_NODES -k --force-steps "SetDeploymentEnv|SetDeploymentEnvUntrusted:1:450&BroadcastEnv|BroadcastEnvKascade:1:1200&BootNewEnv|BootNewEnvClassical:1:400"

Set Break-Point during deployment

As mentioned in the section above, a deployment is a succession of micro steps that can be consulted and modified.
Moreover, kadeploy3 allows user to set a break-point during deployment.

frontend:

kadeploy3 -f $OAR_FILE_NODES -k -e debian10-x64-base --verbose-level 4 -d --breakpoint BroadcastEnvKascade:manage_user_post_install

This command can be used for debugging purpose. It performs a deployment with the maximum verbose level and it asks to stop the deployment workflow just before executing the manage_user_post_install micro-step of the BroadcastEnvKascade macro-step. Thus you will be able to connect in the deployment environment and to manually run the user post install script to debug it.

	Warning
	At the current state of `kadeploy3`, it is not possible to resume the deployment from the break-point step. Thus you will have to redeploy you environment from the first step. This feature will be implemented in future version of `kadeploy3`.

Modify the deployment workflow with custom operations

In Kadeploy3, we can easily customize the deployment's automata. It's possible to add custom pre, post or substitute operations to each steps. In a custom operation it's possible to: send a file, execute a command or run a script.

This feature in explained in Kadeploy3's documentation (available on Kadeploy3's website) in the section 4.2.2, Use Case 10 and 4.7.

This is illustrated in the following sub-sections.

	Note
	When running a custom script, Kadeploy will export different variables, you can get a list of them by running `kadeploy3 -I`. A description of each of this variables is available in Kadeploy3's documentation (on Kadeploy3 website) in the section 4.4

Format additional disks

In this example, we will add some custom operations to the deployment workflow: our nodes have two additional hard disks and we want them to be formated during the deployment process.

We want to a new partition scheme such as:

classical grid5000 partitioning on sda
data1 ext4 on sdb1
data2 ext2 on sdc1

The three following sections describe how to perform such an operation.

1. Make the reservation in destructive mode

First of all, when you do your reservation, you must tell to OAR that it should redeploy the node entirely after the reservation with the -t destructive parameter:

frontend:

oarsub -t deploy -t destructive -l nodes=1,walltime=2 -p "cluster='hercule'" -I

2. Describe the custom operations

After that you have to create a file that describe the custom operations you want to be performed during the deployment. In our example we will first repartition the additional disks (using parted) and then format them (using the script format.sh).

The operation description file (let's say custom-partitioning.yml) should look like something like this:

---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step, done after the create_partition_table micro-step
  # In the sample this step is exploded in 4 steps but it can be done in 1 using a single parted command
  create_partition_table:
      post-ops:
        # We send a file on the node
        - action: send
          file: sdb.parted
          # The variable $KADEPLOY_TMP_DIR will be substitued by kadeploy
          destination: $KADEPLOY_TMP_DIR 
          name: send_partition_map_sdb
        # Then we execute the parted command using the previously sent file
        - action: exec
          name: partitioning_sdb
          # The variable $KADEPLOY_TMP_DIR will be substitued by kadeploy
          command: parted -a optimal /dev/sdb --script $(cat $KADEPLOY_TMP_DIR/sdb.parted)
        # Same operation for the second disk
        - action: send
          file: sdc.parted
          destination: $KADEPLOY_TMP_DIR 
          name: send_partition_map_sdc
        - action: exec
          name: partitioning_sdc
          command: parted -a optimal /dev/sdc --script $(cat $KADEPLOY_TMP_DIR/sdc.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
      post-ops:
        # We run the script contained in the file 'format.sh'
        - action: run 
          name: format_disks
          file: format.sh

The file sdb.parted will look like something like this:

mklabel msdos
u GB mkpart primary ext4 0% 100%
align-check optimal 1

The file sdc.parted will look like something like this:

mklabel msdos
u GB mkpart primary ext2 0% 100%
align-check optimal 1

The file format.sh will look like something like this:

#!/bin/sh
set -e
# formating /dev/sdb
mkfs -t ext4 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q /dev/sdb1
# formating /dev/sdc
mkfs -t ext2 -b 4096 -O sparse_super,filetype,resize_inode,dir_index -q /dev/sdc1

3. Run the deployment

Now you can deploy you environment with this custom operation:

frontend:

kadeploy3 -e debian10-x64-min -f $OAR_NODE_FILE -k --custom-steps ./custom-partitioning.yml

	Warning
	In some cases you should increase the step timeout (for some long formatting for example) see Advanced_Kadeploy#Adjusting timeout for some environments for details.

Note: Both partitions are not mounted on boot. To mount those partitions you should do:

NODE:

mkdir -p /media/data1

NODE:

mkdir /media/data2

NODE:

mount /dev/sdb1 /media/data1

NODE:

mount /dev/sdc1 /media/data2

Use a custom partitioning scheme

Example 1: Deploy on the whole disk

In this example, we will modify the deployment workflow to deploy the system on a unique disk partition ( '/' on sda1 )

1. Make the reservation in destructive mode

As you will change partitioning of the disk, you must tell to OAR that it should redeploy the node entirely after the reservation with the -t destructive parameter:

frontend:

oarsub -t deploy -t destructive -l nodes=1,walltime=2 -I

2. Describe the custom operations

After that you have to create a file that describe the custom operations you want to be performed during the deployment. In this example we will create our custom partitioning scheme and bypass some steps that are not necessary to deploy the system on a unique partition.

The operation description file (let's say custom-partitioning.yml) should look like something like this:

---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
# Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true

The file map.parted, which will be passed to parted, will look like this:

mklabel gpt
mkpart partition-system ext4 0% 100%
toggle 1 boot
align-check optimal 1

3. Customize the environment's postinstall

In order for our new partitions to be mounted at boot time we will modify the Grid'5000 postinstall files.

Create and go in your public directory:

frontend:

mkdir public/custom-postinstall && cd public/custom-postinstall

Then decompress the postinstall archive:

frontend:

tar xzf /grid5000/postinstalls/g5k-postinstall.tgz

Add your custom /etc/fstab file in this directory, named fstab:

/dev/sda1       /          ext4    defaults 1      2

When you will pass "--fstab custom" option to the postinstall, it will copy this file in /etc/fstab

Regenerate the postinstall archive:

frontend:

tar -czvf ~/public/g5k-postinstall-custom.tgz *

Make some cleanup:
Create the environment's description file (let's say custom-env.dsc) based on the reference one:
- use kaenv3 -p debian10-x64-base to have an example of environment description.

Your custom-env.dsc should look like this:

--- 
name: custom-env
version: 1
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2020012812.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: http://public/~<login>/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false

4. Run the deployment

Finally, we deploy our custom environment with your custom operations:

frontend:

kadeploy3 -a custom-env.dsc -f $OAR_NODE_FILE -p 1 -k --custom-steps custom-partitioning.yml

	Note
	In some case you should increase the step timeout (for some long formatting for example) see Advanced_Kadeploy#Adjusting timeout for some environments for details.

Example 2: Deploy on multiple partitions

In this example, we will modify the deployment workflow: a different partition will be used for each of the /, /home, /opt and /tmp directories. Imagine that you want to make your own partitioning scheme like that:

Mount point	Partition	Disk space	File System
swap	/dev/sda1	2G	linux-swap
/	/dev/sda2	18G	ext4
/var	/dev/sda3	30G	ext4
/opt	/dev/sda4	20G	ext4
/tmp	/dev/sda5	everything else	ext4

The four following sections describe how to perform such an operation.

1. Make the reservation in destructive mode

First of all, when you do your reservation, you must tell to OAR that it should redeploy the node entirely after the reservation with the -t destructive parameter:

frontend:

oarsub -t deploy -t destructive -l nodes=1,walltime=2 -I

2. Describe the custom operations

After that you have to create a file that describe the custom operations you want to be performed during the deployment. In our example we will first create apply our custom partitioning scheme, format the partition and the mount them.

The operation description file (let's say custom-partitioning.yml) should look like something like this:

---
# Our custom steps should be performed during the SetDeploymentEnv macro-step
SetDeploymentEnvUntrusted:
  # Custom partitioning step that is substituted to the create_partition_table micro-step
  create_partition_table:
    substitute:
      # We send a file on the node
      - action: send
        file: map.parted
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        destination: $KADEPLOY_TMP_DIR 
        name: send_partition_map
      # Then we execute the parted command using the previously sent file
      - action: exec
        name: partitioning
        # The variable $KADEPLOY_TMP_DIR will be substituted by Kadeploy
        command: parted -a optimal /dev/sda --script $(cat $KADEPLOY_TMP_DIR/map.parted)
  # Custom format step, done after the format_deploy_part micro-step
  format_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run 
        name: format_partitions
        file: format.sh
  # Custom mount step, done after the mount_deploy_part micro-step
  mount_deploy_part:
    post-ops:
      # We run the script contained in the file 'format.sh'
      - action: run 
        name: mount_partitions
        file: mount.sh
# Hack to disable useless steps
  format_tmp_part:
    substitute:
      - action: exec
        name: remove_format_tmp_part_step
        command: /bin/true
  format_swap_part:
    substitute:
      - action: exec
        name: remove_format_swap_part_step
        command: /bin/true

	Note
	In order for Kadeploy to be able to perform the installation correctly, every partitions have to be mounted before the installation process which is done in the macro-step BroadcastEnv

The file map.parted will look like something like this:

mklabel gpt
u GB mkpart partition-swap linux-swap 0% 2
u GB mkpart partition-system ext4 2 20
u GB mkpart partition-var ext4 20 50
u GB mkpart partition-opt ext4 50 70
u GB mkpart partition-tmp ext4 70 100%
toggle 2 boot
align-check optimal 1
align-check optimal 2
align-check optimal 3
align-check optimal 4
align-check optimal 5

The file format.sh will look like something like this:

#!/bin/sh
set -e

mkfs_opts="sparse_super,filetype,resize_inode,dir_index"
ext4_blocksize="4096"

# create swap
mkswap ${KADEPLOY_BLOCK_DEVICE}1
# / will be formated by Kadeploy since we will precise the -p 2 option
# formating /var
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}3
# formating /opt
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}4
# formating /tmp
mkfs -t ext4 -b ${ext4_blocksize} -O ${mkfs_opts} -q ${KADEPLOY_BLOCK_DEVICE}5

	Note
	When running a custom script, Kadeploy will export different variables, you can get a list of them by running "kadeploy -i".

The file mount.sh will look like something like this:

#!/bin/sh
set -e

# / will be mounted in ${KADEPLOY_ENV_EXTRACTION_DIR} by Kadeploy
# mount /var
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/var
mount ${KADEPLOY_BLOCK_DEVICE}3 ${KADEPLOY_ENV_EXTRACTION_DIR}/var/
# mount /opt
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/opt
mount ${KADEPLOY_BLOCK_DEVICE}4 ${KADEPLOY_ENV_EXTRACTION_DIR}/opt/
# mount /tmp
mkdir ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp
mount ${KADEPLOY_BLOCK_DEVICE}5 ${KADEPLOY_ENV_EXTRACTION_DIR}/tmp/

3. Customize the environment's postinstall

In order for our new partitions to be mounted at boot time we can modify the Grid'5000 postinstall files (this customization can also be done by adding another custom operation).

Create and go in a temporary directory:

frontend:

tmpdir=$(mktemp -d) && export tmpdir && pushd $tmpdir

Then decompress the postinstall archive:

frontend:

tar xzf /grid5000/postinstalls/g5k-postinstall.tgz

	Note
	We assume that the current shell is BASH, if not please replace the "export" instruction

Add your custom /etc/fstab file in this temporary directory, named fstab:

/dev/sda1       none          swap    sw       0      0
/dev/sda3       /var          ext4    defaults 1      2
/dev/sda4       /opt          ext4    defaults 1      2
/dev/sda5       /tmp          ext4    defaults 1      2

/ will be added by Kadeploy since we will precise the -p 2 option

Regenerate the postinstall archive:

frontend:

tar -czvf ~/g5k-postinstall-custom.tgz *

Make some cleanup:

frontend:

popd && rm -R $tmpdir

Create the environment's description file (let's say custom-env.yml) based on the reference one:

frontend:

kaenv3 -p debian10-x64-base -u deploy | sed -e "s/archive:.*$/archive: \/home\/${USER}\/g5k-postinstall-custom.tgz/" -e 's/public/shared/' > custom-env.yml

and customize the custom-env.yml file to suit your needs (especially your archive path):

--- 
name: custom-env
version: 1
description: Custom env based on Debian 10
author: me@domain.tld
visibility: shared
destructive: true
os: linux
image:
  file: server:///grid5000/images/debian10-x64-base-2019100414.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: /home/me/g5k-postinstall-custom.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab custom
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false

	Warning
	Do not forget the `--fstab custom` option to g5k-postinstall.

4. Run the deployment

Finally, we deploy our custom environment with your custom operations:

frontend:

kadeploy3 -a custom-env.yml -f $OAR_NODE_FILE -p 2 -k --custom-steps custom-partitioning.yml

	Note
	In some case you should increase the step timeout (for some long formatting for example) see Advanced_Kadeploy#Adjusting timeout for some environments for details.

Customizing the postinstalls

In Kadeploy3, postinstalls are scripts that are executed after the copy of the image file in order to customize site-specific or cluster-specific aspects. Since the beginning on 2018, on Grid'5000 the same postinstall script (called g5k-postinstall) is used for all reference environments (and is thus compatible with all supported Debian versions and distributions). That script takes parameters in order to define its behaviour (for example, to choose the style of network configuration to use).

Using g5k-postinstall

The source code for g5k-postinstall is available on gitlab.inria.fr. Its parameters at the time of writing are:

frontend:

g5k-postinstall --help

Usage: g5k-postinstall [options]

Options:
    -d, --debug                      Run in debug mode, with output to terminal
    -n, --net n1,n2,n3               Network configuration specification
    -f, --fstab f1,f2,f3             Filesystems configuration specification
    -r, --restrict-user MODE         User restriction mode
        --inittab PATTERN            Configure console in inittab.
        --no-ref-api                 Do not use the Reference API
        --disable-hacks h1,h2,h3     Hacks to disable

Valid tags for network specification:
  debian               write config in /etc/network/interfaces
  debian-bridged       write config in /etc/network/interfaces, with a bridge
                       setup (for the std env)
  netplan              write config in /etc/netplan/01-netcfg.yaml
                       (https://wiki.ubuntu.com/Netplan)
  redhat               write config in /etc/sysconfig/network-scripts/*
  traditional-names    use traditional NIC naming (e.g. eth0) instead of
                       predictable
  force-ref-api-names  force the use of the name provided in the reference API
                       (by default, the predictable name determined by the
                       kernel is used)
  hpc                  add support for HPC (eg InfiniBand) interfaces
Example: --net debian-bridged,traditional-names,hpc

Valid tags for filesystems configuration in fstab:
  nfs            include generic NFS mounts (/home with autofs, /grid5000)
  no-autofs      do not use autofs, just mount the user's NFS directory
  site-specific  include site-specific NFS mounts (e.g. /softs)
  custom         include custom mounts for custom partitionning. Need fstab
                 file included in postinstall archive.
Example: --fstab nfs,site-specific


Valid modes for user restriction:
  std         if deployed on production partition, restrict to root,oar. else,
              restrict to the current user (see below)
  current     restrict to root and the user having currently reserved the node
  login:jdoe  restrict to a specific login (e.g. jdoe)
  none        no restriction (DEFAULT)

Inittab option:
  Needed for non-systemd systems.
  Example of pattern: s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100
  Where SPEED and TTYSX are replaced by g5k-postinstall using information
  retieved from the parameters which are passed in /proc/cmdline.

No reference API option:
  Do not use the Reference API. This is useful during initial configuration of
  new clusters.

Disable hacks option:
  g5k-postinstall includes hacks that can optionally be disabled. Current
  hacks are:
  - oot-i40e          install the i40e driver on chifflot and chiclet (lille)
  - beegfs-gr520      configure beegfs shares on grcinq and grvingt (nancy)
Example: --disable-hacks oot-i40e

An example environment description using g5k-postinstall is:

---
name: debian10-x64-min
version: 2019100414
description: debian 10 (buster) - min
author: support-staff@list.grid5000.fr
visibility: public
destructive: false
os: linux
image:
  file: server:///grid5000/images/debian10-x64-min-2019100414.tgz
  kind: tar
  compression: gzip
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian
boot:
  kernel: "/vmlinuz"
  initrd: "/initrd.img"
filesystem: ext4
partition_type: 131
multipart: false

Things that you can do from there:

Use different parameters to change the behaviour of the postinstall. Example parameters for various situations are:
- Debian min environment with traditional NIC naming: g5k-postinstall --net debian --net traditional-names
- Debian min environment with predictable NIC naming: g5k-postinstall --net debian
- Debian NFS environment (mount /home, setup LDAP, restrict login to user who reserved the node): g5k-postinstall --net debian --fstab nfs --restrict-user current
- Debian big environment (NFS + setup HPC networks and mount site-specific directories): g5k-postinstall --net debian --net traditional-names --net hpc --fstab nfs --fstab site-specific
- RHEL/Centos style for network configuration: g5k-postinstall --net redhat --net traditional-names
- Ubuntu 1710 or later: NetPlan for network configuration: g5k-postinstall --net netplan
- Do not do any network configuration (useful for Gentoo), but force serial console settings: g5k-postinstall --inittab='s0:12345:respawn:/sbin/agetty -L SPEED TTYSX vt100'
Use a customized version of g5k-postinstall: after building a modified g5k-postinstall, just point the postinstalls/archive/ field to the new tar archive. See README.md and TechTeam:Postinstalls for details on g5k-postinstall internals.
Add additional postinstall to execute after g5k-postinstall. That way, the additionnal postinstall can be written in any language (e.g. just a shell script) and complement what is already done by g5k-postinstall, see below.

Adding an extra postinstall

Modifying g5k-postinstall may for some purpose not be relevant (or overkill), for instance if actions are independent and can be executed after g5k-postinstall is run. In such a case, it is very easy to provide an additional postinstall that for instance may just be a shell script:

Assuming additional-postinstall.sh is a script located at the root of the /home/jdoe/public/debiantesting-x64-additional-postinstall.tar.gz archive, we just have to declare the additional postinstall in the environment description as follows:

---
author: John Doe
boot:
  initrd: /initrd.img
  kernel: /vmlinuz
description: debian testing with some customizations
destructive: false
filesystem: ext4
image:
  compression: gzip
  file: local:///home/jdoe/public/debiantesting-x64-custom.tar.gz
  kind: tar
multipart: false
name: debiantesting-x64-custom
os: linux
partition_type: 131
postinstalls:
- archive: server:///grid5000/postinstalls/g5k-postinstall.tgz
  compression: gzip
  script: g5k-postinstall --net debian --fstab nfs --restrict-user current
- archive: local:///home/jdoe/public//debiantesting-x64-additional-postinstall.tar.gz
  compression: gzip
  script: additional-postinstall.sh
version: 2020071009
visibility: shared

Both postinstalls will execute during the deployment, the one after the other.

Advanced Kadeploy: Difference between revisions

Revision as of 10:39, 24 September 2020

Contents

What you need to know before starting

What is an Environment?

How can I make my own environment?

Disk partitioning

Search and deploy an existing environment

Search an environment

Make a job on a deployable node

Deploy a reference environment

Connect to the deployed environment and customize it

Create a new environment from a customized environment

Deployment options

Multisite deployment

Deploy on other partition of disk(s)

Deploy on sda2 or sda5

Deploy on additional disks

Tuning the Kadeploy3 deployment workflow

Adjusting timeout for some environments

Set Break-Point during deployment

Modify the deployment workflow with custom operations

Format additional disks

Use a custom partitioning scheme

Example 1: Deploy on the whole disk

Example 2: Deploy on multiple partitions

Customizing the postinstalls

Using g5k-postinstall

Adding an extra postinstall

Navigation menu