Submitit

From Grid5000

Submitit is a lightweight tool for submitting Python functions for computation on a Slurm cluster. It wraps job submission and provides access to results, logs and more. Slurm is an open-source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Submitit makes it possible to switch seamlessly between executing on Slurm and executing locally. An OAR plugin is under development to ease the transition between OAR-based and Slurm-based resource managers. Source code, issues and pull requests are available in the project repository.

Basic usage

Submitit installation

pip can be used to install the stable release of submitit:

flille: pip install submitit

Alternatively, conda can be used to install submitit from conda-forge:

flille: conda install -c conda-forge submitit

An installation from source can also be used to get the latest version from the main branch:

Performing an addition with Submitit

The following Python script runs an addition job on Slurm, on OAR, or locally.

import submitit

def add(a, b):
    return a + b

# logs are dumped in the "log_test" folder
executor = submitit.AutoExecutor(folder="log_test")

job_addition = executor.submit(add, 5, 7)  # will compute add(5, 7)
print('job_addition: ', job_addition)  # ID of your job
output = job_addition.result()  # waits for completion and returns output
print('job_addition output: ', output)
assert output == 12  # 5 + 7 = 12...  your addition was computed on the cluster

The example script can be launched on the frontend as follows:

flille: python3 this-script.py

Advanced usage