GNU Parallel

Revision as of 10:22, 7 April 2020

This page describes the use of GNU Parallel on Grid'5000.

Quoting GNU Parallel website:

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run  for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel. 

For more general and complete information, see the GNU Parallel website.

This page details the Grid'5000-specific information that lets you benefit from the tool on the platform.

About the GNU Parallel version installed in Grid'5000

The version of GNU Parallel installed on Grid'5000 nodes comes from the official Debian package.

It is a rather old version, but it seems sufficient.

If you need a more recent version, you can download the tarball provided at http://ftp.gnu.org/gnu/parallel/parallel-latest.tar.bz2 and install it in your home directory. This is straightforward (e.g. ./configure --prefix=$HOME/parallel && make install).

(An environment module could be provided if requested by some users.)

Benefits of using GNU Parallel on Grid'5000

While OAR is the Resource and Job Management System of Grid'5000 and supports the management of batches of jobs, it may be overkill for handling SPMD parallel executions of small tasks within a larger reservation. In concrete terms, a user may create a first OAR job in order to book a large set of resources for some time (e.g. for the night), and then have to submit a batch of many small tasks (e.g. each using only one core) within that first job.

For that purpose, using an OAR container for the first job, then OAR inner jobs for the small tasks, is overkill. (Note, however, that using an OAR container and inner jobs makes sense when the jobs are not all from the same user, for tutorials for instance.)

We strongly advise using GNU Parallel to handle the execution of the small tasks within the initial OAR reservation of resources. That means creating only one OAR job to book the large set of resources (not using the container job type), then using GNU Parallel within this job.

Note that when GNU Parallel handles the small tasks, the OAR restrictions (e.g. max 200 jobs in queue) do not apply to them.

How to use GNU Parallel in Grid'5000

GNU Parallel must be used within an OAR job: GNU Parallel does not book resources, it just manages the concurrent/parallel execution of tasks on already reserved resources.

Single node

Within an OAR job of only 1 node (host), there is nothing specific to Grid'5000 to know about the usage of GNU Parallel in order to exploit all the cores of the node. Just run the parallel command in the job shell. See the GNU Parallel documentation or manual page for more information.

Multiple nodes

Within an OAR job of many nodes (hosts), the user needs to tell GNU Parallel how to remotely execute the tasks on the nodes reserved in the OAR job, from the head node (where the parallel command is to be run).

  1. The user has to provide the list of target nodes to execute on, by passing it to the GNU Parallel --sshloginfile (--slf) option. It is possible to use the OAR node file $OAR_NODEFILE. Note that this file contains one line with the node name per core of that node. Having these duplicates is not an issue for GNU Parallel, as it preprocesses this file with a uniq.
  2. The user has to use the oarsh connector (unless the -t allow_classic_ssh OAR job type was used), by passing it to the GNU Parallel --ssh option.
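
To illustrate the duplicate entries mentioned in point 1, here is a minimal sketch that builds a mock node file and lists the distinct hosts. The host names and the unique_nodes helper are made up for illustration; in a real job the file to inspect would be $OAR_NODEFILE.

```bash
#!/bin/bash
# Mock of an OAR node file: one line per reserved core (host names are hypothetical).
nodefile=$(mktemp)
printf '%s\n' node-1 node-1 node-1 node-1 node-2 node-2 node-2 node-2 > "$nodefile"

# Print each distinct host once, similar to a uniq-style preprocessing.
unique_nodes() {
  sort -u "$1"
}

unique_nodes "$nodefile"
echo "lines in node file: $(grep -c . "$nodefile"), distinct hosts: $(unique_nodes "$nodefile" | grep -c .)"
```

With this mock file, the node file has 8 lines but only 2 distinct hosts, which is the list GNU Parallel actually needs.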

Typical usage and some examples

Typical coupling of GNU Parallel and OAR in a multi-node reservation

Create an OAR job of 10 nodes
frontend:
oarsub -I -l host=10

We create an interactive job for this example, so that the commands below are executed in the opened job shell.

Note that all this can also be scripted and passed to the oarsub command of a non-interactive OAR job.

Run parallel with the --ssh and --sshloginfile options
node:
parallel --progress --ssh oarsh --sshloginfile $OAR_NODEFILE command ::: arguments
Note

GNU Parallel has a great many features for passing commands and arguments, and provides as many options. The examples below give some hints, but please refer to the GNU Parallel documentation, examples and tutorials on its website for complete information.

Example 1: illustrating the execution of the GNU Parallel tasks in a gantt chart

In this example we use GNU Parallel and build a gantt diagram showing a timeline of the execution of the GNU Parallel tasks within an OAR job.

Below is an example of such a diagram. It illustrates how the GNU Parallel tasks are dispatched on the resources previously reserved in an OAR job. This complements the OAR drawgantt display by showing what is happening inside the OAR job.

Parallel-gantt.png

Below we show how GNU Parallel is used in this case, and the steps to build the diagram.

Initiate your workspace

Our workspace is on the Nantes frontend (any other site would do as well), in our ~/public directory, because we will open the gantt chart in a web browser at the end.

mkdir -p ~/public/parallel
cd ~/public/parallel
Create the GNU Parallel task script "task.sh"

This is where the actual user program (e.g. a numerical simulation) is to be placed. Here, for the example, we simply sleep for a random time, from 4 to 25 s. The parameters of the user program ($@) will be passed by GNU Parallel.

#!/bin/bash
# Pick a random duration between 4 and 25 seconds.
DURATION=$((RANDOM % 22 + 4))
echo "Running $0 with params $* on $(hostname)... (task will run for $DURATION s)."
# Stand-in for the actual user program.
sleep "$DURATION"

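
The 4 to 25 s range follows from the modulo arithmetic in task.sh. As a minimal sketch, the hypothetical helper duration_for below applies the same expression to a value given as argument instead of $RANDOM, so that the bounds can be checked deterministically.

```bash
#!/bin/bash
# Same arithmetic as in task.sh, with the random value passed as an argument
# (duration_for is a helper introduced here for illustration only).
duration_for() {
  echo $(( $1 % 22 + 4 ))
}

# In bash, $RANDOM yields integers from 0 to 32767; the remainder modulo 22 is 0..21.
echo "remainder 0  -> $(duration_for 0) s"
echo "remainder 21 -> $(duration_for 21) s"
echo "RANDOM=32767 -> $(duration_for 32767) s"
```

The smallest and largest remainders give 4 s and 25 s respectively, so every task duration falls within that range.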

Create the OAR script "oarjob.sh"
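#!/bin/bash
# Abort unless run inside an OAR job.
[ -z "$OAR_NODEFILE" ] && echo "Error: Not in an OAR job" 1>&2 && exit 1
# Work from the directory containing this script.
cd "${0%/*}" || exit 1
# Number of tasks to run (default: 10).
TASK_COUNT=${1:-10}
seq "$TASK_COUNT" | parallel --joblog "$PWD/parallel.log" --bar --ssh oarsh --sshloginfile "$OAR_NODEFILE" --use-cpus-instead-of-cores "$PWD/task.sh"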

Here we simply pass a list of parameters to parallel on its standard input, using the seq command. The parameters are thus the integers from 1 to TASK_COUNT: TASK_COUNT tasks will be executed, each receiving one of those integers as its parameter. See the GNU Parallel documentation for more details and other options.

We used the --use-cpus-instead-of-cores option of GNU Parallel, so that it will only run 2 tasks at a time on each node.

The script will start the GNU Parallel command from the head node of the OAR job, and generate the "parallel.log" file.
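
The job log is a tab-separated text file with one header line followed by one line per finished task. As a minimal sketch of how it can be inspected from the shell, the snippet below writes a small hand-made sample in that layout (host names, timestamps and runtimes are invented for illustration) and summarizes the task count and cumulated runtime per host. The summarize_joblog helper is hypothetical, not part of GNU Parallel.

```bash
#!/bin/bash
# Hand-written sample in the tab-separated layout of a GNU Parallel job log
# (all values below are made up for illustration).
joblog=$(mktemp)
{
  printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n'
  printf '1\tnode-1\t1586247720.000\t12.010\t0\t60\t0\t0\t./task.sh 1\n'
  printf '2\tnode-2\t1586247720.100\t7.020\t0\t60\t0\t0\t./task.sh 2\n'
  printf '3\tnode-1\t1586247732.200\t5.030\t0\t60\t0\t0\t./task.sh 3\n'
} > "$joblog"

# Count the tasks and sum the runtimes (column 4) per host (column 2),
# skipping the header line.
summarize_joblog() {
  LC_ALL=C awk -F'\t' 'NR > 1 { n[$2]++; t[$2] += $4 }
       END { for (h in n) printf "%s %d %.1f\n", h, n[h], t[h] }' "$1" | LC_ALL=C sort
}

summarize_joblog "$joblog"
```

The same host, start time and runtime columns are the ones the gantt chart page relies on.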

Create the html/javascript code to render the gantt chart of the tasks in "ganttchart.html"

The parallel.log file is generated while GNU Parallel is executing the tasks. We create the ganttchart.html web page to render the gantt chart from it.

<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
      google.charts.load("current", {packages:["timeline"]});
      google.charts.setOnLoadCallback(getLogs);
      function getLogs(){
        var request = new XMLHttpRequest();
        request.open('GET', './parallel.log?_=' + new Date().getTime(), true); //force cache miss
        request.responseType = 'text';
        request.send(null);
        request.onreadystatechange = function () {
          if (request.readyState === 4 && request.status === 200) {
            var lines = request.responseText.split('\n').filter(function(line) { return line.length > 0; });
            lines.shift();
            var data = lines.map(function(line) {
              var a = line.split('\t'); // columns: Seq, Host, Starttime, JobRuntime, ...
              return [ a[1], "Task "+a[0], new Date(a[2]*1000), new Date(a[2]*1000+a[3]*1000) ];
            });
            data.sort((a, b) => (a[0] > b[0]) ? 1 : ((a[0] < b[0]) ? -1 : ((a[1] > b[1]) ? 1 : -1)));
            var container = document.getElementById('gantt');
            var chart = new google.visualization.Timeline(container);
            var dataTable = new google.visualization.DataTable();
            dataTable.addColumn({ type: 'string', id: 'Ressource' });
            dataTable.addColumn({ type: 'string', id: 'JobId' });
            dataTable.addColumn({ type: 'date', id: 'Start' });
            dataTable.addColumn({ type: 'date', id: 'End' });
            dataTable.addRows(data);
            chart.draw(dataTable);
            setTimeout(getLogs, 3000);            
          }
        }
      }
    </script>
  </head>
  <body>
    <div id="gantt" style="height: 100%;"></div>
  </body>
</html>
Submit the OAR job named OAR2parallel requesting 8 nodes of the ecotype cluster, and executing the oarjob.sh script.
oarsub -n OAR2parallel -l nodes=8 -p "cluster='ecotype'" "./oarjob.sh 80"

80 tasks to run: roughly 10 tasks per node, i.e. 5 per CPU (bi-CPU nodes), but look at the gantt chart to see how the tasks are actually dispatched!
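
The rough estimate above is simple integer arithmetic. As a sketch, using the values from the oarsub command and assuming 2 CPUs per node as stated:

```bash
#!/bin/bash
# Values taken from the oarsub command above; bi-CPU nodes as stated in the text.
tasks=80
nodes=8
cpus_per_node=2

tasks_per_node=$(( tasks / nodes ))
tasks_per_cpu=$(( tasks_per_node / cpus_per_node ))
# With --use-cpus-instead-of-cores, one task runs per CPU at a time.
slots=$(( nodes * cpus_per_node ))
echo "tasks per node: $tasks_per_node, tasks per CPU: $tasks_per_cpu, concurrent tasks: $slots"
```

This is only an average: task durations are random and GNU Parallel starts the next task as soon as a slot becomes free, so the actual counts per node can differ.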

Open the ganttchart.html web page

Open the web page in a web browser on your workstation:

(Remember to replace YOUR_LOGIN with your actual Grid'5000 user login.)

The gantt chart will show up and refresh every 3 seconds while GNU Parallel executes tasks. It should look very similar to the diagram shown at the beginning of this section.

Bonus: show the progress

The GNU Parallel progress (--bar option) is written to the OAR job stderr file. We can look at it in a loop:

OAR_JOB_ID=$(oarstat -u | grep OAR2parallel | cut -f 1 -d ' ')
while oarstat -u | grep -q "$OAR_JOB_ID"; do
  sleep 1 && [ -r "OAR.OAR2parallel.$OAR_JOB_ID.stderr" ] && cat "OAR.OAR2parallel.$OAR_JOB_ID.stderr"
done
28% 23:57=47s 39

Wait until it reaches 100% and the OAR job finishes.

Example 2: Finding the best kernel parameters of an SVM algorithm

Let's consider an ML problem in which we want to find the best kernel parameters of an SVM algorithm. To do that, we first need to build several training models configured with different kernel parameter values, and later compute precision and recall for all these models to find the best one. In this example, we focus only on building all the training models.

We consider a Python program training.py that takes as arguments the two kernel parameters for the SVM problem: --gamma and --c.

We want to generate the SVM models for:

  • gamma values ranging in [0.1, 0.01, 0.001, 0.0001]
  • C values ranging in [1, 10, 100, 1000]

The number of trainings corresponds to the cross-product of C values and gamma values. Considering the previous ranges, we need to build 16 SVM models. We can build the 16 SVM models iteratively but to speed up the process, especially for big data sets, it is better to parallelize and distribute the computation on multiple nodes.
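
As a minimal sketch, the cross-product can be enumerated with two nested loops. The list_trainings helper is introduced here for illustration only; it prints the command line of each training without running anything.

```bash
#!/bin/bash
# Enumerate every (gamma, C) pair, i.e. the cross-product of the two ranges.
list_trainings() {
  for gamma in 0.1 0.01 0.001 0.0001; do
    for c in 1 10 100 1000; do
      echo "./training.py --gamma $gamma --c $c"
    done
  done
}

list_trainings
echo "number of trainings: $(list_trainings | grep -c .)"
```

These are the same 16 combinations that GNU Parallel generates when given the two value lists as separate ::: input sources.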

As our SVM algorithm is mono-threaded and the nodes we want to use have 4 cores, we need at least 4 nodes to distribute the 16 computations. To do so, we first make an OAR reservation to book the 4 nodes, and then we execute the following command to start 4 computations on each of the 4 nodes:

frontend:
oarsub -I -l nodes=4
node-1:
parallel --ssh oarsh --sshloginfile $OAR_FILE_NODES ./training.py --gamma {1} --c {2} ::: 0.1 0.01 0.001 0.0001 ::: 1 10 100 1000

As a result, this command line will execute (the order may be different):

  • on node 1 ./training.py --gamma 0.1 --c 1
  • on node 1 ./training.py --gamma 0.1 --c 10
  • on node 1 ./training.py --gamma 0.1 --c 100
  • on node 1 ./training.py --gamma 0.1 --c 1000
  • on node 2 ./training.py --gamma 0.01 --c 1
  • on node 2 ./training.py --gamma 0.01 --c 10
  • on node 2 ./training.py --gamma 0.01 --c 100
  • on node 2 ./training.py --gamma 0.01 --c 1000
  • on node 3 ./training.py --gamma 0.001 --c 1
  • on node 3 ./training.py --gamma 0.001 --c 10
  • on node 3 ./training.py --gamma 0.001 --c 100
  • on node 3 ./training.py --gamma 0.001 --c 1000
  • on node 4 ./training.py --gamma 0.0001 --c 1
  • on node 4 ./training.py --gamma 0.0001 --c 10
  • on node 4 ./training.py --gamma 0.0001 --c 100
  • on node 4 ./training.py --gamma 0.0001 --c 1000