GNU Parallel

This page describes the use of GNU Parallel on Grid'5000.

Quoting GNU Parallel website:

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run  for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel. 

For more general and complete information, see the GNU Parallel website.

This page details Grid'5000-specific information to help you benefit from the tool on the platform.

About the GNU Parallel version installed in Grid'5000

The version of GNU Parallel installed on Grid'5000 nodes comes from Debian's official packaging.

It is a rather old version, but it seems sufficient.

If you need a more recent version, you can get the tarball provided at http://ftp.gnu.org/gnu/parallel/parallel-latest.tar.bz2 and install it in your home directory. This is straightforward (e.g. after unpacking the tarball, ./configure --prefix=$HOME/parallel && make install).
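The installation could look like the following sketch. It assumes that wget is available, that the tarball unpacks into a single parallel-* directory, and that $HOME/parallel is the chosen prefix; the function name install_parallel_in_home is purely illustrative.

```shell
# Sketch: build a recent GNU Parallel under $HOME/parallel.
# install_parallel_in_home is a hypothetical helper name, not a Grid'5000 tool.
install_parallel_in_home() {
  prefix="${1:-$HOME/parallel}"
  builddir="$(mktemp -d)" || return 1
  (
    cd "$builddir" || exit 1
    # Download and unpack the latest release tarball.
    wget http://ftp.gnu.org/gnu/parallel/parallel-latest.tar.bz2 || exit 1
    tar xjf parallel-latest.tar.bz2 || exit 1
    # Assumption: the archive unpacks into one parallel-* directory.
    cd parallel-*/ || exit 1
    ./configure --prefix="$prefix" && make && make install
  )
}

# Usage (not run here): install, then put the new binary first in PATH.
# install_parallel_in_home && export PATH="$HOME/parallel/bin:$PATH"
```

The build runs in a subshell so that the caller's working directory is left untouched.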

(An environment module could be provided if requested by some users.)

Benefits of using GNU Parallel in Grid'5000

While OAR is the Resource and Job Management System of Grid'5000 and supports managing batches of jobs, using it may be overkill for SPMD parallel executions of small tasks within a larger reservation. In concrete terms, a user may create a first OAR job to book a large set of resources for some time (e.g. for the night), and then have to submit a batch of many small tasks (e.g. each using only one core) within that first job.

For that purpose, using an OAR container for the first job, then OAR inner jobs for the small tasks, is overkill. (Note, however, that OAR container and inner jobs make sense when the jobs do not all come from the same user, e.g. for tutorials.)

We strongly advise using GNU Parallel to handle the execution of the small tasks within the initial OAR reservation of resources. That is, create only one OAR job to book the large set of resources (without the container job type), then use GNU Parallel within this job.

Note that when GNU Parallel handles the small tasks, the OAR restrictions (e.g. max 200 jobs in queue) do not apply.

How to use GNU Parallel in Grid'5000

GNU Parallel must be used within an OAR job: GNU Parallel does not book resources, it just manages the concurrent/parallel execution of tasks on already reserved resources.

Single node

Within an OAR job of only 1 node (host), there is nothing specific to Grid'5000 to know about using GNU Parallel to exploit all the cores of the node. Just run the parallel command in the job shell. See the GNU Parallel documentation or manual page for more information.
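For illustration, a minimal single-node run might look as follows. This is a sketch only: the echo command stands in for a real task, the output file name tasks.out is arbitrary, and the guard merely skips the run where GNU Parallel is not installed.

```shell
# Sketch: run 8 small tasks on the cores of the current node.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # With no -j option, GNU Parallel decides how many tasks to run at once
  # from the CPUs it detects; -k keeps the output in input order.
  seq 1 8 | parallel -k 'echo "task {} done"' > tasks.out
  cat tasks.out
else
  echo "GNU Parallel not found, skipping" >&2
fi
```

Each input line replaces {} in the command, so a real workload only needs the echo swapped for the actual program.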

Multiple nodes

Within an OAR job of many nodes (hosts), the user needs to tell GNU Parallel how to remotely execute the tasks on the nodes reserved in the OAR job, from the head node (where the parallel command is to be run).

  1. The user has to provide the list of target nodes to execute on, passed to the GNU Parallel --slf option.
  2. The user has to use the oarsh connector (unless the -t allow_classic_ssh OAR job type was used), by passing it to the GNU Parallel --ssh option.

Examples

Example 1: typical use of GNU Parallel in a multi-node reservation

Create an OAR job of 10 nodes
frontend:
oarsub -I -l host=10

We create an interactive job for this example, so that the commands below are executed in the opened job shell.

Create the sshlogin file for GNU Parallel from OAR_NODEFILE

The OAR node file lists each node name once per core of that node, so a node name appears on as many lines as the node has cores.

node:
uniq $OAR_NODEFILE > nodes

For a basic sshlogin file, GNU Parallel needs one line per host. It works out by itself how many tasks to execute on each host, based on the host's hardware thread count.
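To make the transformation concrete, the sketch below applies it to a made-up node file (the host names and core counts are hypothetical, not real Grid'5000 nodes). It also shows an optional variant that relies on the ncpus/host form GNU Parallel accepts in sshlogin files, to state explicitly how many tasks each host should receive.

```shell
# Sketch with a fake node file: 4 lines for node-1, 2 lines for node-2.
FAKE_NODEFILE=$(mktemp)
printf '%s\n' node-1 node-1 node-1 node-1 node-2 node-2 > "$FAKE_NODEFILE"

# One line per host; sort -u also works when duplicates are not adjacent.
sort -u "$FAKE_NODEFILE" > nodes
cat nodes

# Optional variant: prefix each host with its line count ("ncpus/host"),
# so GNU Parallel uses the core count recorded by OAR instead of detecting it.
sort "$FAKE_NODEFILE" | uniq -c | awk '{ print $1 "/" $2 }' > nodes_with_counts
cat nodes_with_counts

rm -f "$FAKE_NODEFILE"
```

In a real job, $OAR_NODEFILE takes the place of the fake file and either output file can be passed to --slf.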

Run parallel with the --ssh and --slf options
node:
parallel --progress --ssh $(which oarsh) --slf nodes command ::: argument

Note that all this can be scripted and passed to the oarsub command of a non-interactive OAR job.

Example 2: illustrating the execution of the GNU Parallel tasks in a Gantt chart

Our workspace is on the Nantes frontend, in our ~/public directory, because we will open the Gantt chart in a web browser at the end.

mkdir -p ~/public/parallel
cd ~/public/parallel
Create the GNU Parallel task script "task.sh"
cat <<'EOF' > task.sh
#!/bin/bash
DURATION=$(($RANDOM % 22 + 4))
echo "Running $0 $* on $(hostname)... (task will run for $DURATION s)."
sleep $DURATION
EOF

The task just sleeps for a random duration between 4 and 25 s.

Create the OAR script "oarjob.sh"
cat <<'EOF' > oarjob.sh
#!/bin/bash
[ -z "$OAR_NODEFILE" ] && echo "Error: Not in an OAR job" 1>&2 && exit 1
cd "${0%/*}" || exit 1
sort -u "$OAR_NODEFILE" > "$PWD/sshloginfile"
TASK_COUNT=${1:-10}
seq $TASK_COUNT | parallel --joblog $PWD/parallel.log --bar --ssh oarsh --sshloginfile $PWD/sshloginfile --use-cpus-instead-of-cores $PWD/task.sh
EOF

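We used the --use-cpus-instead-of-cores option of GNU Parallel, so that it counts CPUs rather than cores and only runs 2 tasks at a time on each (bi-CPU) node. The script starts the GNU Parallel command from the head node of the OAR job and generates the "parallel.log" file. Both scripts must be executable (chmod +x task.sh oarjob.sh) before the job is submitted.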

Create the JavaScript code to render the Gantt chart of the tasks

The parallel.log file is generated while GNU Parallel is executing the tasks. We create the ganttchart.html page to render the Gantt chart from it.

cat <<'EOF' > ganttchart.html 
<html>
  <head>
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <script type="text/javascript">
      google.charts.load("current", {packages:["timeline"]});
      google.charts.setOnLoadCallback(getLogs);
      function getLogs(){
        var request = new XMLHttpRequest();
        request.open('GET', './parallel.log?_=' + new Date().getTime(), true);
        request.responseType = 'text';
        request.send(null);
        request.onreadystatechange = function () {
          if (request.readyState === 4 && request.status === 200) {
            var lines = request.responseText.split('\n').filter(function(line) { return line.length > 0; });
            lines.shift();
            var data = lines.map(function(line) {
              var a = line.split('\t');
              return [ a[1], "Task "+a[0], new Date(a[2]*1000), new Date(a[2]*1000+a[3]*1000) ];
            });
            data.sort((a, b) => (a[0] > b[0]) ? 1 : ((a[0] < b[0]) ? -1 : ((a[1] > b[1]) ? 1 : -1)));
            var container = document.getElementById('gantt');
            var chart = new google.visualization.Timeline(container);
            var dataTable = new google.visualization.DataTable();
            dataTable.addColumn({ type: 'string', id: 'Ressource' });
            dataTable.addColumn({ type: 'string', id: 'JobId' });
            dataTable.addColumn({ type: 'date', id: 'Start' });
            dataTable.addColumn({ type: 'date', id: 'End' });
            dataTable.addRows(data);
            chart.draw(dataTable);
            setTimeout(getLogs, 3000);            
          }
        }
      }
    </script>
  </head>
  <body>
    <div id="gantt" style="height: 100%;"></div>
  </body>
</html>
EOF
Submit the OAR job named OAR2parallel requesting 8 nodes of the ecotype cluster, and executing the oarjob.sh script.
oarsub -n OAR2parallel -l nodes=8 -p "cluster='ecotype'" "./oarjob.sh 80"

80 tasks to run: roughly 10 tasks per node, i.e. 5 per CPU (bi-CPU nodes), but let's see the actual dispatching in the Gantt chart at the end!
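These rough figures can be checked with shell arithmetic. The sketch below also derives a ballpark run time, assuming an even spread of tasks over the CPUs and the 4 to 25 s task durations from task.sh. GNU Parallel dispatches tasks dynamically, so the real schedule will differ.

```shell
# Back-of-the-envelope figures for the job submitted above.
TASKS=80
NODES=8
CPUS_PER_NODE=2   # bi-CPU nodes, one task slot per CPU

SLOTS=$((NODES * CPUS_PER_NODE))      # concurrent tasks overall
PER_NODE=$((TASKS / NODES))           # tasks per node if spread evenly
PER_SLOT=$((TASKS / SLOTS))           # tasks per CPU if spread evenly
# Mean task duration is (4 + 25) / 2 s; integer arithmetic rounds down.
ESTIMATE=$((PER_SLOT * (4 + 25) / 2))

echo "$SLOTS slots, $PER_NODE tasks per node, $PER_SLOT per CPU, about $ESTIMATE s in total"
```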

Open the ganttchart.html web page

Open the Grid'5000 VPN so that you can access the web server, then open the following URL: http://public.nantes.grid5000.fr/~YOUR_LOGIN/parallel/ganttchart.html (replace YOUR_LOGIN).

The Gantt chart will show up and refresh every 3 seconds while GNU Parallel executes tasks.

For instance:

[Figure: Parallel-gantt.png, an example Gantt chart of the tasks]

Look at the progress of the job and the Gantt chart

GNU Parallel progress (--bar option) is written to the OAR job stderr file, which we can look at in a loop:

OAR_JOB_ID=$(oarstat -u | grep OAR2parallel | cut -f 1 -d ' ')
while oarstat -u | grep -q "$OAR_JOB_ID"; do
  sleep 1 && [ -r "OAR.OAR2parallel.$OAR_JOB_ID.stderr" ] && cat "OAR.OAR2parallel.$OAR_JOB_ID.stderr"
done
28% 23:57=47s 39

Wait until it reaches 100% and the OAR job finishes.
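
Once the job has finished, the same parallel.log that feeds the chart can also be summarized from the shell. The sketch below assumes the tab-separated joblog layout that the JavaScript code relies on (Seq, Host, Starttime, JobRuntime, then further columns). The sample log is made-up data for illustration, not output from a real run.

```shell
# Sketch: summarize a GNU Parallel joblog per host.
# The sample below is fabricated data in the joblog layout
# (tab-separated: Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command).
SAMPLE_LOG=$(mktemp)
printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$SAMPLE_LOG"
printf '1\tnode-1\t1700000000.000\t10.000\t0\t50\t0\t0\t./task.sh 1\n' >> "$SAMPLE_LOG"
printf '2\tnode-2\t1700000000.500\t4.000\t0\t50\t0\t0\t./task.sh 2\n' >> "$SAMPLE_LOG"
printf '3\tnode-2\t1700000004.500\t6.000\t0\t50\t0\t0\t./task.sh 3\n' >> "$SAMPLE_LOG"

# Skip the header, then count tasks and add up run times for each host.
awk -F '\t' 'NR > 1 { n[$2]++; t[$2] += $4 }
     END { for (h in n) printf "%s %d tasks %d s\n", h, n[h], t[h] }' "$SAMPLE_LOG" | sort > summary.txt
cat summary.txt

rm -f "$SAMPLE_LOG"
```

To use it on a real run, point awk at parallel.log instead of the sample file.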