.. include:: system.rst

.. _het_modular_jobs:

Heterogeneous and Cross-Module Jobs
===================================

.. _het_modular_jobs_overview:

Overview
--------

.. _het_modular_jobs_slurm:

Slurm Support for Heterogeneous Jobs
------------------------------------

For detailed information about Slurm, please take a look at the :ref:`Quick Introduction <quickintro>` and :ref:`Batch system <batchsystem>` pages.

With Slurm 17.11, support for Heterogeneous Jobs was introduced. This makes it possible to spawn a job across multiple partitions of a cluster, and across different Modules of our Supercomputers. See the official Slurm documentation (SlurmHetJob_) for additional information on this feature.

.. _SlurmHetJob: https://slurm.schedmd.com/heterogeneous_jobs.html

**salloc/srun**

.. code-block:: none

    salloc -A <budget account> -p <batch, ...> : -p <booster, ...> [ : -p <booster, ...> ]
    srun ./prog1 : ./prog2 [ : ./progN ]

**sbatch**

.. code-block:: none

    #!/bin/bash
    #SBATCH -A zam
    #SBATCH -p <batch, ...>
    #SBATCH packjob
    #SBATCH -p <booster, ...>

    srun ./prog1 : ./prog2
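
For orientation, a complete heterogeneous batch script might look like the following sketch; the account, partition names, node counts and wall time are placeholders and have to be adapted to your project:

.. code-block:: none

    #!/bin/bash
    #SBATCH --account=<budget account>
    #SBATCH --partition=<batch, ...>
    #SBATCH --nodes=2
    #SBATCH --time=00:30:00
    #SBATCH packjob
    #SBATCH --partition=<booster, ...>
    #SBATCH --nodes=4

    # One pack component runs per partition; the colon separates the components.
    srun ./prog1 : ./prog2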

.. _het_modular_jobs_software:

Loading Software in a Heterogeneous Environment
-----------------------------------------------

Executing applications in a modular environment can be a challenging task, especially when different Modules have different architectures or when the dependencies of the programs are not uniform.

**Uniform Architecture and Dependencies**

As long as the architecture of the given Modules is uniform and there are no mutually exclusive dependencies for the binaries that are going to be executed, one can rely on the ``module`` command. Take a look at the :ref:`Quick Introduction <quickintro>` if ``module`` is new to you.

.. code-block:: none

    #!/bin/bash -x
    #SBATCH ...

    module load [...]
    srun ./prog1 : ./prog2
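
As a concrete sketch, the generic ``module load [...]`` above could for example load the ``intel-para`` toolchain that also appears in the ``xenv`` example further below; the module names available differ between systems and software stages, so check ``module avail`` first:

.. code-block:: none

    #!/bin/bash -x
    #SBATCH -A <budget account>
    #SBATCH -p <batch, ...>
    #SBATCH packjob
    #SBATCH -p <booster, ...>

    # Example module name only; both pack components share this one environment.
    module load intel-para

    srun ./prog1 : ./prog2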

**Non-Uniform Architectures and Mutually Exclusive Dependencies**

A tool called ``xenv`` was implemented to ease the task of loading modules for heterogeneous jobs. For details on the supported command line arguments, execute ``xenv -h`` on the given system.

.. code-block:: none

    srun --account=<budget account> --partition=<batch, ...> xenv -L intel-para IMB-1 : --partition=<knl, ...> xenv -L Architecture/KNL intel-para IMB-1
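
The same job can also be submitted as a batch script; the following sketch combines the ``packjob`` syntax from above with one ``xenv`` call per pack component (account and partition names are placeholders):

.. code-block:: none

    #!/bin/bash
    #SBATCH --account=<budget account>
    #SBATCH --partition=<batch, ...>
    #SBATCH packjob
    #SBATCH --partition=<knl, ...>

    # Each pack component loads its own environment via xenv.
    srun xenv -L intel-para IMB-1 : xenv -L Architecture/KNL intel-para IMB-1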

.. ifconfig:: system_name == 'jureca'

.. _het_modular_jobs_mpi_bridges:

MPI Traffic Across Modules
--------------------------

When the nodes of a job belong to different interconnects and MPI communication is used, bridging has to take place. To support this workflow, e.g. running a job on a Cluster with InfiniBand and a Booster with Omni-Path, a gateway daemon (psgwd, the ParaStation Gateway Daemon) was implemented that takes care of moving packets across the fabrics.

To request gateway nodes for a job, the mandatory option ``--gw_num`` has to be specified. In addition, communication with the psgwd has to be ensured by loading the software module **pscom-gateway**, either via ``xenv`` or via the ``module`` command.

To start an interactive pack job using two gateway nodes, the following commands can be used:

.. code-block:: none

    export PSP_GATEWAY=2
    srun -A <budget account> -p <cluster, ...> --gw_num=2 xenv -L pscom-gateway ./prog1 : -p <booster, ...> xenv -L pscom-gateway ./prog2

Here, ``PSP_GATEWAY=2`` ensures that the gateway protocol is used; it does **not** mean that two gateways are used!
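
For batch jobs, the request can be sketched as follows. Since the psgw plugin also extends ``sbatch`` (see the PSGWD section below), the ``--gw_num`` option is assumed here to be accepted as an ``#SBATCH`` directive; account and partition names are placeholders:

.. code-block:: none

    #!/bin/bash
    #SBATCH --account=<budget account>
    #SBATCH --partition=<cluster, ...>
    #SBATCH --gw_num=2
    #SBATCH packjob
    #SBATCH --partition=<booster, ...>

    # Make sure the gateway protocol is used (this does not set the number of gateways).
    export PSP_GATEWAY=2

    srun xenv -L pscom-gateway ./prog1 : xenv -L pscom-gateway ./prog2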

For debugging purposes, and to make sure the gateways are used, you might use

.. code-block:: none

    export PSP_DEBUG=3

You should see output like

.. code-block:: none

    <PSP:r0000003:CONNECT (192.168.12.34,26708,0x2,r0000003) to (192.168.12.41,29538,0x2,r0000004) via gw>
    <PSP:r0000004:ACCEPT (192.168.12.34,26708,0x2,r0000003) to (192.168.12.41,29538,0x2,r0000004) via gw>

PSGWD
~~~~~

The psgw plugin for the ParaStation management daemon extends the Slurm commands ``salloc``, ``srun`` and ``sbatch`` with the following options:

.. code-block:: none

    --gw_file=path       Path to the gateway routing file
    --gw_plugin=string   Name of the route plugin
    --gw_num=number      Number of gateway nodes

A routing file will be generated in ``$HOME/psgw-route-$JOBID`` and is automatically removed when the allocation is revoked. With the option ``--gw_file``, an alternative location for the routing file can be specified using an absolute path:

.. code-block:: none

    srun --gw_file=/home-fs/rauh/route-file --gw_num=2 -N 1 hostname : -N 2 hostname

The route plugin can be changed using the ``--gw_plugin`` option. Currently only the default plugin ``plugin01`` is available.

.. code-block:: none

    srun --gw_plugin=plugin01 --gw_num=2 -N 1 hostname : -N 2 hostname

If more gateways are requested than are available, the slurmctld prologue will fail for interactive jobs:

.. code-block:: none

    srun --gw_num=3 -N 1 hostname : -N 2 hostname
    srun: psgw: requesting 3 gateway nodes
    srun: job 158553 queued and waiting for resources
    srun: job 158553 has been allocated resources
    srun: PrologSlurmctld failed, job killed
    srun: Force Terminated job 158553
    srun: error: Job allocation 158553 has been revoked

If batch jobs run out of gateway resources, they will be re-queued and have to wait for 10 minutes before becoming eligible to start again.