Commit 3cd5eca1 authored by Benedikt von St. Vieth

Merge remote-tracking branch 'origin/jacopo_edit'

parents 482cf841 5b6bc278
@@ -124,16 +124,21 @@ The psgw plugin for the ParaStation management daemon extends the Slurm commands
.. code-block:: none

    --gw_num=number      Number of gateway nodes
    --gw_file=path       Path to the gateway routing file
    --gw_plugin=string   Name of the route plugin

A routing file will be generated in $HOME/psgw-route-$JOBID. The routing file is automatically removed when the allocation is revoked.
PSGWD Routing Plugins
+++++++++++++++++++++
PSGWD Routing
+++++++++++++
With the option ``gw_file`` an alternative location using an absolute path for the routing file can be specified:
The routing of MPI traffic across the Gateway nodes is performed by the ParaStation Gateway daemon on a per-node-pair basis.
When a certain number of gateway nodes is requested, an instance of psgwd is launched on each gateway.
By default, given the list of Cluster and Booster nodes obtained at allocation time, the system assigns each Cluster node - Booster node pair to one of the previously launched instances of psgwd.
This mapping between Cluster and Booster nodes is saved into the routing file and used for the routing of the MPI traffic across the gateway nodes.
With the option ``gw_file`` an absolute path to a user-defined routing file can be specified:
.. code-block:: none
@@ -141,12 +146,52 @@ With the option ``gw_file`` an alternative location using an absolute path for t
**Currently not available, will be available again with the next update:**
The route plugin can be changed using the ``gw_plugin`` option. Currently only the default plugin ``plugin01`` is available.
Using this approach one can adjust the actual routing of the MPI traffic.
Since creating a routing file requires knowledge of the list of nodes prior to their allocation, it is more convenient to modify the logic with which the node pairs are assigned to the gateway daemons.
This can be done via the ``gw_plugin`` option:
.. code-block:: none

    srun --gw_plugin=$HOME/custom-route-plugin --gw_num=2 -N 1 hostname : -N 2 hostname

The ``gw_plugin`` option accepts either a label for a plugin already installed on the system or a path to a user-defined plugin.
Currently two plugins are available on the JURECA system:
* ``plugin01`` is the default plugin (used when the ``gw_file`` option is not specified).
* ``plugin02`` is better suited for applications in which the same pairs of Cluster and Booster processes communicate point-to-point, especially when the number of gateway nodes is low.
The plugin file must define the functions that associate a gateway node with each Cluster node - Booster node pair.
As an example, the code for ``plugin01`` is shown here:
.. code-block:: python

    # Route function: Given the numerical Ids of nodes in partition A and B, the function
    # returns a tuple (error, numeral of gateway)
    def routeConnectionS(sizePartA, sizePartB, numGwd, numeralNodeA, numeralNodeB):
        numeralGw = (numeralNodeA + numeralNodeB) % numGwd
        return None, numeralGw

    # Route function (extended interface): Make decision based on names of nodes to
    # take topology into account
    # def routeConnectionX(nodeListPartA, nodeListPartB, gwList, nodeA, nodeB):
    #     return Exception("Not implemented"), gwList[0]
    routeConnectionX = None

In the case of 2 Cluster nodes, 2 Booster nodes and 2 Gateway nodes, this function results in the following mapping:

============ ============ ============
Cluster node Booster node Gateway node
============ ============ ============
0            0            0
1            0            1
0            1            1
1            1            0
============ ============ ============

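
The mapping above can be checked offline before submitting a job. The following is a minimal sketch, not part of psgw: it copies ``routeConnectionS`` from ``plugin01`` and calls it directly for every node pair. The helper name ``build_mapping`` is hypothetical, and the sketch assumes nothing about how the gateway daemon itself loads or invokes a plugin.

```python
from itertools import product


# Copied from plugin01: returns a tuple (error, numeral of gateway).
def routeConnectionS(sizePartA, sizePartB, numGwd, numeralNodeA, numeralNodeB):
    numeralGw = (numeralNodeA + numeralNodeB) % numGwd
    return None, numeralGw


# Hypothetical helper (not part of psgw): evaluate a route function for every
# Cluster node - Booster node pair and collect the chosen gateway numeral.
def build_mapping(route, sizePartA, sizePartB, numGwd):
    mapping = {}
    for nodeB, nodeA in product(range(sizePartB), range(sizePartA)):
        error, gw = route(sizePartA, sizePartB, numGwd, nodeA, nodeB)
        if error is not None:
            raise error
        mapping[(nodeA, nodeB)] = gw
    return mapping


if __name__ == "__main__":
    # 2 Cluster nodes, 2 Booster nodes, 2 Gateway nodes, as in the table.
    for (nodeA, nodeB), gw in build_mapping(routeConnectionS, 2, 2, 2).items():
        print(f"cluster={nodeA} booster={nodeB} gateway={gw}")
```

Passing a different route function with the same signature to ``build_mapping`` gives a quick way to inspect how a user-defined plugin would spread node pairs over the gateways.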
PSGWD Gateway Assignment