Update -L pscom-gateway/msa_fix_ld things, try to make it more understandable while those ugly workarounds are still... not that easy.
parent 367af959
@@ -73,17 +73,42 @@

MPI Traffic Across Modules
--------------------------

When the nodes of a job belong to different interconnects and MPI communication is used, bridging has to take place. To support this workflow, e.g. running a job on a Cluster with InfiniBand and a Booster with Omni-Path, a Gateway Daemon (psgwd, ParaStation Gateway Daemon) was implemented that takes care of moving packets across fabrics.

Loading MPI
~~~~~~~~~~~

**JURECA Cluster**

Communication with the psgwd has to be ensured by loading the software module **pscom-gateway**, either via ``xenv`` or via the ``module`` command.
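
For illustration, the two ways of loading the module might look like this (a minimal sketch; the module name is taken from above, the ``xenv`` form from the ``srun`` examples below):

.. code-block:: none

    # in the shell or batch script, before launching
    module load pscom-gateway

    # or per executable on the srun command line
    srun ... xenv -L pscom-gateway ./prog1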
**JURECA Booster, Current MPI Workaround (April/May/... 2019)**

For the time being, prefixing JURECA **Booster** binaries with ``msa_fix_ld`` is necessary, because the installed libmpi version does not support the psgwd. We hope this will go away soon.

``msa_fix_ld`` modifies the environment, so it might influence the modules you load:

.. code-block:: none

    #!/bin/bash
    # Select the PSM transport and prepend the psgwd-capable MPI installation
    export PSP_PSM=1
    export LD_LIBRARY_PATH="/usr/local/jsc/msa_parastation_mpi/lib:/usr/local/jsc/msa_parastation_mpi/lib/mpi-hpl-gcc/:${LD_LIBRARY_PATH}"
    # Execute the wrapped command with the adjusted environment
    $*
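
Used as a prefix, the wrapper simply runs the given command with this adjusted environment, e.g. ``msa_fix_ld ./prog2``, as in the ``srun`` examples below.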

Requesting Gateways
~~~~~~~~~~~~~~~~~~~

To request gateway nodes for a job, the mandatory option ``gw_num`` has to be specified at submission/allocation time.

- In total, 198 gateways are available.
- The gateways are exclusive resources; they are not shared across user jobs. This may change in the future.
- There is currently no enforced maximum on the number of gateways per job, besides the total number of gateways. This may change in the future.
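
For example, an interactive allocation spanning Cluster and Booster with two gateways might be requested as follows (a sketch only; account, partitions, and node counts are placeholders):

.. code-block:: none

    salloc -A <budget account> -p <batch, ...> -N 2 --gw_num=2 : -p <booster, ...> -N 4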

Submitting Jobs
~~~~~~~~~~~~~~~

To start an interactive pack job using two gateway nodes, the following command must be used:

.. code-block:: none

    srun -A <budget account> -p <batch, ...> --gw_num=2 xenv [-L ...] -L pscom-gateway ./prog1 : -p <booster, ...> xenv [-L ...] msa_fix_ld ./prog2

When submitting a job that will run later, you have to specify the number of gateways at submit time:

@@ -99,20 +124,8 @@
#SBATCH packjob
#SBATCH -p <booster, ...>

srun xenv [-L ...] -L pscom-gateway ./prog1 : xenv [-L ...] msa_fix_ld ./prog2
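
Pieced together, a complete batch script of this shape might look as follows (a sketch; the directives before ``packjob`` are not shown in the excerpt above and are assumptions, with account, partitions, node counts, and gateway count as placeholders):

.. code-block:: none

    #!/bin/bash
    #SBATCH -A <budget account>
    #SBATCH -p <batch, ...>
    #SBATCH -N 2
    #SBATCH --gw_num=2
    #SBATCH packjob
    #SBATCH -p <booster, ...>
    #SBATCH -N 4

    srun xenv [-L ...] -L pscom-gateway ./prog1 : xenv [-L ...] msa_fix_ld ./prog2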

PSGWD
~~~~~

@@ -143,13 +156,13 @@ By default, given the list of Cluster and Booster nodes obtained at allocation time

This mapping between Cluster and Booster nodes is saved into the routing file and used for the routing of the MPI traffic across the gateway nodes.

**Currently not available, will be available again with the next update:**

Since creating a routing file requires knowledge of the list of nodes prior to their allocation, it is more convenient to modify the logic with which the node pairs are assigned to the gateway daemons.

This can be done via the ``gw_plugin`` option:

.. code-block:: none

    srun --gw_plugin=$HOME/custom-route-plugin --gw_num=2 -N 1 hostname : -N 2 hostname

The ``gw_plugin`` option accepts either a label for a plugin already installed on the system or a path to a user-defined plugin.

Currently two plugins are available on the JURECA system:

@@ -157,7 +170,7 @@

* ``plugin01`` is the default plugin (used when the ``gw_file`` option is not used).
* ``plugin02`` is better suited for applications that use point-to-point communication between the same pairs of processes between Cluster and Booster, especially when the number of gateway nodes used is low.
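
For instance, to select the second installed plugin by its label rather than by a path (mirroring the ``srun`` line above):

.. code-block:: none

    srun --gw_plugin=plugin02 --gw_num=2 -N 1 hostname : -N 2 hostname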

The plugin file must include the functions associating a gateway node with a Cluster node / Booster node pair. As an example, the code for ``plugin01`` is reported here:

.. code-block:: python

@@ -169,7 +182,7 @@
        return None, numeralGw

    # Route function (extended interface): Make decision based on names of nodes to
    # take topology into account
    # def routeConnectionX(nodeListPartA, nodeListPartB, gwList, nodeA, nodeB):
    #     return Exception("Not implemented"), gwList[0]
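
A user-defined plugin implements the same interface. Below is a minimal sketch using the extended interface from the excerpt above; the return convention (an error or ``None``, followed by the chosen gateway) is assumed from ``plugin01``, and the body is purely illustrative:

.. code-block:: python

    # Hypothetical custom plugin: pin each (Cluster, Booster) node pair to one
    # gateway, chosen deterministically from the positions of the nodes in the
    # partition node lists.
    def routeConnectionX(nodeListPartA, nodeListPartB, gwList, nodeA, nodeB):
        # The same pair always maps to the same gateway, so point-to-point
        # traffic between fixed partners stays on a single gateway.
        idx = (nodeListPartA.index(nodeA) + nodeListPartB.index(nodeB)) % len(gwList)
        return None, gwList[idx]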