"This notebook contains all examples as provided in Leufen et al. (2020). \n",
"Please follow the installation instructions provided in the [README](https://gitlab.version.fz-juelich.de/toar/mlair/-/blob/master/README.md) on gitlab. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 1\n",
"\n",
"The following cell imports MLAir and executes a minimalistic toy experiment. This cell is equivalent to Figure 2 in the manuscript."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import mlair\n",
"\n",
"# just give it a dry run without any modifications\n",
"mlair.run()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 2 \n",
"\n",
"In the following cell we use other station IDs provided as a list of strings (see also [JOIN-Web interface](https://join.fz-juelich.de/services/rest/surfacedata/) of the TOAR database for more details).\n",
"Moreover, we expand the `window_history_size` to 14 days and run the experiment. This cell is equivalent to Figure 3 in the manuscript."
"# expanded temporal context to 14 (days, because of default sampling=\"daily\")\n",
"window_history_size = 14\n",
"\n",
"# restart the experiment with little customisation\n",
"mlair.run(stations=stations, \n",
" window_history_size=window_history_size)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 3 \n",
"\n",
"The following cell loads the trained model from Example 2 and generates predictions for the two specified stations. \n",
"To ensure that the model is not retrained the keywords `create_new_model` and `train_model` are set to `False`. This cell is equivalent to Figure 4 in the manuscript. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# our new stations to use\n",
"stations = ['DEBY002', 'DEBY079']\n",
"\n",
"# same setting for window_history_size\n",
"window_history_size = 14\n",
"\n",
"# run experiment without training\n",
"mlair.run(stations=stations, \n",
" window_history_size=window_history_size, \n",
" create_new_model=False, \n",
" train_model=False)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 4\n",
"\n",
"The following cell demonstrates how a user defined model can be implemented by inheriting from `AbstractModelClass`. Within the `__init__` method `super().__init__`, `set_model` and `set_compile_options` should be called. Moreover, it is possible to set custom objects by calling `set_custom_objects`. Those custom objects are used to re-load the model (see also Keras documentation). For demonstration, the loss is added as custom object which is not required because a Keras built-in function is used as loss.\n",
"\n",
"The Keras-model itself is defined in `set_model` by using the sequential or functional Keras API. All compile options can be defined in `set_compile_options`.\n",
"This cell is equivalent to Figure 5 in the manuscript."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import keras\n",
"from keras.losses import mean_squared_error as mse\n",
"Embedding of a custom Run Module in a modified MLAir workflow. In comparison to examples 1 to 4, this code example works on a single step deeper regarding the level of abstraction. Instead of calling the run method of MLAir, the user needs to add all stages individually and is responsible for all dependencies between the stages. By using the `Workflow` class as context manager, all stages are automatically connected with the result that all stages can easily be plugged in. This cell is equivalent to Figure 6 in the manuscript."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"\n",
"class CustomStage(mlair.RunEnvironment):\n",
" \"\"\"A custom MLAir stage for demonstration.\"\"\"\n",
" def __init__(self, test_string):\n",
" super().__init__() # always call super init method\n",
" self._run(test_string) # call a class method\n",
" \n",
" def _run(self, test_string):\n",
" logging.info(\"Just running a custom stage.\")\n",
@@ -17,17 +17,19 @@ install the geo packages. For special instructions to install MLAir on the Jueli
* (geo) Install **proj** on your machine using the console. E.g. for opensuse / leap `zypper install proj`
* (geo) A c++ compiler is required for the installation of the program **cartopy**
* Install all requirements from [`requirements.txt`](https://gitlab.version.fz-juelich.de/toar/mlair/-/blob/master/requirements.txt)
preferably in a virtual environment
* (tf) Currently, TensorFlow-1.13 is listed in the requirements. We have already tested TensorFlow-1.15 and could not
  find any compatibility errors. Please note that tf-1.13 and tf-1.15 each have two distinct branches: the default branch
  for CPU support, and the "-gpu" branch for GPU support. If the GPU version is installed, MLAir will make use of the GPU
  device.
* Installation of **MLAir**:
* Either clone MLAir from the [gitlab repository](https://gitlab.version.fz-juelich.de/toar/mlair.git)
and use it without installation (beside the requirements)
* or download the distribution file ([current version](https://gitlab.version.fz-juelich.de/toar/mlair/-/blob/master/dist/mlair-0.12.1-py3-none-any.whl))
and install it via `pip install <dist_file>.whl`. In this case, you can simply import MLAir in any python script
inside your virtual environment using `import mlair`.
# How to start with MLAir
...
...
```python
mlair.run()
```
The logging output will show you a lot of information. Additional information (including debug messages) is collected
inside the experiment path in the logging folder.
```log
INFO: DefaultWorkflow started
INFO: ExperimentSetup started
INFO: Experiment path is: /home/<usr>/mlair/testrun_network
...
INFO: load data for DEBW001 from JOIN
INFO: load data for DEBW107 from JOIN
INFO: load data for DEBY081 from JOIN
INFO: load data for DEBW013 from JOIN
INFO: load data for DEBW076 from JOIN
INFO: load data for DEBW087 from JOIN
...
INFO: Training started
...
INFO: DefaultWorkflow finished after 0:03:04 (hh:mm:ss)
```
## Example 2
...
...
```python
mlair.run(stations=stations,
          window_history_size=window_history_size)
```
The output looks similar, but we can see that the new stations are loaded.
```log
INFO: DefaultWorkflow started
INFO: ExperimentSetup started
...
INFO: load data for DEBW030 from JOIN
INFO: load data for DEBW037 from JOIN
INFO: load data for DEBW031 from JOIN
INFO: load data for DEBW015 from JOIN
...
INFO: Training started
...
INFO: DefaultWorkflow finished after 00:02:03 (hh:mm:ss)
```
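To make the meaning of `window_history_size=14` for daily data concrete, here is a small self-contained sketch (plain NumPy; the function name is made up and not part of MLAir): each input sample contains the current day plus the 14 preceding days.

```python
import numpy as np

def make_history_windows(series, window_history_size):
    # each sample: the current value plus the preceding `window_history_size` values
    length = window_history_size + 1
    return np.array([series[i:i + length]
                     for i in range(len(series) - length + 1)])

daily_values = np.arange(30)                  # 30 days of dummy daily data
samples = make_history_windows(daily_values, 14)
print(samples.shape)                          # (16, 15): 16 samples of 15 days each
```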
## Example 3
...
...
```python
window_history_size = 14
mlair.run(stations=stations,
window_history_size=window_history_size,
create_new_model=False,
train_model=False)
```
We can see from the terminal output that no training was performed. The analysis is now carried out on the new stations.
```log
INFO: DefaultWorkflow started
...
INFO: No training has started, because train_model parameter was false.
...
INFO: DefaultWorkflow finished after 0:01:27 (hh:mm:ss)
```
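The interplay of the two keywords can be pictured with a small, self-contained decision sketch (plain Python; this illustrates the documented behaviour, it is not MLAir internals):

```python
def plan_run(create_new_model, train_model, saved_model_exists=True):
    # illustrative decision logic: which steps an experiment run would take
    steps = []
    if create_new_model or not saved_model_exists:
        steps.append("build new model")
    else:
        steps.append("load existing model")
    steps.append("train" if train_model else "skip training")
    steps.append("post-process")
    return steps

# as in Example 3 above: reuse the trained model and skip training entirely
print(plan_run(create_new_model=False, train_model=False))
# ['load existing model', 'skip training', 'post-process']
```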
...
...
```python
DefaultWorkflow.run()
```
The output of running this default workflow is structured as follows.
```log
INFO: DefaultWorkflow started
INFO: ExperimentSetup started
...
INFO: ExperimentSetup finished after 00:00:01 (hh:mm:ss)
...
...
INFO: Training finished after 00:02:15 (hh:mm:ss)
INFO: PostProcessing started
...
INFO: PostProcessing finished after 00:01:37 (hh:mm:ss)
INFO: DefaultWorkflow finished after 00:04:05 (hh:mm:ss)
```
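The stage sequencing visible in this output can be sketched in a few lines of plain Python (illustrative names only, not the MLAir API): a workflow runs its stages in order, and the stages exchange settings and results through a shared data store.

```python
class Workflow:
    """Minimal sketch of sequential run modules sharing a data store."""
    def __init__(self):
        self._stages = []
        self.data_store = {}

    def add(self, stage, **kwargs):
        # register a stage together with its keyword arguments
        self._stages.append((stage, kwargs))

    def run(self):
        # execute all registered stages in the order they were added
        for stage, kwargs in self._stages:
            stage(self.data_store, **kwargs)

def experiment_setup(store, epochs):
    store["epochs"] = epochs          # settings become visible to later stages

def training(store):
    store["trained_epochs"] = store["epochs"]

workflow = Workflow()
workflow.add(experiment_setup, epochs=128)
workflow.add(training)
workflow.run()
print(workflow.data_store)            # {'epochs': 128, 'trained_epochs': 128}
```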
# Customised Run Module and Workflow
...
...
```python
CustomWorkflow.run()
```
The output will look like:
```log
INFO: Workflow started
...
INFO: ExperimentSetup finished after 00:00:12 (hh:mm:ss)
INFO: CustomStage started
...
...
INFO: Just running a custom stage.
INFO: test_string = Hello World
INFO: epochs = 128
INFO: CustomStage finished after 00:00:01 (hh:mm:ss)