Tags give the ability to mark specific points in history as being important
  • v1.1.0 protected   hourly resolution support and new data handlers
    Release v1.1.0
    • general:
      • MLAir can be used with 1H resolution data from JOIN
      • new data handlers to use the Kolmogorov-Zurbenko filter and mixed sampling types
    • new features:
      • new data handler DataHandlerKzFilter to use Kolmogorov-Zurbenko filter (kz filter) on inputs (#195)
      • new data handler DataHandlerMixedSampling that can use mixed sampling types for input and target (#197)
      • new data handler DataHandlerMixedSamplingWithFilter that uses kz filter and mixed sampling (#197)
      • new data handler DataHandlerSeparationOfScales to use filter-dependent time step sizes on filtered inputs using mixed sampling (#196)
    • technical:
      • bug fix for very short time series in TimeSeriesPlot (#215)
      • bug fix for variable dictionary when using hourly resolution (#212)
      • variable naming for data from JOIN interface harmonised (#206)
      • transformation setup is now separated for inputs and targets (#202)
      • bug fix in PlotClimatologicalSkillScore if only single station is used (#193)
      • preprocessed data is now stored inside experiment and not in the data folder
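
The Kolmogorov-Zurbenko filter named in the v1.1.0 notes is, in its textbook form, a centred moving average applied several times in a row. The sketch below shows only that idea in plain Python. The function names and the truncated window at the series edges are assumptions made for illustration and do not describe how DataHandlerKzFilter is implemented.

```python
def moving_average(values, window):
    """Centred moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        segment = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def kz_filter(values, window, iterations):
    """Kolmogorov-Zurbenko filter: iterate a moving average of odd length."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    result = [float(v) for v in values]
    for _ in range(iterations):
        # Each pass smooths the output of the previous pass.
        result = moving_average(result, window)
    return result


print(kz_filter([0, 0, 3, 0, 0], window=3, iterations=1))
# [0.0, 1.0, 1.0, 1.0, 0.0]
```

More iterations or a longer window give a smoother result, which is what makes the filter useful for separating slow and fast components of a time series.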
  • IntelliO3-ts-v1.0_R1-submit   This version was used for R1 of https://gmd.copernicus.org/preprints/gmd-2020-169/

  • v1.0.0 protected   official release of new version 1.0.0
    Release v1.0.0
    • general:
      • This is the first official release of MLAir ready for use
      • updated license and installation instructions
    • technical:
      • restructured order of packages in requirements
  • v0.12.2 protected   HDFML support
    Release v0.12.2
    • general:
      • HDFML support
    • technical:
      • installation script for HDFML adjusted, #183
  • v0.12.1 protected
    b0b449f1 · release v0.12.1 ·
    Release v0.12.1
    • general:
      • introduced a notebook documentation for easy starting, #174
      • updated special installation instructions for the Juelich HPC systems, #172
    • new features:
      • the names of the input and output shapes are consistently renamed to input_shape and output_shape, #175
    • technical:
      • it is possible to assign a custom name to a run module (e.g. used in logging), #173
  • v0.12.0 protected   Documentation and Bugfixes
    Release v0.12.0
    • general:
      • improved documentation, including installation instructions and many examples from the paper, #153
      • bugfixes (see technical)
    • new features:
      • MyLittleModel is now a pure feed-forward network (before it had a CNN part), #168
    • technical:
      • new compile options check to ensure its execution, #154
      • bugfix for key errors in time series plot, #169
      • bugfix for unused kwargs in DefaultDataHandler, #170
      • the trainable parameter is renamed to train_model to prevent confusion with the tf trainable parameter, #162
      • fixed HPC installation failure, #159
  • v0.11.0 protected   v0.11.0
    Release v0.11.0
    • general:
      • Introduce advanced data handling with much more flexibility (independent of TOAR DB, custom data handling is pluggable), #144
      • default data handler is still using TOAR DB
    • new features:
      • default data handler using TOAR DB refactored according to advanced data handling, #140, #141, #152
      • data sets are handled as collections, #142, and are iterable in a standard way (StandardIterator) and optimised for keras (KerasIterator), #143
      • automatically moving station map plot, #136
    • technical:
      • model modules available from package, #139
      • renaming of parameter time dimension, #151
      • refactoring of README.md, #138
  • v0.10.0 protected   official name released MLAir, new Workflows, easy Model plug-in possible
    Release v0.10.0
    • general:
      • Official project name is released: MLAir (Machine Learning on Air data)
      • a model class can now easily be plugged into MLAir, #121
      • introduced new concept of workflows, #134
    • new features:
      • workflows are used to execute a sequence of run modules, #134
      • default workflows for standard and the Juelich HPC systems are available, custom workflows can be defined, #134
      • seasonal decomposition is available for conditional quantile plot, #112
      • map plot is created with coordinates, #108
      • flatten_tails are now more general and easier to customise, #114
      • model classes have custom compile options (replaces set_loss), #110
      • model can be set in ExperimentSetup from outside, #121
      • default experiment settings can be queried using get_defaults(), #123
      • training and model settings are reported as MarkDown and Tex tables, #145
    • technical:
      • Juelich HPC systems are supported and installation scripts are available, #106
      • data store is tracked, I/O is saved and illustrated in a plot, #116
      • batch size and epoch parameters have to be defined in ExperimentSetup, #127, #122
      • automatic documentation with sphinx, #109
      • default experiment settings are updated, #123
      • refactoring of experiment path and its default naming, #124
      • refactoring of some parameter names, #146
      • preparation for package distribution with pip, #119
      • all run scripts are updated to run with workflows, #134
      • the experiment folder is restructured, #130
  • IntelliO3-ts-v1.0_initial-submit   IntelliO3-ts version1.0;
    bf57b89b · Update README.md ·

    This version is used for "IntelliO3-ts v1.0: A neural network approach to predict near-surface ozone concentrations in Germany" by F. Kleinert, L. H. Leufen and M. G. Schultz (2020, submitted to GMD, gmd-2020-169)

  • v0.9.0 protected   faster bootstraps, extreme value upsampling
    e945b365 · release v0.9.0 ·
    Release v0.9.0
    • general:
      • improved and faster bootstrap workflow
      • new plot PlotAvailability
      • extreme values upsampling
      • improved runtime environment
    • new features:
      • the entire bootstrap workflow has been refactored and is much faster now; it can be skipped with evaluate_bootstraps=False, #60
      • upsampling of extreme values, set with parameter extreme_values=[your_values_standardised] (e.g. [1, 2]) and extremes_on_right_tail_only=<True/False> to choose whether only the right tail of the distribution or both tails are affected, #58, #87
      • minimal data length property (in total and for all subsets), #76
      • custom objects in model class to load customised model objects like padding class, loss, #72
      • new plot for data availability: PlotAvailability, #103
      • introduced (default) plot_list to specify which plots to draw
      • latex and markdown information on sample sizes for each station, #90
    • technical:
      • implemented tests on gpu and from scratch for develop, release and master branches, #95
      • usage of tensorflow 1.13.1 (gpu / cpu), separated in 2 different requirements, #81
      • new abstract plot class to have uniform plot class design
      • New time tracking wrapper to use for functions or classes
      • improved logger (info on display, debug into file), #73, #85, #88
      • improved run environment, especially for error handling, #86
      • prefix general in data store scope is now optional and can be skipped. If given scope is not general, it is treated as subscope, #82
      • all 2D Padding classes are now selected by Padding2D(padding_name=<padding_type>) e.g. Padding2D(padding_name="SymPad2D"), #78
      • custom learning rate (or lr_decay) is optional now, #71
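
The extreme value upsampling described in the v0.9.0 notes can be pictured as repeating samples whose standardised target lies beyond a threshold. The sketch below is one plausible reading of that description, not MLAir's implementation: the function name, the argument names and the rule "append one extra copy per exceeded threshold" are all assumptions made for illustration.

```python
def upsample_extremes(inputs, targets, thresholds, right_tail_only=False):
    """Return copies of inputs and targets with extreme samples repeated.

    For every threshold, each sample whose standardised target exceeds it
    is appended once more. With right_tail_only=False the absolute value
    is compared, so both tails of the distribution are upsampled.
    """
    new_inputs, new_targets = list(inputs), list(targets)
    for limit in thresholds:
        for x, y in zip(inputs, targets):
            is_extreme = y > limit if right_tail_only else abs(y) > limit
            if is_extreme:
                new_inputs.append(x)
                new_targets.append(y)
    return new_inputs, new_targets


stations = ["a", "b", "c", "d"]
standardised = [0.2, 1.5, -2.5, 2.1]
_, both_tails = upsample_extremes(stations, standardised, [1, 2])
_, right_tail = upsample_extremes(stations, standardised, [1, 2], right_tail_only=True)
print(len(both_tails), len(right_tail))
# 9 7
```

With thresholds [1, 2], a sample beyond 2 standard deviations appears three times in total, so the most extreme values gain the most weight during training.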
  • v0.8.3 protected   applied fix to ols model
    Release v0.8.3
    • bug:
      • if input was constant, adding the constant for the linear model failed. This is corrected now.
  • v0.8.2 protected   applied fix to model plot path
    Release v0.8.2
    • bug:
      • if path name contains the dot sign ".", the model plot path creation failed. This is corrected now.
  • v0.8.1 protected   applied fix to check valid station
    Release v0.8.1
    • bug:
      • history could become None if the intersection with labels and/or observations is empty. This needs to be checked in the check valid station method. Stations with None history are now removed from the set.
  • v0.8.0 protected   bootstraps, new paddings
    Release v0.8.0
    • general:
      • bootstraps can be calculated (but still very slowly; refactoring will be part of a new version)
      • new padding options
    • new features:
      • Bootstrap class to handle bootstraps, bootstrap skill scores, bootstrap plot
      • input and target variables are independent now (before target had to be part of input space too)
      • implemented symmetric and reflection padding
    • technical:
      • some tests always failed because of strange keras naming behaviour (tests have been adapted to this now)
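
The difference between the symmetric and reflection padding listed under v0.8.0 is easiest to see on a short sequence. The sketch below uses the common convention (the one followed by the `symmetric` and `reflect` modes of `numpy.pad`), reduced to one dimension and plain Python lists. The function names are hypothetical, and it is an assumption that MLAir's padding classes follow the same convention along each image axis.

```python
def symmetric_pad(values, pad):
    """Mirror the sequence at its edges, repeating the edge value."""
    if pad > len(values):
        raise ValueError("pad must not exceed the sequence length")
    left = values[:pad][::-1]
    right = values[len(values) - pad:][::-1]
    return left + values + right


def reflection_pad(values, pad):
    """Mirror the sequence at its edges without repeating the edge value."""
    if pad > len(values) - 1:
        raise ValueError("pad must be smaller than the sequence length")
    left = values[1:pad + 1][::-1]
    right = values[len(values) - pad - 1:len(values) - 1][::-1]
    return left + values + right


print(symmetric_pad([1, 2, 3, 4], 2))   # [2, 1, 1, 2, 3, 4, 4, 3]
print(reflection_pad([1, 2, 3, 4], 2))  # [3, 2, 1, 2, 3, 4, 3, 2]
```

Both variants avoid the artificial zeros that zero padding introduces at the borders of the input.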
  • v0.7.0 protected   new data transformation option, advanced training options
    Release v0.7.0
    • general:
      • improved callback handling
      • training can start from scratch ("restart training")
      • new data transformation options
    • new features:
      • new class CallbackHandler to add/reload callbacks, create checkpoints and retrieve callbacks in proper format and order
      • in addition to the already supported station-wise transformation, it is now possible to apply a specific transformation for the entire dataset. More information in README.md
      • Data can be randomly permuted inside a minibatch (can already be used, but is initially a preparation for [not implemented] extreme value weighting)
    • technical:
      • refac of naming conventions: observations are now called obs, original predictions from CNN can be called orig_pred (preparation for bootstraps and its permuted predictions)
      • security setup for join database settings
  • v0.6.0 protected   hourly data, result plots, advanced training setup
    Release v0.6.0
    • general:
      • implement plot routines
      • advanced training setup (can be resumed and skipped)
      • MLT now supports hourly data
      • src/run_modules/README.md gives a short overview on the data workflow and the project structure
    • new features:
      • climatological and competitive skill scores
      • many new plots: station map, monthly box plot, conditional quantiles plot, climatological skill score, competitive skill score, time series plot
      • training is resumed, if last epoch of an existing model is smaller than the given epoch number
      • if a model is available, training can be skipped
      • advanced plot of model history including all model branches
    • technical:
      • improved speed of data generator by temporarily storing processed data locally as pickle
      • there are now two separate run scripts (run.py and run_hourly.py) with some resolution specific settings
      • data store got a method get_default() that behaves similarly to the standard dict.get()
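
The get_default() behaviour mentioned in the v0.6.0 notes can be illustrated with a toy store. The class below is a hypothetical stand-in written for this example only: it ignores scopes entirely, and its method signatures are assumptions rather than MLAir's actual data store API.

```python
class MiniDataStore:
    """Toy key-value store; a stand-in, not MLAir's data store class."""

    def __init__(self):
        self._store = {}

    def set(self, name, value):
        self._store[name] = value

    def get(self, name):
        # Strict lookup: an unknown name raises KeyError.
        return self._store[name]

    def get_default(self, name, default=None):
        # Lenient lookup: an unknown name returns the fallback instead,
        # mirroring the standard dict.get().
        return self._store.get(name, default)


store = MiniDataStore()
store.set("batch_size", 512)
print(store.get_default("batch_size", 256))  # 512
print(store.get_default("epochs", 20))       # 20
```

A lenient lookup of this kind keeps call sites short when a parameter is optional and a sensible fallback exists.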
  • v0.5.0 protected   includes model setup, training and post-processing
    eff70314 · include new development ·
    Release v0.5.0
    • general:
      • introduced modules for model setup, training and post-processing
    • new features:
      • model setup: create model with all parameters, plot architecture
      • model class: collects model and loss, more general workflow is possible now
      • training: train model (not distributed on multiple CPUs/GPUs), monitor error and learning rate with plots and file output
      • postprocessing: evaluate trained network, create linear fit and persistence forecast, plot error metrics (not all plots are finished yet)
      • Distributor splits data into mini batches
      • create a dictionary with requested entries from data store
      • enhanced data loading from join interface
    • technical:
      • modules to run in the experiment pipeline can be found now in run_modules (before modules)
      • test teardown method for better test independence
      • updated requirements.txt
      • data store function to store data is renamed from put to set
      • new time display format for TimeTracking
      • more tests and documentation
  • v0.4.0 protected   running PreProcessing
    Release v0.4.0
    • general:
      • Data pre-processing included
    • new features:
      • class PreProcessing handles data download and storage and pre-processes data into the required format for training (data set split, interpolation, ...)
      • perform data check station-wise to detect stations that do not fulfil the requirements.
    • technical:
      • introduced a data storage for better namespace-dependent parameter handling (e.g. differences in training and test stage)
  • v0.3.0 protected   Experiment Setup finished
    3b54d872 · Experiment Setup finished ·
    Release v0.3.0
    • general:
      • Experiment Setup implemented
    • new features:
      • new run object with time tracking
      • time tracking enabled (is used by all subclasses of the run class)
      • class ExperimentSetup sets paths and options (expansion to be continued)
    • technical:
      • run class is new runtime object including time tracking
  • v0.2.0 protected   new DataGenerator class and more ML functionality
    Release v0.2.0
    • general:
      • Data Generator class is implemented
      • CI is refined
      • more ML functionality
    • new features:
      • new class DataGenerator to separate data into multiple batches
      • L<p> loss function implemented (e.g. L1=MAE, L2=MSE)
      • Learning rate decay for machine learning
    • technical:
      • modified test in CI to get html results
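
The L<p> loss listed under v0.2.0 generalises the familiar error measures: taking the mean of the absolute error raised to the power p gives the MAE for p=1 and the MSE for p=2. The sketch below shows that relation in plain Python. The function name and signature are assumptions made for illustration and are not taken from the MLAir code.

```python
def lp_loss(y_true, y_pred, p):
    """Mean of |y_true - y_pred| ** p over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - q) ** p for t, q in zip(y_true, y_pred))
    return total / len(y_true)


observed = [1, 2, 5, 0]
predicted = [0, 3, 2, 3]
print(lp_loss(observed, predicted, p=1))  # 2.0 (mean absolute error)
print(lp_loss(observed, predicted, p=2))  # 5.0 (mean squared error)
```

Larger values of p penalise large deviations more strongly, which is why the L2 loss reacts more to outliers than the L1 loss.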