Restructure Parameter Declaration
PROBLEMS
Key problems
- multiple independent declarations of kwargs (there shouldn't be!)
- not clear which kwargs lead to which data
- multiple DataGenerator initialisations
- the name for the local file storage of the DataPrep class is not unique
A more detailed description of the problem and thoughts about it
The different kwargs, args, etc. are a mess, and the way the data set splits are implemented needs to be rethought. There are multiple kwargs declarations, and it is unclear which one counts in the end. There are also multiple initialisations of the DataGenerator class. Why? Would it be possible to select elements from this iterator class instead?

Furthermore, the names of the DataPrep data files are not distinct, because the file name does not include the time range. Consider the case where the total DataGen is first called with a short period for data loading, but the data split then requests a time range that exceeds this short period. (I don't know exactly why this could happen, but it is very likely given the current multiple declarations of kwargs arguments.) The file with the short period is loaded and used anyway, because a file with that name already exists, even though it only covers the shorter period. DataPrep cuts the edges, but if the available period is shorter than the requested one, the edges of the available data are used as edges instead.
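One possible way to make the DataPrep file names distinct is to encode the requested time range (and a short hash of the remaining parameters) in the name. The following is only a sketch: the function name data_prep_file_name, its arguments, and the .nc extension are assumptions for illustration and not taken from the existing code.

```python
import hashlib
import json
from datetime import date
from typing import Sequence


def data_prep_file_name(station: str, variables: Sequence[str],
                        start: date, end: date) -> str:
    """Build a storage file name that changes whenever the request changes.

    Hypothetical helper: the real DataPrep arguments may differ.
    """
    params = {
        "station": station,
        "variables": sorted(variables),  # order of variables must not matter
        "start": start.isoformat(),
        "end": end.isoformat(),
    }
    # Hash a canonical JSON dump so that equal requests give equal names.
    canonical = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.md5(canonical).hexdigest()[:8]
    # The time range is also written in clear text to keep names readable.
    return f"{station}_{start:%Y%m%d}_{end:%Y%m%d}_{digest}.nc"
```

With such a scheme, a request for a longer period would no longer find the file written for the shorter period, so the data would be loaded again instead of being silently truncated.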
Proposal
- Brainstorm about which parameters need to be set and when
- Create a precise list / workflow diagram showing which decision is used when
- Outsource the experiment parameters from the ExperimentSetup class to external .yml, .xml, or similar files. All "flexible" parameters need to be defined there!
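As a starting point for the discussion, the idea of a single external parameter file could look roughly like the sketch below. It uses JSON from the standard library purely as a stand-in for the .yml / .xml formats mentioned above, and the class ExperimentParameters with its fields start, end and variables is hypothetical. The point is that every parameter is declared once, and each data split is derived from that one declaration instead of from a separate set of kwargs.

```python
import json
from dataclasses import dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class ExperimentParameters:
    """All flexible parameters, declared exactly once (hypothetical fields)."""
    start: str        # ISO date, e.g. "2010-01-01"
    end: str          # ISO date, e.g. "2010-12-31"
    variables: tuple

    @classmethod
    def from_file(cls, path):
        """Read the parameters from an external file (JSON as a stand-in)."""
        raw = json.loads(Path(path).read_text())
        return cls(start=raw["start"], end=raw["end"],
                   variables=tuple(raw["variables"]))

    def for_subset(self, start, end):
        """Derive the parameters of a data split from the total period."""
        # ISO date strings sort chronologically, so plain comparison works.
        if not (self.start <= start <= end <= self.end):
            raise ValueError("subset period must lie within the total period")
        return replace(self, start=start, end=end)
```

A split that asks for a period outside the total range then fails loudly instead of silently falling back to whatever data is available.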
Work to do
- brainstorm
- create workflow diagram
- create format for the experiment setup files
- refactor the source code to match the decided structure