Commit 6666cd85 authored by Federico Rossi's avatar Federico Rossi

inited sve optimized version

parent 59c6c428
COPYRIGHT
All contributions by Taiga Nomi
Copyright (c) 2013, Taiga Nomi
All rights reserved.
All other contributions:
Copyright (c) 2013-2016, the respective contributors.
All rights reserved.
Each contributor holds copyright over their respective contributions.
The project versioning (Git) records all such contribution source information.
LICENSE
The BSD 3-Clause License
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of tiny-dnn nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# tinyDNNAlt
This repository contains the tiny-DNN library, adapted to work with alternative
representations for real numbers, such as the posit representation.
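As background, a posit packs its bits into a sign, a variable-length regime (a run of identical bits), an optional exponent field, and a fraction. A minimal decoder for the posit<8,0> case (zero exponent bits) can be sketched as follows; this is an illustrative sketch only, not the cppPosit implementation the repository actually uses:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical minimal decoder for a posit<8,0> bit pattern (illustration
// only). Layout after the sign bit: a run of identical "regime" bits, a
// terminating opposite bit, then fraction bits read as 1.f. Negative
// patterns are decoded after taking the two's complement.
double decode_posit8_es0(uint8_t bits) {
  if (bits == 0x00) return 0.0;   // zero
  if (bits == 0x80) return NAN;   // NaR ("not a real")
  bool neg = bits & 0x80;
  uint8_t p = neg ? static_cast<uint8_t>(-bits) : bits;  // two's complement
  int first = (p >> 6) & 1;       // first regime bit
  int i = 6, run = 0;
  while (i >= 0 && ((p >> i) & 1) == first) { ++run; --i; }
  int k = first ? run - 1 : -run; // regime value
  --i;                            // skip the terminating bit
  double frac = 1.0;              // implicit leading 1
  for (double w = 0.5; i >= 0; --i, w *= 0.5)
    if ((p >> i) & 1) frac += w;
  double v = std::ldexp(frac, k); // frac * 2^k (es = 0, so no exponent field)
  return neg ? -v : v;
}
```

For example, `0x40` decodes to 1.0 and its two's complement `0xC0` to -1.0; the variable-length regime gives posits their characteristic tapered precision around 1.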
# MNIST All-in-one command
```sh
./all.sh
```
The above command compiles both the training and testing examples and executes the most
relevant ones, producing result files in the folders *plot* and *pretrained_models*.
To run it for only one particular type, see the instructions below.
# Build MNIST example(s)
```sh
mkdir -p build
cd build
cmake .. -DTAB8=ON -DTAB10=ON
make tests_type trains_type -j$(nproc)
```
This compiles the training and testing code for the different Posit and Float types.
# Training
```sh
cd build/examples
# choose one type for training (for example Posit16,2)
./example_mnist_train_p16_2 --data_path ../../data --num_train 60000 --epochs 30 --report_level 2
# The above command produces a trained model in the folder 'pretrained_models' and,
# depending on the chosen report_level, some report files in the folder 'plot'
```
# Testing
```sh
cd build/examples
# choose one type for testing (for example Posit10,0 tabulated)
./example_mnist_test_alt_posittab10 ../../pretrained_models/model_name ../../data
# The above command produces some report files inside the 'plot' folder
```
TODO
==============================================
- [X] make tiny-DNN:
  - [X] print the number of training patterns
  - [X] use a subset of the training patterns, specifiable as a parameter or as a percentage
- [X] fix the random seed at startup, to make experiments repeatable
- [X] at the end of training, print the correct-classification percentage
on the training and test sets
- [X] integrate 32-bit Posit with two exponent bits and redo the training
- [X] serialize Posits using the Cereal library (to be implemented inside the posit class) (ask Ruffaldi?)
  - *currently posit->float->serialization and deserialization->float->posit*
- [X] export to file the correct-classification percentage on the training and test sets,
epoch by epoch, so that a plot can be made in Matlab
- [X] check whether a conversion constructor exists for unsigned int types (posit::posit(unsigned int))
- [X] ask Ruffaldi whether he will fix the assignment of a float to a posit
- [~] generalize the code so that training uses one data type (float_t_tr)
and the query phase another type (float_t_qr), which will typically have fewer bits.
Print the test-set performance obtained at the end of training with more bits,
and the performance obtained at that point when switching to fewer bits
  - At the moment, keeping the training and testing phases separate in two distinct .cpp files,
  I exploited de/serialization through the common float type: train the network with float, serialize model+weights to file,
  reload the model in the new testing program (compiled with posit support), and run the testing.
  - Having both types during the training phase will require heavy rewriting of tiny_dnn, since (as far as I have found so far) *float_t* is not templated in all classes
  but is a *typedef float_t* fixed at compile time.
- [X] make it possible for a neural network trained with posits to be saved to file and reloaded
from file. For floats, the serialize method is currently used. We must serialize (and deserialize)
posits.
  - See above
- [X] try training with Posit8,0
- [X] try using softsign as the activation function
- [~] study comparison metrics and implement them: how do we quantify how much better
a posit16 is than a float16, for example?
- [ ] ask Ruffaldi whether he will implement sqrt and log
- [~] use tabulated posits instead of non-tabulated posits
- [~] implement the product by switching to a tabulated logarithm, so that neither the product
nor the division needs to be tabulated (this will also make tabulated posit12 usable, because
this way the cache should not be invalidated)
  - [ ] characterize this solution from the mathematical point of view.
  - [ ] On how many bits must the logarithm be tabulated?
- [ ] add some other interesting calculations, to strengthen the theoretical side of the paper
- [ ] draft the paper
- [ ] submit to the Springer Journal of Real-Time Image Processing
- [~] add other image-processing benchmarks
  - [X] MNIST-fashion [dataset](https://github.com/zalandoresearch/fashion-mnist), [benchmarks](http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/)
  - [~] CIFAR10 [here](http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html) ALSO do the training
  - [~] ImageNet [here](http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html) ALSO do the training
  - [~] other image-processing benchmarks, possibly related to autonomous driving: are there any on the internet?
    - https://arxiv.org/pdf/1704.08545.pdf real-time semantic segmentation
    - http://openaccess.thecvf.com/content_cvpr_2016/papers/Cordts_The_Cityscapes_Dataset_CVPR_2016_paper.pdf as above, paper on the dataset
    - http://cs231n.stanford.edu/reports/2015/pdfs/lediurfinal.pdf benchmarks on the Stanford Car dataset
- [X] generalize the sigmoid function in cppPosit_private so that it uses the pseudo-sigmoid (via bit manipulation)
when a Posit with 0 exponent bits is used, and the standard expression otherwise
- [ ] fix the bit manipulation in cppPosit
- [ ] find a way to train tinyDNN on GPU (using Cococcioni's PC)
- [X] profile the code, to identify the bottlenecks (valgrind + Qt graphical interface)
  - The function called most often, and where most time is lost, is the "forward" of the convolution layer
- [ ] document the code and the examples using doxygen
- [ ] get the demos running inside Jupyter (live test of the examples)
- [ ] get a PC at Saponara's laboratory, from which to connect remotely to Saponara's supercomputer (100 cores, 128 GB of RAM)
- [ ] find a way to implement the exact dot product (with or without quires), to further improve
accuracy. Neural networks systematically involve dot products
  - I believe I have located where the dot product is currently implemented by tinyDNN: [here](https://bitbucket.org/marco_cococcioni/tinydnnalt/src/65ab9512cdc4538f88f0f4f31438f41096d5cf53/tiny_dnn/util/product.h#lines-320)

NB: this part is perhaps better explored later, concentrating on the paper first.
Once it is submitted, we will return to the exact dot product.
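The log-tabulated product idea can be sketched as follows. This is a hypothetical illustration using small unsigned integers in place of posit bit patterns, with an assumed 8-bit table index and fixed-point scale; it is not the repository's actual tabulated-posit implementation:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a log-tabulated product: instead of precomputing a
// full NxN multiplication table, store one log2 table, so that a product
// becomes "two lookups + one integer add" followed by an inverse-log step.
static const int kEntries = 256;    // assumed table size (8-bit index)
static const double kScale = 64.0;  // fixed-point scale for the log values

// Build the table: index -> round(log2(index) * kScale); index 0 is unused
// because log(0) does not exist and zero is handled separately.
std::vector<int> build_log2_table() {
  std::vector<int> t(kEntries, 0);
  for (int i = 1; i < kEntries; ++i)
    t[i] = static_cast<int>(std::lround(std::log2(double(i)) * kScale));
  return t;
}

// Multiply via the log domain: add the tabulated logs, then invert.
int log_mul(int a, int b, const std::vector<int>& lg) {
  if (a == 0 || b == 0) return 0;  // zero bypasses the tables
  int sum = lg[a] + lg[b];         // log2(a*b) in fixed point
  return static_cast<int>(std::lround(std::exp2(sum / kScale)));
}
```

With this scheme the table grows linearly in the number of representable values rather than quadratically, at the cost of small rounding errors from the fixed-point log; division follows the same path with a subtraction instead of an addition.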
NOTE
==================================================
# *Non*-safety-critical applications
The scientific community currently believes that the following are needed:
- 16-bit floats for the training phase
- 8-bit floats for the test phase
# *Safety-critical* applications
The scientific community currently believes that the following are needed:
- 32-bit floats for the training phase
- 16-bit floats for the test phase
# Our theorems for *safety-critical* applications
- The posit14 type is sufficient for the query phase. Even better if a posit12 suffices.
The ultimate goal would be to have a posit8 neural network certified for the query phase alone
(at that point the advantages would be remarkable: less memory, less bandwidth,
better use of the processor registers, better use of the cache)
- For the training phase, posit16 is sufficient compared to float32.
A posit14 might even be sufficient for training.
At that point things would become very interesting, since posit14 can be tabulated.
This would have many advantages, including making the hardware operations easier to certify,
since neither an FPU nor a PPU would be used, only a LUT.
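On a toy scale, the LUT-only idea can be sketched like this. A hypothetical 4-bit format is used so the two-operand table fits in 256 entries; for a real posit14, a direct two-operand table would need 2^28 entries, which is what motivates shrinking tables with techniques such as tabulated logarithms:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical illustration of fully tabulated arithmetic: for an N-bit
// format, a two-operand operation is precomputed into a 2^(2N)-entry table,
// so evaluation is a single lookup, with no FPU or PPU involved.
static const int kBits = 4;
static const int kVals = 1 << kBits;  // 16 representable values

// Precompute saturating addition for every operand pair.
std::vector<uint8_t> build_add_table() {
  std::vector<uint8_t> t(kVals * kVals);
  for (int a = 0; a < kVals; ++a)
    for (int b = 0; b < kVals; ++b) {
      int s = a + b;
      t[(a << kBits) | b] = static_cast<uint8_t>(s < kVals ? s : kVals - 1);
    }
  return t;
}

// The "hardware" operation: one table lookup, no arithmetic unit involved.
uint8_t tab_add(uint8_t a, uint8_t b, const std::vector<uint8_t>& t) {
  return t[(a << kBits) | b];
}
```

Because every possible result is enumerated in advance, the operation's behavior can be verified exhaustively, which is the basis of the "easier to certify" argument above.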
# Performance so far
https://docs.google.com/document/d/1QAyAU6VI0LsnWQxCdbTwDSOqPA8_C1BUhAWd67NxF0c/edit?usp=sharing
# Vision
- We must focus on image-processing applications oriented
toward assisted/autonomous driving, hence in a safety-critical setting.
- It goes without saying that we will mainly use convolutional networks.
/*! \file adapters.hpp
\brief Archive adapters that provide additional functionality
on top of an existing archive */
/*
Copyright (c) 2014, Randolph Voorhies, Shane Grant
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of cereal nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL RANDOLPH VOORHIES OR SHANE GRANT BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef CEREAL_ARCHIVES_ADAPTERS_HPP_
#define CEREAL_ARCHIVES_ADAPTERS_HPP_
#include <cereal/details/helpers.hpp>
#include <utility>
namespace cereal
{
#ifdef CEREAL_FUTURE_EXPERIMENTAL
// Forward declaration for friend access
template <class U, class A> U & get_user_data( A & );
//! Wraps an archive and gives access to user data
/*! This adapter is useful if you require access to
either raw pointers or references within your
serialization functions.
While cereal does not directly support serialization of
raw pointers or references, it is sometimes the case
that you may want to supply something such as a raw
pointer or global reference to some constructor.
In this situation this adapter would likely be used
with the construct class to allow for non-default
constructors.
@note This feature is experimental and may be altered or removed in a future release. See issue #46.
@code{.cpp}
struct MyUserData
{
int * myRawPointer;
std::reference_wrapper<MyOtherType> myReference;
};
struct MyClass
{
// Note the raw pointer parameter
MyClass( int xx, int * rawP );
int x;
template <class Archive>
void serialize( Archive & ar )
{ ar( x ); }
template <class Archive>
static void load_and_construct( Archive & ar, cereal::construct<MyClass> & construct )
{
int xx;
ar( xx );
// note the need to use get_user_data to retrieve user data from the archive
construct( xx, cereal::get_user_data<MyUserData>( ar ).myRawPointer );
}
};
int main()
{
{
MyUserData md;
md.myRawPointer = &something;
md.myReference = someInstanceOfType;
std::ifstream is( "data.xml" );
cereal::UserDataAdapter<MyUserData, cereal::XMLInputArchive> ar( md, is );
std::unique_ptr<MyClass> sc;
ar( sc ); // use as normal
}
return 0;
}
@endcode
@relates get_user_data
@tparam UserData The type to give the archive access to
@tparam Archive The archive to wrap */
template <class UserData, class Archive>
class UserDataAdapter : public Archive
{
public:
//! Construct the archive with some user data struct
/*! This will forward all arguments (other than the user
data) to the wrapped archive type. The UserDataAdapter
can then be used identically to the wrapped archive type
@tparam Args The arguments to pass to the constructor of
the archive. */
template <class ... Args>
UserDataAdapter( UserData & ud, Args && ... args ) :
Archive( std::forward<Args>( args )... ),
userdata( ud )
{ }
private:
//! Overload the rtti function to enable dynamic_cast
void rtti() {}
friend UserData & get_user_data<UserData>( Archive & ar );
UserData & userdata; //!< The actual user data
};
//! Retrieves user data from an archive wrapped by UserDataAdapter
/*! This will attempt to retrieve the user data associated with
some archive wrapped by UserDataAdapter. If this is used on
an archive that is not wrapped, a run-time exception will occur.
@note This feature is experimental and may be altered or removed in a future release. See issue #46.
@note The correct use of this function cannot be enforced at compile
time.
@relates UserDataAdapter
@tparam UserData The data struct contained in the archive
@tparam Archive The archive, which should be wrapped by UserDataAdapter
@param ar The archive
@throws Exception if the archive this is used upon is not wrapped with
UserDataAdapter. */
template <class UserData, class Archive>
UserData & get_user_data( Archive & ar )
{
try
{
return dynamic_cast<UserDataAdapter<UserData, Archive> &>( ar ).userdata;
}
catch( std::bad_cast const & )
{
throw ::cereal::Exception("Attempting to get user data from archive not wrapped in UserDataAdapter");
}
}
#endif // CEREAL_FUTURE_EXPERIMENTAL
} // namespace cereal
#endif // CEREAL_ARCHIVES_ADAPTERS_HPP_
/*! \file binary.hpp
\brief Binary input and output archives */
/*
Copyright (c) 2014, Randolph Voorhies, Shane Grant
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
* Neither the name of cereal nor the
names of its contributors may be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL RANDOLPH VOORHIES OR SHANE GRANT BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef CEREAL_ARCHIVES_BINARY_HPP_
#define CEREAL_ARCHIVES_BINARY_HPP_
#include <cereal/cereal.hpp>
#include <sstream>
namespace cereal
{
// ######################################################################
//! An output archive designed to save data in a compact binary representation
/*! This archive outputs data to a stream in an extremely compact binary
representation with as little extra metadata as possible.
This archive does nothing to ensure that the endianness of the saved
and loaded data is the same. If you need to have portability over
architectures with different endianness, use PortableBinaryOutputArchive.