Automatic Neural Network Architecture Optimization


Mounir Sourial,  Maggie Moheb
International Max Planck Research School, MPI for Informatics, Max Planck Society;


Sourial, M., & Moheb, M. (2019). Automatic Neural Network Architecture Optimization. Master Thesis, Universität des Saarlandes, Saarbrücken.

Cite as: https://hdl.handle.net/21.11116/0000-0005-9C58-9
Deep learning has recently become one of the most active areas of Computer Science. It has spread into
many applications, achieving exceptional performance compared
to other existing methods. However, neural networks have large memory requirements,
which is considered one of their main challenges. This is why considerable research
effort has recently been directed towards model compression.

This thesis studies a divide-and-conquer approach that transforms an existing trained
neural network into another network with fewer parameters, with the goal of
decreasing its memory footprint while taking the resulting loss in performance into account.
It is based on existing layer transformation techniques such as Canonical Polyadic (CP) decomposition and
SVD affine transformations. Given an artificial neural network trained on a certain
dataset, an agent optimizes the architecture of the neural network in a bottom-up manner.
It cuts the network into sub-networks of length 1 and optimizes each sub-network using
layer transformations. It then chooses the most promising sub-networks to construct
sub-networks of length 2. This process is repeated until it constructs an artificial neural
network that covers the functionality of the original neural network.
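The thesis text above does not include implementation details, but the SVD-based layer transformation it mentions can be illustrated with a minimal sketch. The function name, layer shapes, and target rank below are our own assumptions, not taken from the thesis: a dense layer's weight matrix W (m x n) is replaced by two smaller factors, reducing the parameter count from m*n to rank*(m + n).

```python
import numpy as np

def svd_compress(W, rank):
    """Factor an m x n weight matrix into A (m x rank) @ B (rank x n)
    via truncated SVD, shrinking m*n parameters to rank * (m + n).
    (Illustrative sketch; not the thesis' actual implementation.)"""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

# Hypothetical example: compress a 256 x 128 dense layer to rank 16.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))
A, B = svd_compress(W, rank=16)
ratio = (A.size + B.size) / W.size   # fraction of original parameters kept
```

In a network, the original layer would be replaced by two consecutive linear layers with weights B and A; the performance loss introduced by the truncation is what the agent described above would weigh against the memory savings.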

This thesis offers an extensive analysis of the proposed approach. We tested the technique
on several well-known neural network architectures with popular datasets. We
outperform recent techniques in both compression rate and network performance
on LeNet-5 with MNIST, and we compress ResNet-20 to 25% of its original
size while achieving performance comparable to networks in the literature of double this
size.