
Released

Thesis

Automatic Neural Network Architecture Optimization

MPS-Authors

Mounir Sourial,  Maggie Moheb
International Max Planck Research School, MPI for Informatics, Max Planck Society;

External Resource
No external resources are shared
Fulltext (public)
There are no public fulltexts stored in PuRe
Supplementary Material (public)
There is no public supplementary material available
Citation

Mounir Sourial, M. M. (2019). Automatic Neural Network Architecture Optimization. Master Thesis, Universität des Saarlandes, Saarbrücken.


Cite as: http://hdl.handle.net/21.11116/0000-0005-9C58-9
Abstract
Deep learning has recently become a prominent topic in Computer Science. It has been adopted in many applications, achieving exceptional performance compared to other existing methods. However, neural networks have a large memory footprint, which is considered one of their main challenges. This is why considerable research effort has recently been directed towards model compression. This thesis studies a divide-and-conquer approach that transforms an existing trained neural network into another network with fewer parameters, with the goal of decreasing its memory footprint while taking the resulting loss in performance into account. It is based on existing layer transformation techniques such as Canonical Polyadic (CP) decomposition and SVD affine transformations. Given an artificial neural network trained on a certain dataset, an agent optimizes the architecture of the network in a bottom-up manner. It cuts the network into sub-networks of length 1 and optimizes each sub-network using layer transformations. It then chooses the most promising sub-networks to construct sub-networks of length 2. This process is repeated until it constructs an artificial neural network that covers the functionality of the original network. This thesis offers an extensive analysis of the proposed approach. We tested this technique with different well-known neural network architectures on popular datasets. We outperform recent techniques in both compression rate and network performance on LeNet5 with MNIST. We compress ResNet-20 to 25% of its original size while achieving performance comparable to networks in the literature of double that size.
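
As one concrete illustration of the SVD affine transformation used as a building block in the abstract, the sketch below factorizes a single dense layer into two thinner layers via truncated SVD. This is a minimal, self-contained example assuming PyTorch; the function name compress_linear_svd and the choice of rank are illustrative assumptions, not code from the thesis.

    import torch
    import torch.nn as nn

    def compress_linear_svd(layer: nn.Linear, rank: int) -> nn.Sequential:
        # Approximate the weight matrix W (out_features x in_features) by a rank-r
        # factorization W ~= U_r @ V_r, so one dense layer becomes two smaller ones.
        W = layer.weight.data
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        U_r = U[:, :rank] * S[:rank]      # (out_features, rank), singular values folded in
        V_r = Vh[:rank, :]                # (rank, in_features)

        first = nn.Linear(layer.in_features, rank, bias=False)
        second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
        first.weight.data = V_r.clone()
        second.weight.data = U_r.clone()
        if layer.bias is not None:
            second.bias.data = layer.bias.data.clone()
        return nn.Sequential(first, second)

    # Parameter count drops from out*in to rank*(in + out); choosing the rank per layer
    # is where the trade-off between compression rate and accuracy loss is made.

In the thesis's bottom-up procedure, transformations of this kind would be applied to individual sub-networks and only the most promising variants kept; the sketch shows the per-layer transformation itself, not the search over sub-networks.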