Reproducing the sparse Huffman Address Map compression for deep neural networks

Abstract

Deploying large convolutional neural networks (CNNs) on resource-limited devices is still an open challenge in the big data era. To address this challenge, a synergistic combination of network compression algorithms and compact storage of the compressed network has recently been presented, substantially preserving model accuracy. The implementation we describe in this paper offers several compression schemes (pruning, two types of weight quantization, and their combinations) and two compact representations: the Huffman Address Map compression (HAM) and its sparse version, sHAM. Taking as input a model trained for a given classification or regression problem (together with the dataset employed, which is needed for fine-tuning the weights after compression), the procedure returns the corresponding compressed model. Our publicly available implementation provides the source code, two pre-trained CNN models (retrieved from third-party repositories referring to well-established literature), and four datasets. It also includes detailed instructions to run the scripts and reproduce the results reported in the original paper, in terms of its figures and tables.
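To make the compression pipeline more concrete, the following minimal sketch illustrates the three ingredients mentioned in the abstract, namely magnitude pruning, weight sharing via a simple k-means quantization, and Huffman coding of the surviving values. It is not the authors' implementation: all function names, parameter choices, and the particular quantization variant are illustrative assumptions.

# Illustrative sketch (not the paper's code): prune, quantize, then Huffman-code
# the distinct weight values. Names and parameters are assumptions.
import heapq
from collections import Counter

import numpy as np


def magnitude_prune(weights, rate):
    """Zero out the fraction `rate` of entries with smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(rate * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned


def kmeans_quantize(weights, n_values=16, n_iter=20):
    """Replace each nonzero weight with the nearest of `n_values` shared centroids."""
    nonzero = weights[weights != 0.0]
    centroids = np.linspace(nonzero.min(), nonzero.max(), n_values)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(nonzero[:, None] - centroids[None, :]), axis=1)
        for j in range(n_values):
            members = nonzero[labels == j]
            if members.size:
                centroids[j] = members.mean()
    quantized = weights.copy()
    mask = quantized != 0.0
    idx = np.argmin(np.abs(quantized[mask][:, None] - centroids[None, :]), axis=1)
    quantized[mask] = centroids[idx]
    return quantized, centroids


def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from a sequence of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, i2, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, i2, merged))
    return heap[0][2]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(64, 64)).astype(np.float32)
    W = magnitude_prune(W, rate=0.8)                 # sparsify the layer
    W, centroids = kmeans_quantize(W, n_values=16)   # keep few distinct values
    values = W[W != 0.0]
    codes = huffman_code(values)                     # variable-length code book
    bits = sum(len(codes[v]) for v in values)
    print(f"{values.size} nonzeros encoded in {bits} bits")

In the actual sHAM representation, the compressed storage is organized around an address map of the quantized/pruned matrix rather than this toy code book; the sketch only conveys how pruning, quantization, and Huffman coding compose before that final encoding step.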

Publication
Proceedings of the 3rd Workshop on Reproducible Research in Pattern Recognition
Giosuè Cataldo Marinò
Research contractor
Marco Frasca
Assistant professor

Researcher in Machine Learning and AI in the UNIMI unit

Dario Malchiodi
Associate professor

Professor of Data Analytics and member of the UNIMI team