Repetition- and linearity-aware rank/select dictionaries


We revisit the fundamental problem of compressing an integer dictionary that supports efficient rank and select operations by exploiting two kinds of regularities arising in real data: repetitiveness and approximate linearity. Our first contribution is a Lempel-Ziv parsing properly enriched to also capture approximate linearity in the data and still be compressed to the $k$th order entropy. Our second contribution is a variant of the block tree structure whose space complexity takes advantage of both repetitiveness and approximate linearity, and results highly competitive in time too. Our third and final contribution is an implementation and experimentation of this last data structure, which achieves new space-time trade-offs compared to known data structures that exploit only one of the two regularities.

Proceedings of the 32nd International Symposium on Algorithms and Computation (ISAAC)
Paolo Ferragina
Full professor

Professor of Algorithms and PI of the project

Giovanni Manzini
Full professor

Professor of Computer Science at the University of Eastern Piedmont

Giorgio Vinciguerra

Postdoc in Computer Science