Advances in data-aware compressed-indexing schemes for integer and string keys

Name: Advances in data-aware compressed-indexing schemes for integer and string keys
Start: 2023-03-07T00:00:00Z
Location: Dresden, Germany

Giorgio Vinciguerra

Abstract

Compressed-indexing schemes for a collection of keys are the backbone of modern data systems, and their space and query time efficiency is crucial to the system’s performance, particularly under continuously growing volumes of data. We discuss some recent developments of these schemes, beginning with integer keys, and then delving into the more complex case of variable-length string keys. The underlying principle driving these advancements is to take advantage of the regularities and trends in the input data by either integrating learned models, which uncover some new regularities and thus enable more effective compression, or by employing data-aware optimization approaches that orchestrate known encoding schemes for synthesizing the best data structure design. We present experiments that demonstrate the robustness of these new approaches compared to the input-sensitive space-time efficiency of existing solutions. Finally, we conclude by discussing new research opportunities.

Date

March 7, 2023

Event

A Tutorial Workshop on ML for Systems and Systems for ML @ BTW 2023

Location

Dresden, Germany

Giorgio Vinciguerra

Postdoc

Postdoc in Computer Science