The ever-growing need to efficiently store, retrieve, and analyze massive datasets originating from very different sources is made more complex by the differing requirements of users and applications. Current data structures for big data problems cannot properly handle this new level of complexity.

To meet these challenges, we propose a new generation of Multicriteria Data Structures and Algorithms. The multicriteria feature refers to the fact that we seamlessly integrate, via a principled optimization approach, modern compressed data structures with new, revolutionary data structures learned from the input data using machine-learning tools. The goal of the optimization is to select, among a family of properly designed data structures, the one that “best fits” the multiple constraints imposed by its context of use, thus dominating the multitude of trade-offs currently offered by known solutions.

In this project, funded by the Italian Ministry of Education (PRIN no. 2017WR7SHH), we will lay the theoretical and algorithm-engineering foundations of this novel research area, which has the potential to support innovative data-analysis tools and data-intensive applications.

# What is a multicriteria data structure?

A multicriteria data structure for a given problem $P$ is defined by a pair $\langle \mathcal F, \mathcal A \rangle_P$, where $\mathcal F$ is a family of data structures, each solving $P$ with its own trade-off in the use of some resources (e.g., time, space, energy), and $\mathcal A$ is an optimization algorithm that selects from $\mathcal F$ the data structure that best fits an instance of $P$.
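To make the pair $\langle \mathcal F, \mathcal A \rangle_P$ concrete, here is a minimal Python sketch. It is only an illustration, not the project's actual method: the `Candidate` class, the `select` function, the candidate names, and all cost figures are hypothetical. The family $\mathcal F$ is modelled as a list of candidates with estimated time and space costs, and $\mathcal A$ is reduced to a toy rule that picks the fastest candidate fitting a given space budget.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One member of the family F, with hypothetical estimated costs."""
    name: str
    query_time_ns: float  # estimated time per query (made-up figure)
    space_bytes: int      # estimated memory footprint (made-up figure)


def select(family: Iterable[Candidate], space_budget: int) -> Optional[Candidate]:
    """Toy stand-in for A: the fastest candidate within the space budget.

    Returns None when no candidate satisfies the constraint.
    """
    feasible = [c for c in family if c.space_bytes <= space_budget]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.query_time_ns)


# A hypothetical family F: faster candidates use more space.
family = [
    Candidate("A", query_time_ns=100.0, space_bytes=8_000_000),
    Candidate("B", query_time_ns=250.0, space_bytes=1_000_000),
    Candidate("C", query_time_ns=600.0, space_bytes=200_000),
]

best = select(family, space_budget=2_000_000)
print(best.name if best else "no feasible candidate")
```

With a 2 MB budget, candidate A is too large, so the rule selects B, the fastest of the remaining candidates. A realistic $\mathcal A$ would presumably weigh several resources at once and estimate costs from the input data rather than from fixed numbers.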

For more details on the project, please have a look at its full description here.

# Publications

Quickly discover relevant content by filtering publications.

# Projects

#### PGM-index

A data structure enabling fast searches in arrays of billions of items using orders of magnitude less space than traditional indexes.

# Talks

### DNA combinatorial messages and Epigenomics: The case of chromatin organization and nucleosome occupancy in eukaryotic genomes

Epigenomics is the study of modifications on the genetic material of a cell that do not depend on changes in the DNA sequence, since …

### Hybrid Data Structures and beyond

The ever growing need to efficiently store, retrieve and analyze massive datasets, originated by very different sources, is currently …

# Events

Sep 2021

#### Second meeting

Feb 2020 Dept. of Computer Science, Via Largo Bruno Pontecorvo 3, Pisa

#### Kickoff meeting

Sep 2019 Video conference
Minutes of the meeting

Sep 2019